Enhancing GitHub Code Review with the OpenAI Assistant

Mohd Shamoon

NOVEMBER 26, 2023

Share

During my development work on a new feature for Glific, I found myself contemplating potential optimizations in my code. Some functions seemed a bit lengthy, prompting me to explore ways to shorten them. In a moment of curiosity, I pasted my code into ChatGPT and sought advice on possible optimizations. ChatGPT provided valuable tips on refactoring.

Recognizing the potential benefits of leveraging AI for code review, I explored existing solutions and stumbled upon a GitHub action specifically designed for this purpose: ai-codereviewer. This action, powered by the OpenAI API, promised to review code automatically.

Upon inspecting the implementation, I discovered a straightforward approach:

  1. Utilize GitHub Octokit APIs to fetch pull request (PR) details.
  2. Obtain the diff from the previous commit in the PR, representing an array of diff change objects.
  3. Format each diff into a string suitable for input to the OpenAI API.
  4. Create a prompt for OpenAI, instructing it to review the code based on the provided context.

The output from this process was a JSON array in the required format for GitHub’s review comment API. Applying this to my PR resulted in two meaningful comments from the GitHub action.


However, the approach of reviewing each commit independently did not align with my preference. I aimed for a comprehensive review of the entire PR when my work was complete.

To address this, I tackled two challenges:

  1. Triggering the Action: I learned that GitHub actions can be triggered by adding a label. I configured the action to run when the label “review” was added to the PR.
  2. Reviewing the Whole PR: To review the entire PR, I forked the code reviewer action and modified it to analyze the diff of the entire PR instead of individual commits. Unfortunately, this change led to a rate limit issue, with 429 too many requests.

To overcome this hurdle, I considered adding delays between requests. However, given the potential volume of API calls and the associated time and cost implications, this approach was impractical. A closer examination revealed that the action was making repeated calls to the OpenAI API for each diff.

The action was using the completions API which was now in legacy so I thought to explore more APIs on the platform.

Thankfully, my exploration of OpenAI’s offerings led me to the Assistant API, currently in beta. This API allows for the creation of an assistant that responds to instructions. The process involves defining a name, instructions, and the model to be used.

The Assistant API provides a powerful feature where a thread of messages can be executed in a single API call. I organized all the diff changes as messages in a thread and ran it using the assistant. The output was obtained in the desired format, albeit with the need for polling to retrieve the results upon completion.

However, I encountered and addressed some issues during experimentation:

  • GitHub Review Comment Format: The API’s line numbers did not always correspond to the diff change lines, leading to API failures. I adjusted the output by removing extraneous JSON text and ensuring accurate line numbers.
  • Contextual Feedback: To enhance the feedback, I experimented with various prompts and eventually achieved more insightful results. The PR review demonstrated the assistant’s helpful suggestions, encouraging thoughtful consideration of code maintainability and decision-making.

Considering the cost aspect, running the Assistant API for a 10-file change PR cost approximately $0.07. However, as I ran it multiple times, costs accrued.

Several considerations for future improvements and optimizations include:

  • Context Token Optimization: Explore ways to reduce context tokens, especially since the git diff includes additional lines beyond the changed lines.
  • Prompt Enhancement: Modify prompts to offer more contextual feedback related to the project, potentially incorporating design guidelines.

While the Assistant API shows promise, it’s worth noting that GitHub is also developing its own advanced solution called Copilot for Pull Requests. The continuous evolution of tools integrating AI into the development workflow is exciting, offering developers innovative ways to enhance code quality and collaboration.

Leave a Reply

Discover more from Glific

Subscribe now to keep reading and get access to the full archive.

Continue reading