Can an LLM like Code Llama learn a project's codebase?

GitLab has built a merge request summary system based on AI. That system seems to take only the diff and comments when producing output; the rest is derived from the model’s internal weights. In other words, the NN doesn’t read the project’s whole source code to get a deeper understanding of the changes made to it.

So the question is: is it possible to make an LLM read and learn the whole project codebase, and give exact, temperature-0 answers about what new changes to the code are doing?

Is it possible to update the LLM’s knowledge of the codebase when new commits are merged?

Is it possible for the LLM to warn about potential bad practices and problems with the diff? For example, when the change alters behavior in parts of the codebase that are not visible in the diff.
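One way to approximate that last point without retraining anything is to look up where the identifiers touched by a diff are used elsewhere, and pass those files to the model as extra context. Below is a minimal sketch, assuming a unified diff in git's format and Python-style `def`/`class` definitions. The helpers `changed_identifiers` and `files_referencing` are hypothetical names, and real tooling would likely use a parser or language server rather than a regex.

```python
import re

# Assumption for illustration: only Python-style "def"/"class" definitions are recognized.
DEF_RE = re.compile(r"\b(?:def|class)\s+([A-Za-z_]\w*)")

def changed_identifiers(diff_text):
    """Collect function/class names from changed lines and hunk headers of a unified diff."""
    names = set()
    for line in diff_text.splitlines():
        # Skip the file headers ("--- a/x.py", "+++ b/x.py").
        if line.startswith(("+++", "---")):
            continue
        # git puts the enclosing definition after the second "@@" of a hunk header.
        if line.startswith(("+", "-", "@@")):
            names.update(DEF_RE.findall(line))
    return names

def files_referencing(names, sources, touched):
    """Return {path: sorted names} for files outside the diff that mention a changed name."""
    hits = {}
    for path, text in sources.items():
        if path in touched:
            continue
        found = sorted(n for n in names if re.search(rf"\b{re.escape(n)}\b", text))
        if found:
            hits[path] = found
    return hits

# Made-up example data: one changed file and two files the diff does not show.
diff = """\
--- a/billing.py
+++ b/billing.py
@@ -1,2 +1,2 @@ def total(items):
-    return sum(items)
+    return round(sum(items), 2)
"""
sources = {
    "billing.py": "def total(items):\n    return round(sum(items), 2)\n",
    "report.py": "from billing import total\nprint(total([1.005, 2]))\n",
    "util.py": "def helper():\n    return 1\n",
}
names = changed_identifiers(diff)
affected = files_referencing(names, sources, touched={"billing.py"})
print(affected)
```

The files in `affected` are the ones a reviewer, human or model, would not see in the diff but whose behavior may have changed.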

Wouldn’t this require knowledge of external project dependencies as well?

Dependencies usually have a well-documented API, and that should be enough to understand the expected behavior of calling them.

What are the results with your suggested approach?

I don’t understand LLMs well enough to get any results.

How about mentioning your idea on the GitLab thread?

The thread mentions that GitLab uses the PaLM 2 (text-bison) model from Google’s Vertex AI. The Vertex AI docs say that models can be “tuned” for specific use cases using input-output examples. But I don’t understand how to represent 1 GB of source code as an input-output dataset.
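One hedged way to think about it: the tuning data does not have to be the raw source tree. Each past merge could become one example, with the diff as input and the human-written description as output. Below is a minimal sketch, assuming JSON Lines records with `input_text` and `output_text` fields (the exact schema should be checked against the current Vertex AI tuning docs). The `commits` data, the `build_examples` and `to_jsonl` helpers, and the prompt wording are all hypothetical.

```python
import json

def build_examples(commits, max_chars=8000):
    """Turn (diff, description) pairs into input/output records, skipping unusable ones."""
    examples = []
    for diff, description in commits:
        # Skip pairs with no target text or with a diff too large for one example.
        if not description.strip() or len(diff) > max_chars:
            continue
        examples.append({
            "input_text": "Summarize this change:\n" + diff,
            "output_text": description.strip(),
        })
    return examples

def to_jsonl(examples):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(e) for e in examples)

# Made-up history: the second entry has no description and is dropped.
commits = [
    ("-    return sum(items)\n+    return round(sum(items), 2)\n",
     "Round invoice totals to cents."),
    ("+import os\n", ""),
]
jsonl = to_jsonl(build_examples(commits))
print(jsonl)
```

A dataset like this would mostly teach the model the project's summary style. It probably would not make the model memorize the codebase itself, so it only partly addresses the original question.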

Please see this short course.

Thanks. I actually started it two months ago, but because the learning platform videos don’t play in the browser, I couldn’t get far.

Does this help?

I download the videos and then open them with VLC, but then I lose the ability to watch subtitles.

Sorry, I don’t know how to download subtitles for the video. Can’t you save the text from the transcript section to your machine as a text file and use that?

I can right-click, inspect element, expand the <video> tag, then click the link to the .vtt file to download it next to the video file. Then VLC opens the video with subtitles. But that’s an underworld of user experience. If the DLAI platform were open source, I would have fixed it already.
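Those manual steps could be scripted. Below is a minimal sketch using only the Python standard library, assuming the page HTML contains a `<video>` element with a `<track>` child whose `src` points at the .vtt file. The sample markup, the URLs, and the `TrackFinder` and `subtitle_urls` names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class TrackFinder(HTMLParser):
    """Collect the src of every <track> nested inside a <video> element."""

    def __init__(self):
        super().__init__()
        self.in_video = False
        self.tracks = []

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.in_video = True
        elif tag == "track" and self.in_video:
            src = dict(attrs).get("src")
            if src:
                self.tracks.append(src)

    def handle_endtag(self, tag):
        if tag == "video":
            self.in_video = False

def subtitle_urls(page_html, page_url):
    """Return absolute URLs of subtitle tracks found in the page."""
    finder = TrackFinder()
    finder.feed(page_html)
    # Track paths are often relative, so resolve them against the page URL.
    return [urljoin(page_url, src) for src in finder.tracks]

# Hypothetical page fragment and URL.
html = '<video src="lesson1.mp4"><track kind="subtitles" src="subs/lesson1.vtt"></video>'
urls = subtitle_urls(html, "https://example.com/course/")
print(urls)
```

Saving each file next to the video under the same base name (for example `lesson1.mp4` and `lesson1.vtt`) should let VLC pick the subtitles up, as in the manual approach.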

I’ve notified the staff of your request for a link or button to download subtitles. Let’s wait and see.

The better fix is to just encode the content as AV1 videos: less bandwidth and improved UX. I’ve sent my resume to Careers - DeepLearning.AI to fast-track the fix.