Some questions about LLM Training

gazhuchao · March 5, 2024, 3:21pm

Hi, I want to train an open-source model to automate code generation for a specific programming language, and I have some questions about LLM training:

How do I choose an appropriate open-source model?
How do I collect data, and how do I normalize the collected data?
How do I use this data to train the open-source model?
How do I fine-tune the model to achieve the best results?
How do I apply the trained model in real scenarios?

Could you please help me answer these questions, or point me to some learning materials I can refer to?
Any input will be appreciated!

TMosh · March 5, 2024, 7:25pm

Meta spent about a gazillion dollars researching this, for the coding tools in their Llama 2 project.

It’s quite a lot to handle doing this by yourself.

Maybe explore the state of the art a little by attending the Llama 2 short course.

gazhuchao · March 6, 2024, 5:24am

Instead of a large model trained on massive amounts of data, I want a small, proprietary model (fewer than ten billion parameters) trained only on the rules and paradigms of one programming language. That language already has a website with instructions and code files for the related paradigms. Through this learning task, I hope to learn how to help companies train their own specialized models with deep vertical expertise.
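
To make the data side of this plan concrete, here is a minimal Python sketch of one way the collected instructions and code files could be normalized into training records. It assumes the data has already been gathered as (instruction, code) pairs; the function names and the prompt/completion layout are hypothetical choices for illustration, not a format required by any particular training library.

```python
import hashlib
import json


def normalize_source(text: str) -> str:
    """Unify line endings, strip trailing whitespace, trim blank edges."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip() for line in lines).strip("\n")


def build_records(samples):
    """Turn (instruction, code) pairs into deduplicated training records.

    The "prompt"/"completion" keys are an assumed layout, chosen only
    for this example.
    """
    seen = set()
    records = []
    for instruction, code in samples:
        cleaned = normalize_source(code)
        if not cleaned:
            continue  # skip files that are empty after normalization
        digest = hashlib.sha256(cleaned.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # skip exact duplicates after normalization
        seen.add(digest)
        records.append({"prompt": instruction.strip(), "completion": cleaned})
    return records


def to_jsonl(records) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Calling `to_jsonl(build_records(pairs))` on a list of pairs gives one JSON object per line. Exact-match deduplication is only a starting point; near-duplicate detection and license filtering would likely also matter for a real dataset.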

Is there anything wrong with my idea? I hope to get your guidance.

TMosh · March 6, 2024, 7:08am

Sorry, I don’t have any specific guidance in this area.
