Training an LLM

Hello
I need some help understanding how exactly an LLM is trained from scratch. I just need to know what kinds of inputs and outputs are given to the LLM during training. Is the LLM trained autoregressively for every task, such as language understanding, code generation, reasoning, etc.? How exactly are CoT, PAL, and ReAct techniques used during training?
In coding tasks especially, how do they ensure that the model generates reliable code?
Thanks


You want to know exactly how many complex processes are involved, and you want it all in a single post! 🙂

You should try some of our Specializations here to get some grounding in these subjects.

As a starting point, an LLM is trained in much the same way as you train an NLP model, just with much more data and much bigger models. Accuracy improves as the model keeps being trained on the data that is fed to it.
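
To make the inputs and outputs concrete, here is a minimal sketch of autoregressive next-token training (PyTorch is assumed just for illustration, and the tiny embedding + linear stack stands in for the real transformer): the input is a sequence of token IDs, and the target is the same sequence shifted one position to the left, trained with cross-entropy loss on the next-token predictions.

```python
import torch
import torch.nn as nn

# Toy setup: vocabulary of 100 token IDs, batch of 2 sequences of length 8.
# In a real LLM the "model" would be a large transformer; this tiny stack
# only shows the input/target/loss wiring.
vocab_size, seq_len, batch, d_model = 100, 8, 2, 32

model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.Linear(d_model, vocab_size),  # one logit per vocabulary token
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

tokens = torch.randint(0, vocab_size, (batch, seq_len))  # token IDs from the corpus

inputs  = tokens[:, :-1]  # the model sees tokens 0..n-2
targets = tokens[:, 1:]   # and must predict tokens 1..n-1 (shifted by one)

logits = model(inputs)    # shape: (batch, seq_len - 1, vocab_size)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()
optimizer.step()
```

At the pre-training stage this same objective is used regardless of task: whether the text is prose, code, or reasoning traces, the model is simply learning to predict the next token of whatever mix of data it is trained on.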


Thanks for the response.
I have completed all of the courses you mentioned. My doubt is regarding Chain of thought. Is the model tasked to generate Chain of thoughts or whether the model is given chain of thoughts at input while training and then tasked to generate answer.
If the model is given COT at input , how the model generalizes at the time of inference (as we are not providing any COT at inference).
Thank you in advance :slightly_smiling_face: .

I don't think the model is given a chain of thought in the training phase, but these LLMs use a transformer architecture with attention, which is capable of focusing on certain parts of the input and can also be trained in an unsupervised way!
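
To show what "attention that focuses on certain parts of the input" looks like, here is a minimal single-head sketch of scaled dot-product attention with a causal mask (again plain PyTorch, assumed only for illustration, with the learned Q/K/V projections omitted). The causal mask is what makes the training autoregressive: each position can only look at itself and earlier tokens.

```python
import math
import torch

def causal_self_attention(x):
    """Minimal single-head self-attention over x of shape (batch, seq, d).

    Each position attends only to itself and earlier positions, which is
    what lets the model be trained to predict the next token.
    """
    batch, seq, d = x.shape
    # In a real transformer, Q, K, V come from learned linear projections of x;
    # here x is used directly to keep the sketch short.
    q, k, v = x, x, x

    scores = q @ k.transpose(-2, -1) / math.sqrt(d)   # (batch, seq, seq)
    mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))  # hide future positions
    weights = torch.softmax(scores, dim=-1)           # attention weights
    return weights @ v                                # weighted mix of values

out = causal_self_attention(torch.randn(2, 5, 16))
print(out.shape)  # torch.Size([2, 5, 16])
```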
