Training of NLP models

Krish_code · September 24, 2024, 1:31am

Hi all,
Correct me if my understanding is wrong: in NLP, the NN model predicts the next word when the context is given, right? The NN is trained on a large corpus of data, and for the initial predictions during training we use the teacher forcing method. When the model is ready, a context is given, let's say "I want a glass of orange {prediction word}", and the word "juice" is predicted. Am I right?

Alireza_Saei · September 24, 2024, 3:22am

Hi @Krish_code

Yes. Neural networks such as language models predict the next word based on a given context. During training, teacher forcing is used to make training more effective: the model is fed the ground-truth previous words rather than its own earlier predictions. Once trained, the model predicts the next word based on the input context.

In your example, the context is “I want a glass of orange” and the model would predict “juice” as the next word if it has learned the relation between “orange” and “juice” from the training data.
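To make the "orange" → "juice" idea concrete, here is a minimal sketch in Python. It is an assumption-heavy stand-in: it uses simple bigram counts rather than a neural network, and the toy corpus and the function names `train_bigram` and `predict_next` are made up for illustration. It only shows how a relation seen in training data turns into a next-word prediction.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows another (toy stand-in for training)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, context):
    """Return the most frequent follower of the last context word, or None."""
    last = context.lower().split()[-1]
    if last not in counts:
        return None
    return counts[last].most_common(1)[0][0]

# Hypothetical toy corpus; a real model is trained on a far larger one.
corpus = [
    "I want a glass of orange juice",
    "she drinks orange juice every morning",
    "the orange peel is bitter",
]
model = train_bigram(corpus)
prediction = predict_next(model, "I want a glass of orange")
print(prediction)
```

A real language model conditions on the whole context instead of only the last word, but the principle is the same: the prediction reflects what followed similar contexts in the training data.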

Hope it helps! Feel free to ask if you need further assistance.

Krish_code · September 24, 2024, 6:44pm

Thank you for responding to my message. Furthermore, I am diving into Transformers and LLM architectures and need someone to clear my doubts. I will definitely approach you in the near future. Thank you once again.

Alireza_Saei · September 25, 2024, 7:21am

You’re welcome! Sure, feel free to reach out whenever you need help.

Nevermnd · September 25, 2024, 8:08am

@Krish_code just worth noting, since I found this confusing when I first studied it: with ‘teacher forcing’ the targets are the inputs shifted by one position. In other words, the target is always one step ahead of the input.
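A small sketch of that one-step shift, in Python. The helper name `make_teacher_forcing_pairs` and the `<s>` / `</s>` boundary tokens are assumptions for illustration, not something taken from the course material.

```python
def make_teacher_forcing_pairs(tokens, bos="<s>", eos="</s>"):
    """Build (inputs, targets) where targets are inputs shifted left by one."""
    seq = [bos] + tokens + [eos]
    inputs = seq[:-1]   # what the model is fed at each step (ground truth)
    targets = seq[1:]   # what the model should predict at each step
    return inputs, targets

tokens = "I want a glass of orange juice".split()
inputs, targets = make_teacher_forcing_pairs(tokens)

# Each row pairs the fed-in word with the word one step ahead of it.
for fed, expected in zip(inputs, targets):
    print(f"{fed:>8} -> {expected}")
```

At every position the model sees the true previous word, not its own guess, and is scored on the word that comes next. In this example the step that feeds in "orange" has "juice" as its target.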

Also if you are really serious about LLMs you should check this: Neural Networks: Zero To Hero

I mean, the NLP course is great and it fills in a lot of gaps, so I see them as complements.
