What dataset should I use for fine-tuning a pretrained LLM?

I’ve finished your course and still don’t have a clear answer to one question. We have a lot of pretrained LLMs like Llama or Mistral. And yes, they can give incorrect answers on my tasks via prompting.

So, the question is: how can I fine-tune these models? As I understand it, they were trained on their own datasets. If I want to add more information and train the model on it, will the model be trained from scratch? Will it forget all the data it was initially trained on?

Or, if I want to preserve the model’s original answers too, should I find the dataset the model was trained on and just add my new dataset to it?

You can check the short course that explains how to fine-tune LLMs.


Oh, nice! So, as I understand, if I call .train() on a pretrained model in most libraries, it doesn’t overwrite the previous knowledge (weights) but just nudges them and adds new knowledge. I wonder how much data I would need to significantly change the initial behaviour of a pretrained model? :thinking:

And so, as I understand, one LLM is all I need for every task, if I’m going to use an LLM for one of them anyway? (No need to create smaller models for separate NLP tasks.) The only remaining work is additional training for each task? So it’s something like T5, which you mention in the course: one universal model for all tasks?

Yes, fine-tuning a pre-trained model in most libraries doesn’t overwrite the existing weights from scratch but rather adjusts them with the new data (note: you can freeze some weights in some places, e.g. the feature extractor, to keep their performance on general-purpose tasks).
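To make the freezing idea concrete, here is a minimal PyTorch sketch. The toy `nn.Sequential` model is a stand-in for a real pretrained network (the layer roles are illustrative, not from any actual LLM): the first layer plays the “feature extractor” we freeze, and the last layer is the task head we fine-tune.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-in for a pretrained model: a "feature extractor" (to freeze)
# followed by a task-specific head (to fine-tune).
model = nn.Sequential(
    nn.Linear(16, 32),  # pretend this is the pretrained feature extractor
    nn.ReLU(),
    nn.Linear(32, 4),   # task-specific head we want to fine-tune
)

# Freeze the first layer so fine-tuning only updates the head.
for param in model[0].parameters():
    param.requires_grad = False

# Hand the optimizer only the trainable parameters.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)

# One fine-tuning step on dummy data.
before = model[0].weight.clone()
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()

# The frozen extractor's weights are untouched; only the head moved.
assert torch.equal(model[0].weight, before)
```

With a Hugging Face model the pattern is the same: iterate over `model.parameters()` (or a sub-module’s) and set `requires_grad = False` before training.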

The amount of data needed to change the model’s behavior depends on the task and the size of the model, but generally a diverse dataset is required.

LLMs like T5 can handle various NLP tasks with a single model, so you don’t need to create separate models for each task. However, fine-tuning is often necessary!
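The way T5 achieves this is by casting every task as text-to-text and signalling the task with a prefix in the input string. A tiny sketch of that framing (the helper function is hypothetical; the prefixes follow the convention from the T5 paper):

```python
# One model, many tasks: only the textual prefix changes per task.
def make_t5_input(task: str, text: str) -> str:
    prefixes = {
        "summarize": "summarize: ",
        "translate_en_de": "translate English to German: ",
        "cola": "cola sentence: ",  # grammatical-acceptability task
    }
    return prefixes[task] + text

print(make_t5_input("summarize", "The quick brown fox ..."))
print(make_t5_input("translate_en_de", "The house is wonderful."))
# Each resulting string would be fed to the same T5 model.
```

Because input and output are both plain text, adding a new task is just a matter of fine-tuning with a new prefix rather than building a new architecture.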

Hope this helps!


I’ve just started watching https://www.coursera.org/learn/generative-ai-with-llms. I think it’s the most useful NLP course on deeplearning.ai. I’m sure it should be included in the NLP Specialization.


Hey there,

Good to hear that you liked the course and found it practical!

@Alireza_Saei and @someone555777, I’m looking to start my next course, and I was wondering about the NLP Specialization. Then I read your comments on the Generative AI with LLMs course, which seems good.
If you had to choose, which one would you do first?


Hey there @fabricio

The NLP Specialization starts from the basics of NLP, covering n-grams, POS tagging, RNNs, attention models, transformers, etc. It’s excellent if you have no prior knowledge and want a solid foundation. However, if you’re eager to dive into the LLM field and start projects quickly, the Generative AI with LLMs course might be the better choice.

If you’re not in a rush, starting with the NLP Specialization and then switching to the Generative AI with LLMs course can be a great path.

Hope this helps! If you need further assistance, feel free to ask. And good luck with your learning journey! :raised_hands:



Your comment is very helpful. Thanks.
I am not in a rush. I like to learn all the fundamentals first and then build more advanced knowledge on a solid foundation.

I think I’ll follow your suggested learning path, sir.
Once again, thank you!

You’re welcome! I’m glad you found my comment useful. I wish you the best of luck on your learning journey! :raised_hands:


But you should also know that about 80% of the NLP Specialization isn’t connected with LLMs at all. It’s about solving specific tasks with simpler methods that are more hardware-efficient and respond faster. But if you end up needing an LLM anyway, those approaches often don’t apply.


Thanks. Yes, I understand that LLMs are just a tiny fraction of what NLP entails. I’m interested in learning everything.
LLMs are great, but I want to go beyond them.