How do we practically perform fine-tuning & pretraining on an existing LLM?

The lecture gives an overview of how to fine-tune and pretrain LLMs, but how is this achieved practically? It would be good to know from a developer's perspective. In other words, how do I supply a smaller dataset of thousands of task-specific samples (for fine-tuning, for example) to, say, GPT-4o?

Typically:

  • You obtain the full set of weights and the architecture of the model.
  • You freeze all of the weights in the model except for the output layer.
  • Then you train just the output-layer weights on your specific additional examples.

The Deep Learning Specialization has an exercise with an example of this method.
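In code, that looks roughly like the PyTorch sketch below, using the Hugging Face transformers library. "gpt2" is just a stand-in for any model whose weights you can actually download; the model name, learning rate, and training text are illustrative assumptions, not a recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Freeze every weight in the model...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the output (language-model head) layer.
# (Caveat: in GPT-2 the head shares its weight matrix with the input
# embedding, so unfreezing one unfreezes both.)
for param in model.lm_head.parameters():
    param.requires_grad = True

# The optimizer only sees the unfrozen parameters.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

# One illustrative gradient step; in practice you would loop over batches
# drawn from your few thousand task-specific examples.
batch = tokenizer("One of your task-specific training examples.",
                  return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Note that this whole approach depends on having the weights in hand, which is why it applies to open-weights models rather than to hosted ones like GPT-4o.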

Thanks. I am not so familiar with ML and am learning generative AI application development. So, is supplying a smaller dataset to the model something exposed through the API associated with the model (like the OpenAI API), or is there some other way to achieve it? And can closed-source models also be fine-tuned this way?

Sorry, I can't answer questions about how the OpenAI API works.
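For what it's worth, hosted providers generally expose fine-tuning as a managed service: you upload your examples through the API and the provider trains on its own infrastructure, without ever giving you the weights. A minimal sketch with the OpenAI Python SDK (v1.x) follows; the file name train.jsonl and the model name are assumptions here, so check the provider's docs for which models your account can actually fine-tune.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# train.jsonl holds one chat-formatted example per line, e.g.:
# {"messages": [{"role": "user", "content": "..."},
#               {"role": "assistant", "content": "..."}]}
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job; training happens on the provider's side.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # an assumption; pick a tunable model
)
print(job.id)  # poll this job until it finishes, then use the new model id
```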