Week 2 | Lab 2 | Using the fine-tuned model

In Lab 2, in the section "Perform full fine-tuning", the lab instructions say: "Training a fully fine-tuned version of the model would take a few hours on a GPU. To save time, download a checkpoint of the fully fine-tuned model to use in the rest of this notebook. This fully fine-tuned model will also be referred to as the instruct model in this lab."

But what if I want to use the model we fine-tuned in the lab? The trainer.train() call somehow executed within minutes, and I then tried the following code to use the trained model:
```python
original_model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)
instruct_model = AutoModelForSeq2SeqLM.from_pretrained(output_dir, torch_dtype=torch.bfloat16)
```

But the resulting instruct_model doesn’t work the same way as the downloaded instruct_model.

I'm confused. Why did trainer.train() take only a few minutes when the instructions say it can take hours? And is the above code not the right way to use the resulting fine-tuned model? If it is the right way, why is it not working as well as the downloaded instruct_model?

You are training on a smaller dataset (huggingface_dataset_name = "knkarthick/dialogsum") and only for 1 epoch. The full fine-tuning behind the downloaded checkpoint presumably used a bigger dataset and many more epochs, I would guess, which is why the instructions mention hours on a GPU.
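For a concrete sense of why your run finishes so quickly, here is a rough sketch of the kind of setup the lab uses. The model name, subsampling factor, paths, and training arguments below are from memory and may not match your notebook exactly, so treat them as placeholders:

```python
from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TrainingArguments, Trainer

# Placeholder values -- check your own notebook for the exact model name and arguments.
model_name = "google/flan-t5-base"
original_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

dataset = load_dataset("knkarthick/dialogsum")

def tokenize_function(example):
    # Wrap each dialogue in the summarization prompt and tokenize inputs and labels.
    prompt = ["Summarize the following conversation.\n\n" + d + "\n\nSummary: " for d in example["dialogue"]]
    example["input_ids"] = tokenizer(prompt, padding="max_length", truncation=True, return_tensors="pt").input_ids
    example["labels"] = tokenizer(example["summary"], padding="max_length", truncation=True, return_tensors="pt").input_ids
    return example

tokenized_datasets = dataset.map(tokenize_function, batched=True)
tokenized_datasets = tokenized_datasets.remove_columns(["id", "topic", "dialogue", "summary"])

# Keeping only a small fraction of the examples makes the training set tiny,
# so one epoch over it runs in minutes rather than hours.
small_train = tokenized_datasets["train"].filter(lambda example, index: index % 100 == 0, with_indices=True)

training_args = TrainingArguments(
    output_dir="./dialogue-summary-training",  # placeholder path
    learning_rate=1e-5,
    num_train_epochs=1,  # a single pass over the subsampled data
    logging_steps=1,
)

trainer = Trainer(model=original_model, args=training_args, train_dataset=small_train)
trainer.train()
```

With only a tiny subset of the data and a single epoch, trainer.train() completes quickly, but the resulting weights end up barely changed from the original model's.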

Here I think you are using the original model that I mentioned above; the instruct_model is a fully fine-tuned model given to you later on in the lab. The original model, after such a short training run, cannot perform as well as the instruct_model, for the reasons above.
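If you still want to save and reload whatever your own training run produced, one pattern is to save through the Trainer, which writes out the weights it actually updated. This is only a sketch: it assumes the trainer, tokenizer, and output_dir from your snippet and from earlier in the notebook.

```python
import torch
from transformers import AutoModelForSeq2SeqLM

# `trainer`, `tokenizer`, and `output_dir` are assumed to exist from earlier in the notebook.
trainer.save_model(output_dir)         # saves trainer.model, i.e. the trained weights
tokenizer.save_pretrained(output_dir)  # save the tokenizer alongside the model

# Reload it the same way the lab loads the downloaded instruct checkpoint.
my_model = AutoModelForSeq2SeqLM.from_pretrained(output_dir, torch_dtype=torch.bfloat16)
```

Even then, don't expect it to match the downloaded instruct_model: after one short pass over a small dataset the weights are still very close to the original model's, which is exactly the gap the provided checkpoint is meant to cover.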