PEFT fine-tuning on Flan-t5-base model does not change inference results

Hello All,

I followed the guidance on how to fine-tune a base model for a summarisation task using the LoRA technique. I applied the same code, but the final results have not changed. I’m not sure why this happened, given that the model, dataset and LoRA configuration are the same. The only difference is that I executed all steps in my local environment. I have not used the files from checkpoints.
The only thing that does not look right to me is that, after training the peft_model on the tokenized dataset, the logs show a learning_rate of 0.
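As a side note, a logged learning_rate of 0 is not always a bug: with a linear warmup-plus-decay schedule (a common default in trainers), the learning rate is 0 at the very first step and again at the very last step, so a log line captured at either end can legitimately read 0. A minimal sketch of such a schedule (the base rate and warmup length are assumed values for illustration):

```python
def linear_schedule_lr(step, total_steps, base_lr=1e-4, warmup_steps=10):
    """Learning rate under linear warmup followed by linear decay.

    The rate climbs from 0 to base_lr over the warmup steps, then decays
    linearly back to 0 by the final step - so logs at the very start or
    the very end of training can show learning_rate = 0.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_schedule_lr(0, 100))    # 0.0 at the first step
print(linear_schedule_lr(10, 100))   # peak: 0.0001 right after warmup
print(linear_schedule_lr(100, 100))  # 0.0 at the last step
```

If every log line shows 0 throughout training, however, that points to a real configuration problem rather than the schedule.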

# load the foundation model from disk
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_local_path = 'C:\\Users\\ms\\.cache\\huggingface\\hub\\models--google--flan-t5-base\\snapshots\\7bcac572ce56db69c1ea7c8af255c5d7c9672fc2\\'
foundation_model = AutoModelForSeq2SeqLM.from_pretrained(model_local_path, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_local_path)
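For reference, this is roughly what the rest of the setup looks like when the learning rate is pinned down explicitly so the trainer cannot silently use an unexpected value. This is a hedged sketch, not your exact lab code: the rank, target modules, output directory and learning rate are assumed values, and it builds on the `foundation_model` loaded above.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import TrainingArguments

# LoRA adapter configuration (values are assumptions for illustration)
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=32,                       # rank of the low-rank update matrices
    lora_alpha=32,
    target_modules=["q", "v"],  # attention projections in flan-t5
    lora_dropout=0.05,
)
peft_model = get_peft_model(foundation_model, lora_config)
peft_model.print_trainable_parameters()  # sanity check: trainable % should be > 0

training_args = TrainingArguments(
    output_dir="./peft-flan-t5-summarize",  # assumed path
    learning_rate=1e-3,  # set explicitly rather than relying on a default
    num_train_epochs=1,
    logging_steps=1,
)
```

If `print_trainable_parameters()` reports zero trainable parameters, the adapter was not attached correctly, which would also explain inference results that never change.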

[image attachment]

I am not sure if there are other files needed to run the labs on your machine, but may I say that these two functions/methods:

foundation_model = AutoModelForSeq2SeqLM.from_pretrained(model_local_path, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_local_path)

actually have more parameters that you might need to set.

Search for them on Google and see whether any of them has a learning-rate parameter!

Many thanks for your reply,
I’m not sure exactly what you are referring to. However, the model (all the files listed in the attached picture) was successfully downloaded to my disk, and I was then able to load it and start interacting with it (asking questions and getting answers).
