Failing to achieve 2 folds improvement of the peft lora example in week2 lab

spartanlq · April 25, 2025, 1:45pm

Hello,
I am trying to repeat the lab 2 fine-tuning exercise using peft approach on my own Mac mini. I am using flan-T5-small so that the model can fit in the 16G memory. I increased max_step to 10000 and I increased no of epochs to 5. The training finished after 24 hours. However, I only observed 0% - 3% improvement of the rouge scores by the peft model against the original model. Could anybody tell me what I can do to obtain the similar level of improvement as demonstrated in the Lab 2 vedio ?

Here are my peft configurations and training configurations:

lora_config = LoraConfig(r=32, lora_alpha=32, target_modules=[‘q’, ‘v’, ‘k’, ‘o’],
lora_dropout=0.05, bias=‘lora_only’,task_type=TaskType.SEQ_2_SEQ_LM)

peft_training_args=TrainingArguments(
output_dir=output_dir,
learning_rate=1e-4, # Better LR for small model
num_train_epochs=5, # Proper epoch-based training
max_steps=10000,
logging_steps=100,
report_to=“tensorboard”,
)

Thanks a lot!
Quan

When creating a post, please add:

Week # must be added in the tags option of the post.
Link to the classroom item you are referring to:
Description (include relevant info but please do not post solution code or your entire notebook):

Igor_Pereverzev · April 27, 2025, 7:06am

Hello,

Could you please tell me exactly what improvements you are aiming for and why you need them? 3% improvement is normal not perfect, but still decent. If you are looking for significant improvements on your personal Mac, then you would need different models and a larger number of parameters.

spartanlq · April 27, 2025, 2:34pm

Thanks Igor for the reply! For learning purpose, I am trying to repeat the 2 folds improvements demonstrated in the lab 2 Vedio. Can you please tell me what I should do ? (use larger model? use more training dataset ? )

Igor_Pereverzev · April 29, 2025, 6:25am

sure it’s rule:

Underfitting → make the model more complex.
Overfitting → increase the dataset size.

Girijesh · April 29, 2025, 6:53am

Dear @spartanlq,

Try this following quick fixes.

Remove max_steps if you’re setting num_train_epochs.
Double-check target_modules using model.named_modules().
Lower the learning rate to 3e-5( reducing it to 5e-5 or 3e-5 and observe the difference).
Validate your dataset is appropriate and large enough.
Add checkpoints or eval steps to monitor real-time improvement.
Evaluate using multiple metrics and/or manual inspection.

Let me know if you have any questions.

Topic		Replies	Views
W2 lab lora config and training parameters offline model Generative AI with Large Language Models week-module-2	2	383	October 31, 2023
PEFT-LoRA: model performance Generative AI with Large Language Models week-module-2	0	467	October 3, 2023
Week 2 Lab: Training configuration of the PEFT model Generative AI with Large Language Models ai-discussions	3	63	November 21, 2024
PEFT fine-tuning on Flan-t5-base model does not change inference results Generative AI with Large Language Models week-module-2	2	369	March 14, 2024
Lab3 peft/lora Generative AI with Large Language Models week-module-3	1	435	September 16, 2023

Failing to achieve 2 folds improvement of the peft lora example in week2 lab

Related topics