PEFT-LoRA: model performance

Using the Lab 2 notebook, I tried some fine-tuning with LoRA (on a GPU) but got much worse ROUGE scores than even the original model, and I wonder why.

I was using the following training arguments, with ~1,250 question-answer pairs for training (10× more than the quick notebook training example):

peft_training_args = TrainingArguments(
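(The full call got cut off above. For illustration only, a configuration for this kind of run might look roughly like the following; every path and value here is my own guess, not the exact arguments from the notebook.)

```python
from transformers import TrainingArguments

# Illustrative sketch only -- the actual argument values were truncated above.
peft_training_args = TrainingArguments(
    output_dir="./peft-training",        # hypothetical output path
    learning_rate=1e-3,                  # LoRA typically tolerates a higher LR than full fine-tuning
    num_train_epochs=5,                  # placeholder; part of what I'm asking about
    per_device_train_batch_size=8,
    weight_decay=0.01,
    logging_steps=10,
)
```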

My thoughts are: the low-rank matrices are initialized with random values, so if I don't train long enough, adding those adapters might inject more randomness than knowledge into my model. Any thoughts on whether I am on the right path here?
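One way to sanity-check this idea (my own toy sketch, not the peft library itself): if the adapters follow the usual LoRA initialization, only the A matrix starts random, while B starts at zero, so the adapter's contribution B·A should be exactly zero before any training.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 512, 512, 8               # hypothetical layer dims and LoRA rank
A = rng.normal(0.0, 0.02, (r, k))   # A: random Gaussian init
B = np.zeros((d, r))                # B: zero init (standard LoRA)
delta_W = B @ A                     # adapter update added to the frozen weight
print(np.abs(delta_W).max())        # 0.0 -> untrained adapters are a no-op
```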

If I am right, how should I train (how many epochs? the full dataset of 12k pairs?) to achieve results similar to those of the PEFT-LoRA checkpoint we're using in the Lab?