Using the Lab2 notebook, I experimented with LoRA fine-tuning (on a GPU), but got much worse ROUGE scores than even the original model. I wonder why.
I used the following training arguments, with ~1250 question-answer pairs for training (10x more than the quick training example in the notebook):
peft_training_args = TrainingArguments(
    output_dir=output_dir,
    auto_find_batch_size=True,
    learning_rate=1e-3,
    num_train_epochs=1,
    logging_steps=1,
    max_steps=-1
)
My thoughts: the low-rank matrices are initialized with random values, so if I don't train long enough, adding those adapters might just add more randomness than knowledge to my model. Am I on the right path here?
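One way I could sanity-check this: as far as I understand, the LoRA paper initializes the matrix A randomly but the matrix B to zero, so the update B·A is zero before any training. Below is a minimal NumPy sketch under that assumption. The sizes and variable names are made up for illustration, and this is not the PEFT library's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 32, 4  # hypothetical layer sizes and LoRA rank

W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))  # LoRA "A": random init (assumed)
B = np.zeros((d_out, r))                    # LoRA "B": zero init (assumed)

delta_W = B @ A                             # low-rank update, all zeros at init
x = rng.normal(size=(d_in,))

base_out = W @ x                            # output of the original layer
lora_out = (W + delta_W) @ x                # output with the untrained adapter

print(np.allclose(base_out, lora_out))      # True: adapter is a no-op at init
```

If B really starts at zero in the Lab's setup, the untrained adapter alone would not change the model's outputs, so I am not sure my explanation fully holds.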
If I am right, how should I train (how many epochs? the full dataset of 12k pairs?) to achieve results similar to the PEFT-LoRA checkpoint we're using in the Lab?
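To put the two options in perspective, here is a rough sketch of how many optimizer steps each would give. The batch size of 8 is only a guess (with `auto_find_batch_size=True` the actual value is chosen at run time), `optimizer_steps` is a hypothetical helper, and I am assuming a single GPU with no gradient accumulation. The dataset sizes are the approximate ones mentioned above.

```python
import math

def optimizer_steps(num_examples: int, batch_size: int, epochs: int = 1) -> int:
    """Rough count of optimizer steps: one step per batch, per epoch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs

batch_size = 8  # assumed; the real value depends on auto_find_batch_size

subset_steps = optimizer_steps(1250, batch_size, epochs=1)   # my run (~1250 pairs)
full_steps = optimizer_steps(12000, batch_size, epochs=1)    # full dataset (~12k pairs)

print(subset_steps, full_steps)  # 157 1500
```

Under these assumptions, my run took only about a tenth as many update steps as one epoch over the full dataset would.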