PEFT training

This is definitely a very exciting course! Coming from classical ML and being new to the Transformers world, I am learning a lot!

In the lab, we were shown how to train a model using full fine-tuning and PEFT. However, the training is limited to only 1 epoch (max_steps=1), and there is no explanation of the reasoning behind how the training parameters were chosen (e.g. learning_rate, weight_decay). There are tons of other parameters, and other optimizers, available in the Hugging Face Trainer class as well.
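For context, a lab-style setup typically passes these values through TrainingArguments and a LoraConfig. The specific numbers below are illustrative assumptions on my part (common community defaults), not the lab's actual settings:

```python
from transformers import TrainingArguments
from peft import LoraConfig

# Illustrative values only -- not the lab's actual settings.
training_args = TrainingArguments(
    output_dir="./peft-out",
    learning_rate=1e-4,             # LoRA often tolerates a higher LR than full fine-tuning
    weight_decay=0.01,              # a common AdamW default in examples
    num_train_epochs=3,             # the lab capped training at max_steps=1 to save time
    per_device_train_batch_size=8,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
)

lora_config = LoraConfig(
    r=16,                           # rank of the low-rank update matrices
    lora_alpha=32,                  # scaling factor for the update
    lora_dropout=0.05,
    task_type="SEQ_2_SEQ_LM",       # assumption: a seq2seq model as in the course lab
)
```

These are exactly the knobs the question is about; the point is that each one is an explicit, tunable argument rather than something fixed by the library.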

When fine-tuning or performing PEFT, is there a general guideline on which parameters should be used and what values to set them to? For example, the learning rate is set to a certain value in the lab, and so is the weight_decay. In the Transformers world, is this generally enough, or should we run experiments, just as in classical ML, to find the optimal values?

Can anyone share any tips or references on how to set these training parameters for fine-tuning and PEFT?

My thinking is to reuse parameters that worked in similar scenarios; whoever published them also went through some trial and error to find them.

Generally speaking, choosing hyperparameters in ML is a trial-and-error process!
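That trial-and-error process can be made systematic with a simple sweep, exactly as in classical ML. A minimal sketch, where `evaluate` is a placeholder standing in for a real fine-tuning run that returns a validation loss (in practice it would plug the candidate values into TrainingArguments and call `trainer.train()` / `trainer.evaluate()`):

```python
import itertools

def evaluate(learning_rate, weight_decay):
    # Placeholder objective: pretends 1e-4 / 0.01 is the sweet spot.
    # Replace with a real fine-tuning run returning validation loss.
    return abs(learning_rate - 1e-4) * 1e4 + abs(weight_decay - 0.01)

learning_rates = [1e-5, 1e-4, 1e-3]
weight_decays = [0.0, 0.01, 0.1]

# Grid search: try every combination, keep the one with the lowest loss.
best = min(
    itertools.product(learning_rates, weight_decays),
    key=lambda combo: evaluate(*combo),
)
print(best)  # -> (0.0001, 0.01)
```

Because each PEFT run is cheap relative to full fine-tuning, even a small grid like this is often affordable; random search or Optuna-style tuning scales better when the grid grows.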