This is a very exciting course! As someone who is familiar with classical ML but new to the Transformers world, I am definitely learning a lot!
In the lab, we were shown how to train a model using fine-tuning and PEFT. However, the training is limited to only 1 epoch (max_steps=1), and there is no explanation of the reasoning behind how the training parameters were chosen (e.g. learning_rate, weight_decay). There are tons of other parameters and optimizer options as well in the Trainer class from Hugging Face.
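To make the question concrete, here is roughly the kind of setup I mean. The values below are placeholders I made up for illustration, not the lab's actual choices:

```python
# A minimal sketch of the kind of TrainingArguments the lab uses.
# The output_dir and all hyperparameter values here are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./peft-output",
    learning_rate=1e-5,    # how is this value chosen?
    weight_decay=0.01,     # and this one?
    num_train_epochs=1,    # the lab caps training at 1 epoch
    max_steps=1,           # ...and max_steps=1
    # ...plus many other knobs: optim, warmup_steps, lr_scheduler_type, etc.
)
```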
When fine-tuning or performing PEFT, is there a general guideline on which parameters should be used and what values they should be set to? For example, the learning rate is set to a certain value in the lab, and so is the weight_decay. In the Transformer world, is this generally enough, or should we run experiments just like in classical ML to find the optimal values for these parameters?
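For instance, is something like the sketch below the expected workflow? I'm basing this on the Trainer.hyperparameter_search API with the Optuna backend; the model, search ranges, and datasets are assumptions on my part, not anything from the lab:

```python
# A sketch of "experiment like classical ML" using the Trainer's built-in
# hyperparameter_search (requires optuna to be installed).
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

def model_init():
    # Re-create the model fresh for each trial. Placeholder: whatever
    # PEFT-wrapped model the lab actually uses would go here instead.
    return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

def hp_space(trial):
    # Illustrative search ranges, not recommendations.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-3, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
    }

trainer = Trainer(
    model_init=model_init,  # model_init (not model) so each trial starts fresh
    args=TrainingArguments(output_dir="./hp-search", max_steps=100),
    train_dataset=train_dataset,  # assumed: tokenized datasets from the lab
    eval_dataset=eval_dataset,
)

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    backend="optuna",
    n_trials=10,
    direction="minimize",  # minimize eval loss
)
print(best_run.hyperparameters)
```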
Can anyone share any tips or references on how to set these training parameters for fine-tuning and PEFT training?