Model configurations for RLHF

I have a question about the week 3 lab. I am not sure what role AutoModelForSeq2SeqLMWithValueHead plays in RLHF. Is it the model being trained, the reference model (used in the KL divergence term to avoid reward hacking), or the reward classifier (which I doubt, since there is AutoModelForSequenceClassification for that)? I also don't understand why it has a value head. What is its role?

Also, shouldn't ref_model = create_reference_model(ppo_model) be ref_model = create_reference_model(peft_model)? I thought the reference model is the model fine-tuned with LoRA in lab 2, without the new LoRA matrices added for PPO.



That seems to be right.
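
To make the roles in the question concrete, here is a rough sketch in plain Python, not the lab's code. It assumes the usual PPO setup: the model with the value head is the policy being trained, the value head is an extra scalar output PPO uses as a critic, and the reference model is a frozen copy of the policy taken before any PPO update. TinyPolicy, kl_penalized_reward, and all the numbers are made up for illustration. The per-token penalty score - beta * (logp_policy - logp_ref) is one common way to write the KL term.

```python
import copy
import math


class TinyPolicy:
    """Hypothetical toy stand-in for a language model with a value head."""

    def __init__(self, logits, value_weight):
        self.logits = list(logits)        # "LM head": scores over a 3-token vocab
        self.value_weight = value_weight  # "value head": hidden state -> scalar

    def log_probs(self):
        # Numerically stable log-softmax over the toy vocabulary.
        m = max(self.logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in self.logits))
        return [x - log_z for x in self.logits]

    def value(self, hidden):
        # Scalar estimate of expected reward, used by PPO as the critic.
        return self.value_weight * hidden


def kl_penalized_reward(score, logp_policy, logp_ref, beta=0.2):
    # Reward model score minus a penalty for drifting from the reference.
    return score - beta * (logp_policy - logp_ref)


policy = TinyPolicy([1.0, 0.5, -0.5], value_weight=0.1)
reference = copy.deepcopy(policy)  # frozen snapshot, never updated below

token = 0
reward_before = kl_penalized_reward(
    1.0, policy.log_probs()[token], reference.log_probs()[token]
)

# Pretend a PPO step pushed the policy toward the rewarded token.
policy.logits[token] += 2.0
reward_after = kl_penalized_reward(
    1.0, policy.log_probs()[token], reference.log_probs()[token]
)

print(reward_before, reward_after)
```

Before the update the policy and the reference are identical, so the penalty is zero. Once the policy moves, the same reward model score is discounted. If create_reference_model works like the deepcopy here (that is my understanding, but check the TRL docs), the reference made from ppo_model starts out with the same weights as the policy at the moment of the copy.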
