Does the reward model need retraining with domain-specific inputs?

saileshbaidya · November 5, 2023, 7:46pm

I think the reward model has to be pretrained with the set of {{completion 1, completion 2}, {human label 1, human label 2}}, right?

Jamal022 · November 5, 2023, 8:03pm

Hey @saileshbaidya,

Yes, in reinforcement learning from human feedback the reward model is typically trained first, on a dataset of completion pairs and their corresponding human preference labels, so that it can score new completions and guide the model’s behavior.
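To make the {completions, human labels} idea concrete, here is a minimal sketch of training on preference pairs. It assumes a pairwise (Bradley-Terry style) loss, which is one common choice and not necessarily what the course lab uses. The linear `reward` function, the hand-made feature vectors, and every name below are hypothetical stand-ins for a real language-model-based reward model.

```python
import math

def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def reward(weights, features):
    # Hypothetical linear reward model: one scalar score per completion.
    return sum(w * f for w, f in zip(weights, features))

def pairwise_loss(weights, chosen, rejected):
    # -log(sigmoid(margin)), written as a stable softplus of -margin.
    margin = reward(weights, chosen) - reward(weights, rejected)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

def train(pairs, dim, lr=0.5, epochs=200):
    # Plain stochastic gradient descent on the pairwise loss.
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # d(loss)/d(margin) = sigmoid(margin) - 1
            grad_scale = sigmoid(margin) - 1.0
            for i in range(dim):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights

# Made-up toy data: (features of the human-preferred completion,
#                    features of the other completion).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.9, 0.2], [0.1, 0.8]),
    ([0.8, 0.1], [0.3, 0.9]),
]
weights = train(pairs, dim=2)

for chosen, rejected in pairs:
    # After training, the preferred completion should score higher.
    print(reward(weights, chosen) > reward(weights, rejected))
```

The human label only enters through which completion is placed first in each pair, so the model learns a relative ranking and not an absolute score.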

Cheers!
Jamal

saileshbaidya · November 5, 2023, 8:12pm

Awesome, thanks!
