Question on the loss function of reward model

Moli_Yang · July 13, 2024, 2:11pm

Hi,
I’m taking the week 3 lecture "[RLHF: Obtaining feedback from humans](https://www.coursera.org/learn/generative-ai-with-llms/lecture/lQBGW/rlhf-obtaining-feedback-from-humans)".

In the lecture, the teacher says that the loss function of the reward model is log(sigmoid(r_j - r_k)). But I think the loss function should be -log(sigmoid(r_j - r_k)) when the preferred completion is y_j. The sigmoid should be close to 1 when r_j - r_k > 0, so log(sigmoid(r_j - r_k)) is close to 0 from below. Since we want to minimize the loss, we should add a minus sign, which gives -log(sigmoid(r_j - r_k)).

Is this understanding correct, or is there an alternative explanation for the formulation of the loss function as presented in the lecture?
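A quick numeric check of the sign argument, as a minimal sketch: the reward values below are hypothetical and chosen only for illustration, and `reward_pair_loss` is a name made up here, not something from the lecture. It evaluates -log(sigmoid(r_j - r_k)) for a pair where y_j is the preferred completion.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def reward_pair_loss(r_j: float, r_k: float) -> float:
    """-log(sigmoid(r_j - r_k)), where y_j is the human-preferred completion.

    Hypothetical helper for illustration only.
    """
    return -math.log(sigmoid(r_j - r_k))


# Hypothetical reward scores (assumptions, not values from the course).
good = reward_pair_loss(r_j=2.0, r_k=-1.0)  # preferred completion scored higher
tie = reward_pair_loss(r_j=0.5, r_k=0.5)    # both completions scored the same
bad = reward_pair_loss(r_j=-1.0, r_k=2.0)   # preferred completion scored lower

print(f"good ranking: {good:.4f}")
print(f"tie:          {tie:.4f}")
print(f"bad ranking:  {bad:.4f}")
```

With the minus sign, the value is small when the reward model ranks the preferred completion higher and large when it ranks it lower, which is what a loss to be minimized should do. Without the minus sign, log(sigmoid(r_j - r_k)) is never positive and is an objective to maximize, so the two forms describe the same optimum and differ only in whether one speaks of minimizing or maximizing.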

gent.spah · July 15, 2024, 10:32am

Hello, I watched the video and couldn’t find where she gives this formula.

Maybe this post can help you!
