Hi,

I’m taking the week 3 lecture "[RLHF: Obtaining feedback from humans](https://www.coursera.org/learn/generative-ai-with-llms/lecture/lQBGW/rlhf-obtaining-feedback-from-humans#)".

In the lecture, the instructor says that the loss function of the reward model is log(sigmoid(r_j - r_k)). But I think the loss function should be -log(sigmoid(r_j - r_k)) when the preferred completion is y_j. The sigmoid gets closer to 1 as r_j - r_k becomes more positive, so log(sigmoid(r_j - r_k)) approaches 0, which is its maximum. If we want a quantity to minimize, we need to add the minus sign: -log(sigmoid(r_j - r_k)) approaches 0 when the preferred completion gets the higher reward and grows large when r_j - r_k is negative.
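To sanity-check my reasoning, I wrote a minimal Python sketch. The function names `sigmoid` and `pairwise_loss` are my own, not from the lecture, and I'm assuming y_j is the preferred completion. It prints both versions of the expression for a few reward differences.

```python
import math

def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_loss(r_j: float, r_k: float) -> float:
    """-log(sigmoid(r_j - r_k)), assuming y_j is the preferred completion."""
    return -math.log(sigmoid(r_j - r_k))

# Compare the two candidate formulas as the reward gap changes.
for diff in (-2.0, 0.0, 2.0):
    with_minus = pairwise_loss(diff, 0.0)
    without_minus = -with_minus  # log(sigmoid(r_j - r_k)), as stated in the lecture
    print(f"r_j - r_k = {diff:+.1f}   -log(sigmoid) = {with_minus:.4f}   log(sigmoid) = {without_minus:.4f}")
```

With the minus sign, the value shrinks toward 0 as r_j - r_k grows, so minimizing it pushes the preferred completion's reward up. Without the minus sign, the value is never positive and would have to be maximized instead.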

Is this understanding correct, or is there an alternative explanation for the loss function as it is presented in the lecture?