When creating a post, please add:
- Week # must be added in the tags option of the post.
- Link to the classroom item you are referring to: https://www.coursera.org/learn/generative-ai-with-llms/lecture/yPzI4/lab-3-walkthrough
- Description (include relevant info but please do not post solution code or your entire notebook):
Hello, I am working on the Week 3 lab, which practices RLHF.
Is there a part of the lab where human feedback is passed to the reward model?
From the lecture, I understood that human feedback is collected, passed to the reward model, and then used to train the reward model. However, I cannot find where this step happens in the lab.
Or does this exercise just use something called scaling-human-feedback instead?