Generative AI with Large Language Models: Week 3
RLHF: Obtaining feedback from humans
We have a prompt dataset, and several prompt samples are passed to the instruct LLM to obtain completions.
We also have several human labelers who provide feedback.
I do not quite understand the figure.
In the example, we had one prompt and three possible completions.
Prompt: “My house is too hot”
Completion 1: “There is nothing you can do about hot houses.”
Completion 2: “You can cool your house with air conditioning.”
Completion 3: “It is not too hot.”
Does it mean that the same input, “My house is too hot”, was provided as three separate prompts to the same instruct LLM to return the three different completions?
Would the following two scenarios be different? (I sketch both in code below.)
- the same prompt is passed to the same instruct LLM three times, and three different completions are obtained
- three separate prompts (with the same input) are each passed to the same instruct LLM once, and three different completions are obtained
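To make the two scenarios concrete, here is a minimal sketch of what I imagine is happening, using Hugging Face transformers with FLAN-T5 as a stand-in for the instruct LLM (the model name, temperature, and sampling settings are just my assumptions, not from the lecture):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"  # assumption: any instruct LLM would do here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

prompt = "My house is too hot"
inputs = tokenizer(prompt, return_tensors="pt")

# Scenario 1: one prompt, a single call that samples k = 3 completions
outputs = model.generate(
    **inputs,
    do_sample=True,          # temperature sampling makes the completions differ
    temperature=1.0,
    num_return_sequences=3,  # k = 3 completions from the same call
    max_new_tokens=50,
)
completions_one_call = tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Scenario 2: the identical prompt sent in three separate calls
completions_three_calls = []
for _ in range(3):
    out = model.generate(**inputs, do_sample=True, temperature=1.0, max_new_tokens=50)
    completions_three_calls.append(tokenizer.decode(out[0], skip_special_tokens=True))
```

If I understand sampling correctly, both scenarios draw independently from the same model distribution, so they should be equivalent in practice, but that is partly what I am asking.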
What I understand is that there are k prompts, each with the same input, so we obtain k different completions,
and we have multiple human labelers, say L. So each input gets k*L ratings, which are then aggregated (averaged) to rank the completions.
Is this correct?
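For the k*L part, this toy example shows the kind of aggregation I have in mind (the ranks and the averaging are purely my own assumptions about how the labelers' feedback might be combined):

```python
import statistics

# k = 3 completions for the single prompt "My house is too hot"
completions = [
    "There is nothing you can do about hot houses",
    "You can cool your house with air conditioning",
    "It is not too hot",
]

# ranks[labeler][i] = rank a labeler gave completion i (1 = best); L = 2 labelers
ranks = [
    [3, 1, 2],  # labeler 1
    [2, 1, 3],  # labeler 2
]

# average the k*L = 6 ranks into one score per completion, then sort best-first
avg_rank = [statistics.mean(r[i] for r in ranks) for i in range(len(completions))]
for text, score in sorted(zip(completions, avg_rank), key=lambda pair: pair[1]):
    print(f"{score:.1f}  {text}")
```

Is this roughly the kind of aggregation the lecture means?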