May I just check my understanding here?
Step 2.1 in the lab notebook has us take the example data and generate multiple prompts for the LLM, but we're not giving it any examples of what a good summary looks like:
2.1 - Preprocess the Dialog-Summary Dataset
You need to convert the dialog-summary (prompt-response) pairs into explicit instructions for the LLM. Prepend an instruction to the start of the dialog with
Summarize the following conversation
and to the start of the summary with Summary
as follows:
Training prompt (dialogue):

Summarize the following conversation.

Chris: This is his part of the conversation.
Antje: This is her part of the conversation.

Summary:

Training response (summary):

Both Chris and Antje participated in the conversation.
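To check that I'm reading the step right, here's roughly what I understand the preprocessing to be doing. This is a minimal sketch assuming the Hugging Face datasets/transformers stack; the dataset and model identifiers are my assumptions, not something copied from the notebook:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumed identifiers -- the notebook may use different ones.
dataset = load_dataset("knkarthick/dialogsum")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

def tokenize_function(example):
    # Wrap each raw dialogue in the instruction template from step 2.1.
    prompts = [
        f"Summarize the following conversation.\n\n{dialogue}\n\nSummary: "
        for dialogue in example["dialogue"]
    ]
    example["input_ids"] = tokenizer(
        prompts, padding="max_length", truncation=True, return_tensors="pt"
    ).input_ids
    # The human-written summary becomes the training target (label);
    # it never appears inside the prompt itself.
    example["labels"] = tokenizer(
        example["summary"], padding="max_length", truncation=True, return_tensors="pt"
    ).input_ids
    return example

tokenized_datasets = dataset.map(tokenize_function, batched=True)
```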
Later on we can see that the results of such training are significant, but what's slightly blowing my mind is that this seems to be basically unsupervised training, yep? In contrast to in-context learning (which is sort of a form of supervision), we're just giving it N prompts and getting N completions, with no examples of what a good completion looks like. And yet it works well. Am I understanding this correctly? If so, wow.
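For contrast, by in-context learning I mean a one-shot prompt like the sketch below, where a worked example sits inside the prompt itself (the names and dialogue here are made up for illustration):

```
Summarize the following conversation.

Tom: Hi, are we still on for lunch?
Ana: Yes, see you at noon.

Summary: Tom and Ana confirmed their lunch plans.

Summarize the following conversation.

Chris: This is his part of the conversation.
Antje: This is her part of the conversation.

Summary:
```

In the fine-tuning setup from step 2.1, no such worked example ever appears in the prompt; the summary only ever shows up as the target completion.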