In the second video of Week 2 (Instruction fine-tuning), Mike describes the data prep tools for LLM fine-tuning: a large corpus of (say) Amazon reviews is formatted as prompts, fed to the LLM, and the outputs are compared with the labels.
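To make that setup concrete, here is a minimal sketch of the prompt-formatting step, assuming each raw review already arrives with a sentiment label attached (the field names, prompt template, and example data are illustrative, not taken from the video):

```python
# Minimal sketch: wrap raw reviews in an instruction-style prompt and pair each
# with its label. The "review"/"label" fields are hypothetical; they stand in
# for whatever labeled dataset the fine-tuning pipeline actually uses.

def build_example(review: str, label: str) -> dict:
    """Turn one labeled review into a prompt/completion training pair."""
    prompt = (
        "Classify the sentiment of the following review as positive or negative.\n\n"
        f"Review: {review}\n\nSentiment: "
    )
    return {"prompt": prompt, "completion": label}

# Hypothetical labeled examples; during fine-tuning the model's output for each
# prompt is compared against the "completion" (the ground-truth label).
raw_data = [
    {"review": "Godfather is great", "label": "positive"},
    {"review": "The plot dragged and the ending fell flat", "label": "negative"},
]

training_pairs = [build_example(r["review"], r["label"]) for r in raw_data]
for pair in training_pairs:
    print(pair["prompt"] + pair["completion"])
```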
Question: where do the labels come from? For example, the Amazon review "Godfather is great" should be classified as positive, but where does this ground truth come from? It is not present in the raw Amazon review…
Whatever the review says, maybe the ground truth is a short summary of the entire review, done by humans?