Is RLHF like fine-tuning a classification model at the end of each generation?

Paramdeep · July 6, 2023, 12:09pm

RLHF essentially selects one output from several candidate completions for a prompt. Can this be treated like a classification model applied at the end of each generation?

Just like we had prompt fine-tuning, is it possible to update the model with this classification model?

Atharva_Divekar · July 8, 2023, 2:43pm

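Yes, in part. The reward model in RLHF is trained much like a classifier: given several completions for the same prompt, it learns to score the completion that humans preferred above the others. The LLM can then be updated using that reward model, but the update is a reinforcement learning step (typically PPO) that raises the probability of high-reward completions, rather than a supervised classification loss as in prompt fine-tuning.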
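To make the classification analogy concrete, here is a minimal sketch in plain Python, not the course's or any library's implementation. It assumes the reward model emits one scalar score per completion and is trained with a pairwise (Bradley-Terry style) loss, which is binary cross-entropy on "the chosen completion beats the rejected one". The second function is a toy stand-in for the RL step: a policy over a fixed list of candidate completions, updated with the exact gradient of expected reward. All function names are hypothetical, and real PPO adds sampling, a clipped objective, and a KL penalty that are left out here.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Binary cross-entropy on the event 'chosen is preferred over rejected'.

    The 'classifier logit' is the difference between the two scalar rewards.
    """
    return -math.log(sigmoid(reward_chosen - reward_rejected))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits, rewards, lr=0.5):
    """One gradient-ascent step on expected reward for a toy softmax policy.

    For logit i the gradient is p_i * (r_i - baseline),
    where baseline = sum_j p_j * r_j is the current expected reward.
    """
    probs = softmax(logits)
    baseline = sum(p * r for p, r in zip(probs, rewards))
    return [z + lr * p * (r - baseline) for z, p, r in zip(logits, probs, rewards)]


# Reward model side: the loss is small when the preferred completion scores higher.
loss_good = pairwise_preference_loss(reward_chosen=2.0, reward_rejected=0.0)
loss_bad = pairwise_preference_loss(reward_chosen=0.0, reward_rejected=2.0)

# Policy side: three candidate completions with made-up reward-model scores.
logits = [0.0, 0.0, 0.0]
rewards = [1.0, 0.0, -1.0]
new_logits = policy_gradient_step(logits, rewards)
new_probs = softmax(new_logits)
```

The point of the sketch is the division of labour: the classification-style loss trains only the reward model, while the language model is moved by the reward signal, not by a class label.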
