RLHF... How?

Andrew talks about ‘marking’ LLM output when fine-tuning. How might these ‘numbers’ get back into the model to improve it?
Not mentioned?


In The Batch, there was a nice article from DL.AI on RLHF:

RLHF basics: A popular approach to tuning large language models, RLHF follows four steps:
(1) Pretrain a generative model.
(2) Use the model to generate data and have humans assign a score to each output.
(3) Given the scored data, train a model — called the reward model — to mimic the way humans assigned scores. Higher scores are tantamount to higher rewards.
(4) Use scores produced by the reward model to fine-tune the generative model, via reinforcement learning, to produce high-scoring outputs.

In short, a generative model produces an example, a reward model scores it, and the generative model learns based on that score.

source
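
To make step (4) a bit more concrete, here is a minimal, self-contained sketch in PyTorch of how the reward model's scores can flow back into the generative model. The `TinyGenerator` and `TinyRewardModel` classes are toy stand-ins I made up for illustration, and the update rule is plain REINFORCE; real RLHF pipelines typically use PPO with a KL penalty against the original model (e.g. via Hugging Face's TRL library), but the idea is the same: outputs the reward model scores highly get their log-probabilities pushed up.

```python
# Toy sketch of RLHF step (4): use reward-model scores to update the
# generative model with a simple policy-gradient (REINFORCE) step.
# All classes and sizes here are hypothetical stand-ins, not a real pipeline.

import torch
import torch.nn as nn

VOCAB, HIDDEN, SEQ_LEN = 100, 32, 8

class TinyGenerator(nn.Module):
    """Stand-in for the pretrained generative model (the policy)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # logits over the vocabulary at each position

class TinyRewardModel(nn.Module):
    """Stand-in for the reward model already trained on human scores."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.score = nn.Linear(HIDDEN, 1)

    def forward(self, tokens):
        # One scalar score per generated sequence
        return self.score(self.embed(tokens).mean(dim=1)).squeeze(-1)

generator = TinyGenerator()
reward_model = TinyRewardModel()   # assume step (3) has already happened
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)

# 1. Generate: sample a batch of sequences from the current policy.
tokens = torch.randint(0, VOCAB, (4, 1))   # start tokens
log_probs = []
for _ in range(SEQ_LEN):
    logits = generator(tokens)[:, -1, :]
    dist = torch.distributions.Categorical(logits=logits)
    next_tok = dist.sample()
    log_probs.append(dist.log_prob(next_tok))
    tokens = torch.cat([tokens, next_tok.unsqueeze(1)], dim=1)

# 2. Score: the reward model assigns one scalar reward per sequence.
with torch.no_grad():
    rewards = reward_model(tokens)

# 3. Learn: REINFORCE pushes up the log-probability of high-reward outputs.
advantage = rewards - rewards.mean()   # simple baseline
loss = -(torch.stack(log_probs).sum(dim=0) * advantage).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"mean reward of sampled batch: {rewards.mean().item():.3f}")
```

That is also the answer to the original question: the human scores never touch the generative model directly. They only shape the reward model, and the reward model's scalar outputs become the training signal for the reinforcement-learning update.
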

If you want to go deeper and get some hands-on experience, you might want to check out the GenAI course: Generative AI with LLMs - DeepLearning.AI

Best regards
Christian


Thanks Christian. That makes sense: building another model, presumably on top of the main one.
Andrew did not mention that.
