Lab 3, 2.2 Reward Model

OLOWOBOKO_EMMANUEL · January 6, 2024, 9:33am

How can I extend the reward model beyond checking for toxicity in the generated output, so that it also rewards other specific criteria? In Lab 3, section 2.2, the reward comes from a classifier, “Meta AI’s RoBERTa-based hate speech model”, which gives a higher reward when there is a higher chance of getting the class nothate as the output.
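
To make the question concrete, here is a minimal sketch of what I mean by "other criteria": several scorers, each returning a number where higher is better, combined into one scalar reward by a weighted average. Everything in it is hypothetical and not from the lab notebook: the names `combined_reward`, `not_hate_score` and `brevity_score`, the weights, and the toy keyword check that stands in for the real hate speech classifier. In practice each scorer would presumably wrap its own classifier or rule.

```python
from typing import Callable, Dict

# A scorer maps generated text to a scalar score (higher = better).
Scorer = Callable[[str], float]


def combined_reward(
    text: str,
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
) -> float:
    """Weighted average of several criterion scores (hypothetical helper)."""
    total_weight = sum(weights[name] for name in scorers)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[name] * scorers[name](text) for name in scorers)
    return weighted_sum / total_weight


def not_hate_score(text: str) -> float:
    # Toy stand-in for the classifier's "nothate" score: a keyword check,
    # NOT the RoBERTa model used in the lab.
    return 0.0 if "hate" in text.lower() else 1.0


def brevity_score(text: str, max_words: int = 20) -> float:
    # Assumed second criterion: full score up to max_words, then decaying.
    n_words = max(len(text.split()), 1)
    return min(1.0, max_words / n_words)


scorers = {"not_hate": not_hate_score, "brevity": brevity_score}
weights = {"not_hate": 3.0, "brevity": 1.0}  # illustrative weights only

reward_ok = combined_reward("Nice day", scorers, weights)
reward_bad = combined_reward("I hate this", scorers, weights)
```

The weighted average is only one possible way to combine criteria. The scorers would also need to be on comparable scales for the weights to mean anything, and I am not sure how well this interacts with PPO training.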
