I think the Jupyter notebook used in Week 3 for Lab 3 needs some revisions. Here are the issues I found, referenced by the markdown sections of the notebook:
Section 2.2
(1) In the markdown text there is a passage that appears to be a leftover reviewer’s comment. (Incidentally, comparing the condensed text shown in the video with the text we see when opening the notebook suggests the comment was incorporated into the text correctly.) I think the review comment can be deleted. The relevant text is:
For example, we can mention that having human labelers for the entire finetuning process can be expensive. A practical way to avoid that is to use a reward model.
use feedback generated by a model
(2) In the same section, there are also cells that repeat code from the previous cell, for example:
print(sentiment_pipe(non_toxic_text, **reward_logits_kwargs))
print(sentiment_pipe(non_toxic_text, **reward_probabilities_kwargs))
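One copy of each call would be enough. As I understand the lab, the two configurations differ only in whether the pipeline returns raw logits or softmax probabilities. Here is a minimal sketch of what I assume the intended single cell looks like; the checkpoint name and the exact kwargs are my reading of the notebook, not a verbatim copy:

```python
import torch
from transformers import pipeline

# Assumption: the lab's reward model is the RoBERTa hate-speech classifier used for toxicity scoring.
toxicity_model_name = "facebook/roberta-hate-speech-dynabench-r4-target"
device = 0 if torch.cuda.is_available() else -1
sentiment_pipe = pipeline("sentiment-analysis", model=toxicity_model_name, device=device)

# Return all label scores; "none" keeps raw logits, "softmax" converts them to probabilities.
reward_logits_kwargs = {"top_k": None, "function_to_apply": "none", "batch_size": 16}
reward_probabilities_kwargs = {"top_k": None, "function_to_apply": "softmax", "batch_size": 16}

non_toxic_text = "I really enjoyed reading this."  # placeholder; the notebook defines its own example

print(sentiment_pipe(non_toxic_text, **reward_logits_kwargs))
print(sentiment_pipe(non_toxic_text, **reward_probabilities_kwargs))
```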
(3) The ‘toxic_text’ and ‘non_toxic_text’ used as examples differ from those in the video. (This is related to the next issue.)
(4) While the previous issues were relatively minor, I also encountered a significant inconsistency in the execution. I tried the exact ‘toxic_text’ from the video (the one starting with “You are…”), and got the following output, indicating that it is NOT classified as toxic:
logits [not hate, hate]: [4.697163105010986, -4.222471237182617]
probabilities [not hate, hate]: [0.999866247177124, 0.00013371932436712086]
reward (low): [4.697163105010986]
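For completeness, this is roughly how I produced the output above. It assumes the `sentiment_pipe` and the two kwargs dicts from the sketch in point (2); you would paste the exact ‘toxic_text’ from the video in place of the placeholder. The helper extracting the ‘nothate’ score is my own reading of how the notebook derives the reward, not a copy of the lab code:

```python
toxic_text = "..."  # placeholder: paste the exact 'toxic_text' from the video here

logit_outputs = sentiment_pipe(toxic_text, **reward_logits_kwargs)
prob_outputs = sentiment_pipe(toxic_text, **reward_probabilities_kwargs)

# Depending on the transformers version, single-string inputs may come back nested one level deep.
if logit_outputs and isinstance(logit_outputs[0], list):
    logit_outputs, prob_outputs = logit_outputs[0], prob_outputs[0]

def score_of(outputs, label):
    """Pick a label's score regardless of the order the pipeline returns the labels in."""
    return next(s["score"] for s in outputs if s["label"] == label)

print(f"logits [not hate, hate]: {[score_of(logit_outputs, 'nothate'), score_of(logit_outputs, 'hate')]}")
print(f"probabilities [not hate, hate]: {[score_of(prob_outputs, 'nothate'), score_of(prob_outputs, 'hate')]}")

# As I understand the lab, the reward is the 'nothate' logit; for a genuinely toxic input it should be low (negative).
print(f"reward (low): {[score_of(logit_outputs, 'nothate')]}")
```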
Section 2.3
(5) At the end of Section 2.3, the toxicity results of the model differ from those shown in the video for (presumably) the same model:
toxicity [mean, std] before detox: [0.046981223255649886, 0.05390841506846864]
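For reference, here is a minimal sketch of how I understand the [mean, std] aggregation behind that line, assuming the notebook uses the Hugging Face `evaluate` “toxicity” measurement (whose default checkpoint is the same RoBERTa hate-speech model); the helper name and the placeholder completions are mine:

```python
import numpy as np
import evaluate

# The 'toxicity' measurement defaults to facebook/roberta-hate-speech-dynabench-r4-target.
toxicity_evaluator = evaluate.load("toxicity", module_type="measurement")

def toxicity_mean_std(completions):
    """Score each generated completion and aggregate, as in the notebook's summary printout."""
    scores = toxicity_evaluator.compute(predictions=completions)["toxicity"]
    return np.mean(scores), np.std(scores)

# Placeholder completions; the notebook generates these from the model over the evaluation split.
mean, std = toxicity_mean_std(["completion one", "completion two"])
print(f"toxicity [mean, std] before detox: [{mean}, {std}]")
```

If the published notebook and the video were run on different dataset samples (or with different sampling randomness during generation), that alone could explain the gap in these aggregate numbers.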
(6) The training results also show discrepancies: more than half of the rows in the last cell show an increase in the toxicity measure of the trained model’s output. While I acknowledge that random number generation may influence the training process, I recommend double-checking.
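One low-effort mitigation for the randomness would be to pin the seeds in the published notebook, so the printed before/after numbers are at least stable across runs in the same environment. A minimal sketch, assuming the lab’s PyTorch/transformers stack (whether the trl PPO trainer also needs its own seed setting is an assumption on my part):

```python
from transformers import set_seed

SEED = 42  # arbitrary, but ideally the same seed used for the video run
set_seed(SEED)  # seeds Python's random, NumPy and PyTorch (including CUDA) in one call
```

Even with fixed seeds, exact reproduction of the video’s numbers may not be possible across library versions and hardware, but the qualitative trend (toxicity decreasing after training) should hold, which would make the last cell easier to verify.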