Module 2: "RL: PPO and GRPO Algorithms" Slide Error

Hi, I don’t know if this is the right place for this kind of feedback, but it seems to me that there is an error on the following slide:

Specifically, the PPO diagram references “Reference Model” twice; one of those labels should be “Reward Model” instead.