Hi, I don’t know if this is the right place for this kind of feedback, but it seems to me that there is an error on the following slide:
That is, the PPO diagram should not reference “Reference Model” twice; one of the two should be “Reward Model” instead.
