🆘 Help: PPO Training with a Hugging Face Model for Text Summarization using TRL

Hi everyone 👋,

I'm trying to train a pretrained Hugging Face model for Telugu text summarization using PPO (Proximal Policy Optimization) with the trl library, but I'm running into problems with the PPO config setup, library compatibility, and the training itself.
I'm also hitting a size mismatch error.
Please help, or point me to a working example.

Check out the Generative AI with Large Language Models course. One of its labs uses PPO in training, so some of that material may carry over to your case.
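
For orientation, here is a minimal, library-free sketch of the clipped surrogate objective that PPO optimizes. It is not trl's API and not taken from the course lab: the function name `ppo_clipped_loss`, its arguments, and the default `clip_range=0.2` are assumptions chosen for illustration, and a real trainer adds a value loss and a KL penalty on top of this.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_range=0.2):
    """Mean clipped PPO policy loss over sampled tokens (lower is better).

    Hypothetical helper for illustration only; not part of trl.
    """
    n = len(new_logprobs)
    if n == 0:
        raise ValueError("inputs must be non-empty")
    if not (n == len(old_logprobs) == len(advantages)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the updated policy and the policy
        # that generated the sample.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_range, 1 + clip_range].
        clipped = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
        # PPO maximizes the smaller of the two surrogates, so the loss
        # is its negation.
        total += -min(ratio * adv, clipped * adv)
    return total / n
```

The length check at the top mirrors a common source of size errors in PPO loops in general: the per-sample log-probabilities and advantages (or rewards) must line up one to one.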
