✨ New course! Enroll in Reinforcement Fine-Tuning LLMs with GRPO

▶️ Enroll Now!

Join Reinforcement Fine-Tuning LLMs with GRPO, built in collaboration with Predibase and taught by Travis Addair, Predibase's Co-Founder and CTO, and Arnav Garg, its Senior Engineer and Machine Learning Lead.

Reinforcement Fine-Tuning (RFT) is a technique for adapting LLMs to complex reasoning tasks like mathematics and coding. RFT leverages reinforcement learning (RL) to help models develop their own strategies for completing a task, rather than relying on pre-existing examples as in traditional supervised fine-tuning. One RL algorithm, called Group Relative Policy Optimization (GRPO), is well suited to tasks with verifiable outcomes and can work well even when you have fewer than 100 training examples. Using RFT to adapt small, open-source models can lead to competitive performance on reasoning tasks, giving you more options for your LLM-powered applications.
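To give a feel for the "group relative" part of GRPO before you take the course: for each prompt, several completions are sampled and each one is scored relative to the others in its group, instead of against a learned critic as in PPO. The sketch below is illustrative only (the function name and the epsilon constant are my own choices, not from the course materials):

```python
import numpy as np

def group_relative_advantages(rewards):
    """Normalize rewards within a group of completions sampled for one prompt.

    Each completion's advantage is its reward minus the group mean,
    divided by the group standard deviation. Completions that beat
    their siblings get positive advantages; weaker ones get negative.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    # Small epsilon avoids division by zero when all rewards are identical.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Example: 4 completions for the same prompt, scored by a verifiable reward.
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # ~[ 1., -1.,  1., -1.]
```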

In this course, you’ll take a technical deep dive into RFT with GRPO. You’ll learn how to build reward functions that you can use in the GRPO training process to guide an LLM toward better performance on multi-step reasoning tasks.
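As a rough preview of what such a reward function can look like, here is a minimal sketch of a verifiable reward for a math-style task. The `Answer:` output format and the function names are assumptions made for illustration, not the course's actual implementation:

```python
import re

def correctness_reward(completion: str, expected_answer: str) -> float:
    """Toy verifiable reward: 1.0 if the model's final answer matches
    the expected answer, else 0.0.

    Assumes the completion ends with a line like 'Answer: 42'.
    """
    match = re.search(r"Answer:\s*(.+?)\s*$", completion.strip())
    if match and match.group(1).strip() == expected_answer.strip():
        return 1.0
    return 0.0

def format_reward(completion: str) -> float:
    """Toy shaping reward: small bonus just for following the answer format."""
    return 0.2 if "Answer:" in completion else 0.0

# Example usage
print(correctness_reward("First add 40 and 2. Answer: 42", "42"))  # 1.0
print(format_reward("I am not sure."))                             # 0.0
```

In practice you would combine several such rewards (correctness, formatting, length penalties, and so on) and pass them to the GRPO trainer to score each sampled completion.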


Hi,

May I ask where we can find the slides for this course?

Thank you!

Short courses do not provide lecture slides.