Is cold-start SFT always necessary before GRPO?

hi,
i have gone through all the GRPO papers (except DeepSeek-R1, where R1-Zero was initially trained purely via GRPO), and in all of them they use cold-start SFT before GRPO. if a model is already good at following the output format and has no language issues, wouldn't continued pre-training be an alternative (if you wanted to add some extra domain knowledge)? why hasn't anyone used it?