In the Agentic AI course, Prof. Ng showed a chart of reflection improving response quality by 20+ points for every model. Why don’t LLM builders build reflection into their internal prompt by default? They could achieve a significant bump in performance with this simple tweak. I understand it’d cost more compute per query, but so does building a bigger and more powerful model.
Why LLMs Don’t Use Reflection by Default (Even Though It Boosts Quality)
Andrew Ng’s Agentic AI lectures show that reflection can significantly improve model performance, sometimes by more than 20 points. It’s a natural question: if reflection works so well, why don’t model builders simply bake it into the system prompt?
There are several practical reasons.
1. Reflection multiplies compute cost
Reflection isn’t a small add‑on — it requires multiple full forward passes.
A single “reflect → revise” loop can double or triple the compute needed for one answer.
Model builders can’t impose that cost on every user and every task.
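To make the cost concrete, here is a minimal sketch of one “reflect → revise” loop. The `call_model` function is a hypothetical stub standing in for any LLM completion call; it only records how many forward passes are made.

```python
calls = []

def call_model(prompt: str) -> str:
    """Stub for an LLM call; records each forward pass."""
    calls.append(prompt)
    return f"<response to: {prompt[:30]}>"

def answer_with_reflection(question: str) -> str:
    draft = call_model(question)                                   # pass 1: draft
    critique = call_model(f"Critique this answer:\n{draft}")       # pass 2: reflect
    final = call_model(f"Revise using the critique:\n{critique}")  # pass 3: revise
    return final

answer_with_reflection("Why is the sky blue?")
print(len(calls))  # prints 3: one loop costs 3 forward passes instead of 1
```

Each extra critique-and-revise round adds passes on top of that, which is why always-on reflection would multiply serving cost for every query.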
2. Many tasks don’t benefit from reflection
Reflection helps with reasoning, planning, coding, and math.
But it slows down or even degrades:
- simple Q&A
- casual chat
- summarization
- brainstorming
- short‑form writing
If reflection were always on, models would feel slower and less responsive for most everyday use cases.
3. Reflection introduces new failure modes
While it improves average quality, it can also:
- reinforce incorrect assumptions
- hallucinate self‑critique
- drift away from the user’s intent
- overcomplicate simple tasks
Model builders need predictable behavior across billions of queries, not just higher benchmark scores.
4. There is no single “best” reflection strategy
Different tasks benefit from different patterns:
- chain‑of‑thought
- self‑critique
- multi‑agent debate
- hypothesis‑and‑test loops
- tool‑aware planning
Hard‑coding one style would limit flexibility and reduce performance in domains where another strategy works better.
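One way to picture this is to treat each reflection pattern as an interchangeable function chosen per task. The sketch below is purely illustrative (the `llm` stub and the strategy names are assumptions, not any real API): a self-critique loop and a two-agent debate share nothing structurally, so baking either one into a default prompt would lock out the other.

```python
def llm(prompt: str) -> str:
    """Stand-in for a single model call."""
    return f"out({prompt})"

def self_critique(q: str) -> str:
    draft = llm(q)
    critique = llm("critique: " + draft)
    return llm(f"revise: {draft} given {critique}")

def debate(q: str) -> str:
    a = llm("agent A: " + q)
    b = llm("agent B: " + q)
    return llm(f"judge: {a} vs {b}")

# Different tasks map to different reflection patterns.
STRATEGIES = {"coding": self_critique, "open_question": debate}

def reflect(task: str, q: str) -> str:
    # Fall back to a plain single pass when no strategy fits the task.
    return STRATEGIES.get(task, llm)(q)
```

A fixed system prompt can encode only one of these shapes; a dispatch layer like this can pick whichever fits the task.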
5. Reflection belongs in the agent layer, not the model layer
Modern AI systems separate the model from the agent:
- The model provides fast, general language capabilities.
- The agent adds planning, reflection, verification, and tool use.
This keeps the base model efficient while allowing developers to add reflection only when it’s beneficial.
Summary
Reflection improves quality, but it also increases cost, slows simple tasks, introduces new errors, and isn’t one‑size‑fits‑all. That’s why it lives in the agent layer rather than inside the model’s default prompt.
Practical Note for Learners
Reflection is also a skill that can be practiced interactively with an AI learning partner. Asking a model to critique, refine, or challenge your reasoning provides many of the same benefits Andrew Ng highlights — without requiring reflection to be built into the model itself.
hi @dapobbat
If you are referring to the chart generator lab, I think Prof. Ng intentionally separated each step into its own section to make the agent interactions easier to follow.
If you look at complex agentic AI systems, especially those with a RAG architecture, you will actually find many prompts that instruct the model to reflect, analyze, and summarize within the query itself. Depending on the agentic design and the budget allocated to building such agents, this not only improves performance but can also save a lot of computation cost; it comes down to matching task-based prompts to the desired output.
Regards
DP