Module 2, reflection loop

The ungraded lab ‘Chart Generation’ implements a linear workflow: one agent generates initial chart-generation code, and a second agent reflects on it and produces an updated version.

I was curious what repeated reflection would yield, i.e., feeding the result of a reflection (the updated code) back to the reflecting agent to generate yet another version. For this, I added a loop to the run_workflow code, with a parameter specifying the number of reflection iterations (1 by default).
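The loop is straightforward. Below is a minimal sketch of the idea; `generate_code` and `reflect_and_update` are hypothetical stand-ins for the lab's two agent calls (the names and signatures are my assumptions, not the notebook's actual identifiers):

```python
def generate_code(user_request):
    # Placeholder for the generating agent's LLM call.
    return f"# chart code for: {user_request}"

def reflect_and_update(code):
    # Placeholder for the reflecting agent's LLM call:
    # it critiques `code` and returns an updated version.
    return code + "\n# improved after reflection"

def run_workflow(user_request, num_reflections=1):
    """Generate chart code, then apply the reflection step
    `num_reflections` times instead of just once."""
    code = generate_code(user_request)
    for _ in range(num_reflections):
        code = reflect_and_update(code)
    return code
```

With `num_reflections=1` this reduces to the lab's original linear workflow.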

Interestingly, the first few iterations led to what I’d consider improved graphics. With more iterations (say, more than 4), the reflection produced code whose graphs, instead of improving, were plagued with issues such as the legend overlapping with the bars.

The iteration at which reflection stops being helpful will likely depend on the problem at hand. With a different user input, it might be earlier or later than 5. I would be interested in a workflow where the reflection is looped as long as, and only as long as, it brings further improvements.

As the code is not part of any graded assignment, I have attached the downloaded notebook for anybody who is interested.

reflection_loop.ipynb (5.1 MB)


This is chart V4


This is chart V5


And here (V6) there is a clear deterioration that really should not be proposed by the reflecting agent as a valid improvement.


Another interesting observation is that the feedback items may fight each other:


Changing multiple things at a time may end up producing unwanted results? Phrasing feedback (only) as actionables may not be ideal? Just some thoughts. :wink:

Cheers,
Raymond


Yes. For an automated optimization workflow, I would like it to check for:

  • a fixed point, i.e., further feedback leading to no improvements (e.g., “The chart looks good, no updates needed”); or
  • deterioration, or reversion to a previously criticised layout (e.g., “The last update did not lead to improvements, and instead made the chart look worse” or “… looks like an already previously criticised version”).

The loop should return the code it finds optimal, not just code updated for a fixed number of iterations. The feedback would read like, “In n reflection iterations, I made the following improvements. Further iterations did not improve, or even worsened, the outcome.”

This could perhaps be achieved by employing an additional LLM (a ‘judge’) that compares two subsequent charts and stops the loop if it does not find the updated chart better.
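A minimal sketch of such a judge-gated loop, assuming `reflect` and `judge_prefers_update` are callables wrapping the reflecting agent and the judge LLM (both names and signatures are hypothetical):

```python
def reflect_with_judge(initial_code, reflect, judge_prefers_update, max_iters=10):
    """Keep applying the reflection step while the judge confirms
    improvement; return the best version found and the number of
    iterations that actually helped."""
    best = initial_code
    for i in range(max_iters):
        candidate = reflect(best)
        if not judge_prefers_update(best, candidate):
            # Fixed point or deterioration detected: keep `best` and stop.
            return best, i
        best = candidate
    return best, max_iters
```

With toy stand-ins (e.g., a `reflect` that appends a marker and a judge that rejects updates beyond a certain length), the loop terminates as soon as the judge stops preferring the candidate, rather than always running `max_iters` times.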

It should not be too difficult to implement (a graded-assignment idea?).


That sounds like establishing some termination conditions. Good idea!

I think it would be nice if the reflection process could converge on some principles that apply specifically to the dataset at hand, and then produce a final product that follows those principles.