Module 2, reflection loop

The ungraded lab ‘Chart Generation’ implements a linear workflow: one agent generates initial chart-generation code, and a second agent reflects on it and produces an updated version.

I was curious what repeated reflection would yield, i.e., feeding the result of a reflection (the updated code) back to the reflecting agent to generate yet another version. For this, I added a loop to the run_workflow code, with a parameter specifying the number of reflection iterations (1 by default).
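The loop is straightforward. Below is a minimal sketch of the idea; `generate_code` and `reflect_and_update` are hypothetical stand-ins for the lab's two agent calls (the names and signatures are my assumptions, not the notebook's actual identifiers):

```python
def generate_code(user_request):
    # Placeholder for the generating agent's LLM call.
    return f"# chart code for: {user_request}"

def reflect_and_update(code):
    # Placeholder for the reflecting agent's LLM call:
    # it critiques `code` and returns an updated version.
    return code + "\n# improved after reflection"

def run_workflow(user_request, num_reflections=1):
    """Generate chart code, then apply the reflection step
    `num_reflections` times instead of just once."""
    code = generate_code(user_request)
    for _ in range(num_reflections):
        code = reflect_and_update(code)
    return code
```

With `num_reflections=1` this reduces to the lab's original linear workflow.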

Interestingly, the first few iterations led to what I’d consider improved graphics. With more iterations (say, more than 4), the reflection produced code whose graphs, instead of improving, were plagued with issues such as the legend overlapping with the bars.

The iteration at which reflection stops being helpful will likely depend on the problem at hand. With a different user input, it might be earlier or later than 5. I would be interested in a workflow where the reflection is looped as long as, and only as long as, it brings further improvements.

As the code is not part of any graded assignment, I have attached the downloaded notebook for anybody who is interested.

reflection_loop.ipynb (5.1 MB)


This is chart V4


This is chart V5


And here (V6) there is a clear deterioration that really should not be proposed by the reflecting agent as a valid improvement.


Another interesting observation is that the feedback items may fight each other:


Changing multiple things at a time may end up producing unwanted results? Phrasing feedback (only) as actionables may not be ideal? Just some thoughts. :wink:

Cheers,
Raymond


Yes. For an automated optimization workflow, I would like it to check for:

  • a fixed point, i.e., further feedback leading to no improvements (e.g., “The chart looks good, no updates needed”); or
  • deterioration, or reversion to a previously criticised layout (e.g., “The last update did not lead to improvements, and instead made the chart look worse” or “… looks like an already previously criticised version”).

The loop should return the code it finds optimal, not just code updated for a fixed number of iterations. The feedback would read like, “In n reflection iterations, I made the following improvements. Further iterations did not improve, or even worsened, the outcome.”

This could perhaps be achieved by employing an additional LLM (a ‘judge’) that compares two subsequent charts and stops the loop if it does not find the updated chart better.
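A minimal sketch of such a judge-gated loop, assuming `reflect` and `judge_prefers_update` are callables wrapping the reflecting agent and the judge LLM (both names and signatures are hypothetical):

```python
def reflect_with_judge(initial_code, reflect, judge_prefers_update, max_iters=10):
    """Keep applying the reflection step while the judge confirms
    improvement; return the best version found and the number of
    iterations that actually helped."""
    best = initial_code
    for i in range(max_iters):
        candidate = reflect(best)
        if not judge_prefers_update(best, candidate):
            # Fixed point or deterioration detected: keep `best` and stop.
            return best, i
        best = candidate
    return best, max_iters
```

With toy stand-ins (e.g., a `reflect` that appends a marker and a judge that rejects updates beyond a certain length), the loop terminates as soon as the judge stops preferring the candidate, rather than always running `max_iters` times.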

It should not be too difficult to implement (a graded-assignment idea?).


That sounds like establishing some termination conditions. Good idea!

I think it would be nice if the reflection process could converge on some principles that apply specifically to the dataset at hand, and then produce a final product that follows those principles.