The agent is very sensitive to the exact wording of the user input. If the user writes “I would like to Return 2 Aviators”, with a capital “R”, the whole process fails. The generated code may or may not convert everything to lowercase, which seems brittle to me.
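To illustrate the capital “R” problem, here is a minimal sketch of the kind of check the generated code might contain. I don't know what the agent actually generates, so both function names are hypothetical; the point is only that a plain substring test is case-sensitive, while normalizing with `str.casefold()` first is not.

```python
def mentions_return_fragile(user_text: str) -> bool:
    # Hypothetical check: case-sensitive, so "Return" does not match "return".
    return "return" in user_text


def mentions_return_robust(user_text: str) -> bool:
    # Hypothetical check: normalize case before matching.
    return "return" in user_text.casefold()


message = "I would like to Return 2 Aviators"
print(mentions_return_fragile(message))
print(mentions_return_robust(message))
```

Whether the generated code looks like the first or the second version depends on the run, which is why the behavior feels unpredictable.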
My understanding of Andrew’s statement that the tool-soup approach is brittle is that it refers to the possibility that a required tool is not present in the tool soup, which may either break execution or cause an incorrect tool to be invoked.
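As a rough sketch of that failure mode (not code from the course): assume the tool soup is a simple name-to-function registry. The tool names `process_return` and `process_exchange` and the helper `call_tool` are made up for illustration. If the model asks for a tool that is not registered, the dispatcher can at least report this explicitly instead of crashing or falling back to the wrong tool.

```python
from typing import Callable, Dict

# Hypothetical tool soup: only one tool is registered.
TOOLS: Dict[str, Callable[[str, int], str]] = {
    "process_return": lambda item, qty: f"returned {qty} x {item}",
}


def call_tool(name: str, item: str, qty: int) -> str:
    # Look the tool up explicitly so a missing tool gives a clear error.
    tool = TOOLS.get(name)
    if tool is None:
        return f"error: no tool named '{name}' is available"
    return tool(item, qty)


print(call_tool("process_return", "Aviators", 2))
print(call_tool("process_exchange", "Aviators", 2))
```

The second call shows the gap: the request is reasonable, but nothing in the soup can handle it.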
As Andrew states in the first video of this module, allowing an LLM to plan can be an experimental approach and can sometimes make the system a little hard to control. So this approach can also be brittle, but in a different way. In terms of making the required tools available through code generation and execution, it may be more versatile (and in that sense less ‘brittle’) than a tool soup.