I’d appreciate it if someone could clarify the following point.
Module 5 is dedicated to the topic of “workflow planning.” Andrew raised the question of which output format is better to use when getting planning results from an LLM. For example, using JSON allows us to see the sequence of functions along with detailed arguments.
However, this confused me a bit.
In Module 3 (Week 3, Tool Usage), we learned that we can simply:
provide the model with all details about the available tools in a specific format,
and the model will respond with a message indicating which tool (or even a sequence of tools) it needs to use.
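To make the Module 3 pattern concrete, here is a minimal sketch, assuming the OpenAI-style function-calling schema that the `tools=[..]` parameter accepts. The `get_weather` tool, the `REGISTRY` mapping, and the simulated tool call are all hypothetical. No API request is made; the dictionary only imitates the name/arguments shape of a tool call that a model might return.

```python
import json

# Hypothetical tool, invented for illustration only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Tool description in the OpenAI-style function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Simulated tool call (assumption: imitates the shape of a real one;
# the arguments arrive as a JSON-encoded string).
simulated_tool_call = {
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}
}

# Map tool names to the Python functions that implement them.
REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Look up the requested tool and call it with the decoded arguments."""
    fn = REGISTRY[tool_call["function"]["name"]]
    kwargs = json.loads(tool_call["function"]["arguments"])
    return fn(**kwargs)

result = dispatch(simulated_tool_call)
print(result)
```

In a real run, `tools` would be passed to the chat completion call and the tool call would come from the model's response instead of being hard-coded.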
So why, in Module 5, do we now need to think about planning tool execution in a different way?
I think the idea is to use code as a plan for those situations where too many specific tools would need to be defined. I saw the term ‘brittle tool soup’ somewhere.
However, I think using code brings along other problems (see my question in the forum).
Thank you Phil01, the use case about planning via code is completely clear.
But apart from the code-based approach, Andrew [edited] mentioned in a video that there are several ways to prepare plans: JSON, XML, and plain text, and that JSON seems the more reasonable choice.
So my question was: why do we need that at all? Why should we explicitly ask for this plan instead of letting the model describe the sequence of tools it needs in its standard response?
It might be that I missed something in the Week 3 (Tools) block, and for some reason we cannot rely on that approach.
Here is the piece from Ungraded Lab Module 5: “Instead of asking the model to output a plan in JSON and running it step-by-step with many tiny tools,…”
Question: So if we want to run many tiny tools step by step, do we need this JSON plan? Would describing the tools and passing them in the tools parameter of CLIENT.chat.completions.create(…,tools=[..],…) not be enough to get back a response in which the model tells us which tools should be called?
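As I understand it, one practical difference is that an explicit plan lays out every step up front, so it can be inspected or validated before anything runs. The sketch below is only an illustration under assumptions: the two tiny tools, the plan format, and the `"$prev"` placeholder (meaning "use the previous step's output") are all hypothetical and not taken from the lab.

```python
import json

# Hypothetical tiny tools operating on a hard-coded catalog.
def search_products(category):
    catalog = [
        {"name": "ball", "category": "sports", "price": 20},
        {"name": "racket", "category": "sports", "price": 150},
        {"name": "lamp", "category": "home", "price": 40},
    ]
    return [p for p in catalog if p["category"] == category]

def filter_by_price(items, max_price):
    return [p for p in items if p["price"] < max_price]

TOOLS = {"search_products": search_products, "filter_by_price": filter_by_price}

# A plan such as a model might emit when asked to answer in JSON (made up here).
plan_json = """
[
  {"tool": "search_products", "args": {"category": "sports"}},
  {"tool": "filter_by_price", "args": {"items": "$prev", "max_price": 100}}
]
"""

def run_plan(plan_text):
    """Execute the steps in order, feeding each result into "$prev" slots."""
    plan = json.loads(plan_text)
    prev = None
    for step in plan:
        args = {k: (prev if v == "$prev" else v) for k, v in step["args"].items()}
        prev = TOOLS[step["tool"]](**args)
    return prev

final = run_plan(plan_json)
print(final)
```

Because the whole plan is plain data, it could also be logged, checked against the list of allowed tools, or shown to a user before execution.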
If I understand you correctly, I think you are questioning the choice of JSON? My answer is JSON is easier to use.
If the LLM responds
"please use the tool query_product to find the first 5 sports products priced below $7,200",
then to use it, you will need to extract “query_product”, “first”, “5”, “below”, “7200”. How would you extract them? You might have to make another LLM call at the end, so why not ask the model to do that in the first place and return a structured (JSON) response?
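As a hypothetical illustration of such a structured reply (the field names `tool`, `args`, `category`, `max_price`, and `limit` are assumptions, and `query_product` here is only a stand-in that describes the query instead of running it):

```python
import json

# Hypothetical stand-in for the real tool: it only describes the query.
def query_product(category: str, max_price: float, limit: int) -> str:
    return f"top {limit} {category} products under ${max_price:,}"

# What the model could return if asked to answer in JSON (made-up example).
llm_reply = (
    '{"tool": "query_product", '
    '"args": {"category": "sports", "max_price": 7200, "limit": 5}}'
)

call = json.loads(llm_reply)          # no free-text extraction needed
if call["tool"] != "query_product":   # simple check before dispatching
    raise ValueError(f"unexpected tool: {call['tool']}")

summary = query_product(**call["args"])
print(summary)
```

Every value arrives already typed and named, so no second LLM call or fragile text parsing is required.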
When you refer to a video, it would be better to provide the module number(s), the name(s) of the video(s), and the time mark(s) for the part(s) in question. Then we can have a more focused discussion. For example: module 3 video X time MM:SS Andrew said “…” and module 5 video Y time MM:SS Andrew said “…”. The transcript is available for every video, so you can just copy and paste from it.
Without knowing which videos they are, and going only by your message, I think we should not interpret module 5 as a complete change from what we learned in module 3. I think the introduction of JSON just extends the set of ways an LLM can respond, and a JSON response is easier to handle.
If I understand you correctly, I think you are questioning the choice of JSON? My answer is JSON is easier to use.
To be honest, not exactly. The question was: why do we need planning in that way when we already have a way to solve it via tools?
Hello @Aleksei250, can you provide one example for “that way” and one example for “a way”? It would be even better if you provide references to these two ways.