Why do we need Pydynamic BaseModel to construct queries in the research_plan_node? I noticed that it's not used in any other node in the Lesson 6 code.
def research_plan_node(state: AgentState):
    # Ask the LLM for output that fits the Queries schema instead of free text
    queries = model.with_structured_output(Queries).invoke([
        SystemMessage(content=RESEARCH_PLAN_PROMPT),
        HumanMessage(content=state['task'])
    ])
    # Start from any content gathered so far, or an empty list
    content = state['content'] or []
    for q in queries.queries:
        response = tavily.search(query=q, max_results=2)
        for r in response['results']:
            content.append(r['content'])
    return {"content": content}
We use the Pydantic BaseModel (Queries) specifically in the research nodes to force the LLM to return structured data instead of raw text.
In nodes like plan_node or generation_node, we want the LLM to output standard text (an essay outline or a draft).
However, in research_plan_node, we need the LLM to return a list of separate search queries (the prompt asks for at most 3) so we can loop through them using: for q in queries.queries:
Take another look at the notebook! It actually is used in one other place: the research_critique_node. Both research steps generate search queries, so they both use this schema to hand the Tavily search loop a clean list of query strings.
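Here is a minimal sketch of why the schema makes that loop easy. It assumes Queries has a single list-of-strings field named queries (which is what queries.queries in the loop implies). The fake_search function and the raw_llm_output string are hypothetical stand-ins for tavily.search and the model's structured reply, so the snippet runs without any API keys.

```python
import json
from typing import List

from pydantic import BaseModel


class Queries(BaseModel):
    """Assumed schema: one field holding a list of search query strings."""
    queries: List[str]


def fake_search(query: str) -> dict:
    """Hypothetical stand-in for tavily.search(); mimics its result shape."""
    return {"results": [{"content": f"result for: {query}"}]}


# Hypothetical stand-in for the structured reply the LLM would produce.
raw_llm_output = '{"queries": ["history of LangGraph", "LangGraph vs LangChain"]}'

# Parsing into the model gives us a real Python list, not a blob of text.
queries = Queries(**json.loads(raw_llm_output))

content = []
for q in queries.queries:
    response = fake_search(q)
    for r in response["results"]:
        content.append(r["content"])

print(content)
```

With plain text output we would have to split lines, strip numbering, and hope the format never changes; with the schema, each query is already a separate string.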
Let me know if this helps!
-- Lesly, DLAI
Just to clear up a tiny terminology thing first: the library is actually called Pydantic (though “Pydynamic” sounds like a fantastic name for a Python library!).
At its core, the purpose of Pydantic’s BaseModel is data validation and parsing using standard Python type hints. It ensures that the data your code receives is exactly the type and format your code expects.
By default, Large Language Models return unstructured strings of text. When we need structured output like JSON instead of plain text, we can define the structure of the output using a data model that derives from Pydantic's BaseModel class and pass that data model to the LLM, as the lesson does with model.with_structured_output(Queries).
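To make the validation part concrete, here is a small sketch of how a BaseModel accepts data that matches its type hints and rejects data that does not. The Queries definition is an assumption based on how the node uses it, and is_valid is a hypothetical helper written only for this illustration.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Queries(BaseModel):
    """Assumed schema: one field holding a list of search query strings."""
    queries: List[str]


def is_valid(payload: dict) -> bool:
    """Hypothetical helper: True if the payload fits the Queries schema."""
    try:
        Queries(**payload)
        return True
    except ValidationError:
        return False


good = {"queries": ["first query", "second query"]}
wrong_type = {"queries": 42}            # a number where a list is expected
missing_field = {"query": "only one"}   # the required 'queries' key is absent

print(is_valid(good))
print(is_valid(wrong_type))
print(is_valid(missing_field))
```

Only the first payload passes; the other two raise a ValidationError, which is the safety net that keeps malformed output from reaching the rest of the code.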