Streaming the final output ONLY

Ihsan · July 8, 2024, 9:04am

My question relates to Persistence and Streaming → Streaming tokens.

How can I stream only the final response to a user’s question? The current code, given below, also prints the tool-calling step (‘Calling: ...’) and the ‘Back to the model!’ message.

from langchain_core.messages import HumanMessage

messages = [HumanMessage(content="What is the weather in SF?")]
thread = {"configurable": {"thread_id": "4"}}
# Top-level `async for` works in a notebook; in a script, wrap it in an async function.
async for event in abot.graph.astream_events({"messages": messages}, thread, version="v1"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        content = event["data"]["chunk"].content
        if content:
            # Empty content in the context of OpenAI means
            # that the model is asking for a tool to be invoked.
            # So we only print non-empty content
            print(content, end="")

Output:

Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'current weather in San Francisco'}, 'id': 'call_d1uenpZrgwcruC5UZpiWEyYx'}
Back to the model!
The current weather in San Francisco involves considerable cloudiness with occasional rain showers. The humidity is around 66%, and the wind is coming from the SSW at 10 mph. The high temperature is expected to be in the upper 60s.
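
One possible approach, sketched below under stated assumptions: keep only `on_chat_model_stream` events whose chunk carries text and no tool-call fragments. The `FakeChunk` class and `fake_events` generator are hypothetical stand-ins so the snippet runs without a model; with the real graph you would pass the `astream_events(...)` iterator to `final_tokens` instead. The `tool_call_chunks` check assumes the chunks behave like LangChain's `AIMessageChunk`. Note that the ‘Calling: ...’ and ‘Back to the model!’ lines look like ordinary `print` output rather than streamed tokens; if they come from `print` calls inside the agent's tool-execution code (an assumption, since that code is not shown here), event filtering will not hide them and those calls would need to be removed or replaced with logging.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str = ""
    tool_call_chunks: list = field(default_factory=list)


async def fake_events():
    """Hypothetical event stream shaped like the event dicts in the question."""
    # The model asks for a tool: no text, only a tool-call fragment.
    yield {"event": "on_chat_model_stream",
           "data": {"chunk": FakeChunk(tool_call_chunks=[{"name": "search"}])}}
    # A non-model event emitted while the tool runs.
    yield {"event": "on_tool_start", "data": {}}
    # The final answer arrives token by token.
    yield {"event": "on_chat_model_stream",
           "data": {"chunk": FakeChunk(content="Cloudy")}}
    yield {"event": "on_chat_model_stream",
           "data": {"chunk": FakeChunk(content=", 66% humidity.")}}


async def final_tokens(events):
    """Yield only text tokens, skipping tool-call chunks and non-model events."""
    async for event in events:
        if event["event"] != "on_chat_model_stream":
            continue
        chunk = event["data"]["chunk"]
        # Skip chunks that are part of a tool call request.
        if getattr(chunk, "tool_call_chunks", None):
            continue
        if chunk.content:
            yield chunk.content


async def collect(events):
    """Join the filtered tokens into one string (a UI would emit them one by one)."""
    return "".join([token async for token in final_tokens(events)])


answer = asyncio.run(collect(fake_events()))
print(answer)
```

In a notebook, where an event loop is already running, `await collect(...)` would replace the `asyncio.run(...)` call.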
