In a prior post I detailed some struggles I had getting ACP Lesson 8 running locally. These turned out to be due to not having the right Python version or the right dependency versions. I wrote this gist to help figure out the proper versions.
My second problem was that I had changed the model from `openai/gpt-4` to `openai/gpt-4.1`. There are subtle differences in how OpenAI handles tool calls that break the code found in `fastacp.py` in Lesson 8. I eventually realized that this change was the culprit, but not before struggling to debug the code in `fastacp.py`.
Which leads to this post. I honestly think Lesson 8 misses the mark. It glosses over, almost completely, the fact that the heavy lifting is actually being done by the AI and its tool-use capabilities. The code in `fastacp.py` obscures this in several ways. First, it uses `smolagents`, which itself uses `litellm`, which in turn wraps the ChatGPT API (among others). That's two layers of indirection. A third is `fastacp` itself, which has many lines of code dedicated to handling the differences in how various LLMs deal with tools.
The problem is that without understanding the core functionality (tool usage), it is very hard to debug the code.
So, to help understand tool usage better, I wrote the following using the ChatGPT API directly. It is 153 lines of code (vs. 671 in `fastacp.py`, plus 16 more to define `run_hospital_workflow`). It doesn't have any code to deal with the differences between calling ChatGPT vs. Claude vs. something else. That is a different (but also important) problem. I'd wager it is easier to understand the purpose of LiteLLM (or smolagents) once you've built support for one LLM and want to support others.
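To get a feel for what those abstraction layers are buying you, consider just the tool definitions: OpenAI nests the schema under a `function` key, while Anthropic's Messages API, for example, expects a flat object with the schema under `input_schema`. A minimal sketch of a converter (my own `to_anthropic_tool` name, not part of any library, and only handling this one difference):

```python
def to_anthropic_tool(openai_tool: dict) -> dict:
    """Reshape an OpenAI-style tool definition into Anthropic's flat form.

    Sketch only: OpenAI nests name/description/parameters under "function";
    Anthropic's Messages API expects them at the top level, with the JSON
    schema under "input_schema".
    """
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }


# The birthday_agent tool from the example below, reshaped:
openai_style = {
    "type": "function",
    "function": {
        "name": "birthday_agent",
        "description": "Returns the birthday of the person",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
}
anthropic_style = to_anthropic_tool(openai_style)
```

Multiply this by every provider quirk (tool-call message shapes, result linking, streaming) and you get something the size of LiteLLM.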
This should run in a new cell of the Jupyter notebook DeepLearning.AI provides for this lesson. It is verbose, showing both what is sent to the LLM and what is returned (which helps build a mental model of what's going on). I also added lots of comments to explain each step.
```python
import json
from typing import Callable, Any

from colorama import Fore
from dotenv import load_dotenv
from openai import OpenAI
from openai.types.chat import ChatCompletionMessageParam, ChatCompletionToolParam, ChatCompletionUserMessageParam, \
    ChatCompletionSystemMessageParam, ChatCompletionToolMessageParam, ChatCompletion
from openai.types.shared_params import FunctionDefinition

# System prompt which lists available tools and other instructions to the AI.
system_prompt = (
    "You can use tools to answer questions about people.\n"
    " + Use `birthday_agent` to get someone's birthday.\n"
    " + Use `ice_cream_agent` to get someone's favorite ice cream.\n"
    "Once you have the necessary info, answer the user's question in a single paragraph (no double spacing "
    "between sentences). Also add a funny quip, joke or fun fact, you funny assistant, you!"
)

# Initial messages to AI
messages: list[ChatCompletionMessageParam] = [
    ChatCompletionSystemMessageParam(role="system", content=system_prompt),
    ChatCompletionUserMessageParam(role="user",
                                   content="When is Doug's birthday and what ice cream does he like? "
                                           "Given their age, suggest a drink that pairs well with the flavor.")
]

# Tool list with some type safety
tools: list[ChatCompletionToolParam] = [
    ChatCompletionToolParam(
        type="function",
        function=FunctionDefinition(
            name="birthday_agent",
            description="Returns the birthday of the person",
            parameters={
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "The person's name"}
                },
                "required": ["name"]
            }
        )
    ),
    ChatCompletionToolParam(
        type="function",
        function=FunctionDefinition(
            name="ice_cream_agent",
            description="Returns the person's favorite ice cream flavor",
            parameters={
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "The person's name"}
                },
                "required": ["name"]
            }
        )
    ),
]


def birthday_agent(name: str) -> str:
    """Return the birthday of the person."""
    if name == "Doug":
        return "07/04/1972"
    else:
        return "Unknown"


def ice_cream_agent(name: str) -> str:
    """Return the person's favorite ice cream flavor."""
    if name == "Doug":
        return "salted caramel"
    else:
        return "Unknown"


# Simple map from tool name to actual function implementing the tool.
tool_map = {"birthday_agent": birthday_agent, "ice_cream_agent": ice_cream_agent}


def handle_tool_calls(response: ChatCompletion,
                      tool_registry: dict[str, Callable]) -> list[ChatCompletionToolMessageParam]:
    """Given a ChatCompletion response, loop through all tool calls, invoke the corresponding function,
    and return a list of ChatCompletionToolMessageParams that capture the tool, call id and result of the call."""
    tool_msgs = []
    for call in response.choices[0].message.tool_calls:
        tool_fn = tool_registry[call.function.name]
        args = json.loads(call.function.arguments)  # key/value args (must match fn def)
        result = tool_fn(**args)
        tool_msgs.append(ChatCompletionToolMessageParam(role="tool", tool_call_id=call.id, content=result))
    return tool_msgs


class MyOpenAI:
    """Simple wrapper around the OpenAI API."""
    client = OpenAI()
    model = "gpt-4.1"
    count = 0

    def call_api(self, ai_msgs: list[ChatCompletionMessageParam]) -> ChatCompletion:
        """Call the AI chat API, showing messages we are sending as well as response received."""
        self.count += 1
        print_colored(Fore.MAGENTA, f"\n**************************************************************\n"
                                    f"AI call #{self.count} with messages:")
        print_colored(Fore.MAGENTA, json.dumps(ai_msgs, indent=2))
        response = self.client.chat.completions.create(
            model=self.model,
            messages=ai_msgs,
            tools=tools,
            tool_choice="auto"
        )
        print_colored(Fore.GREEN, f"\nENTIRE RESPONSE #{self.count}:")
        print_colored(Fore.GREEN, response.model_dump_json(indent=2))
        return response


def print_colored(color: str, text: Any) -> None:
    """Helper function for printing colored text."""
    print(f"{color}{text}{Fore.RESET}")


def run_agent():
    """Main function which makes two calls to the AI model. The first sends the system prompt,
    which defines the available tools. We expect a return message asking to call these
    tools. We do so and add the answers to the message list. We then make a second call
    to the AI, which should have all the context it needs to respond with a final answer."""
    ai = MyOpenAI()

    # Initial call
    response = ai.call_api(messages)

    # Need to add the response message to the message history (it should be a tool-calls
    # request from the 'assistant', aka ChatGPT).
    response_msg = response.choices[0].message
    messages.append(response_msg.model_dump())  # Add assistant tool_calls message to message list
    print("Extracted response message (with tool calls):")
    print(response_msg.model_dump_json(indent=2))

    # Process tool calls, append answers as messages.
    tool_messages = handle_tool_calls(response, tool_map)
    messages.extend(tool_messages)  # Add tool answers to message list
    print_colored(Fore.BLUE, "\nTOOL CALL RESULTS:")
    print_colored(Fore.BLUE, json.dumps(tool_messages, indent=2))

    # Call again, with answers
    response = ai.call_api(messages)

    # Output final message
    print_colored(Fore.CYAN, "\nFINAL ANSWER:")
    print_colored(Fore.CYAN, response.choices[0].message.content)


run_agent()
```
While building this, I found that the `fastacp.py` code wasn't doing a couple of things that seem important. First, it wasn't returning the tool-request messages back to the LLM (so the LLM has proper context of all messages). Second, it wasn't using the `tool_call_id` to link each response back to its tool request. Instead, it was doing this obliquely, as seen in this snippet:
```python
{'content': [{'text': 'Current memory state:\n'
                      '- health_agent_response: Yes, serious rehabilitation ...'
                      '- policy_agent_response: The standard waiting period ...'}]}
```
Maybe that is necessary to support multiple LLMs, but it seems fragile.
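With the API used directly, the linkage is explicit: each tool result message echoes the `tool_call_id` from the assistant's request, so results can come back in any order and the model still knows which answer belongs to which call. A toy illustration (the ids here are made up):

```python
# The assistant's tool-call request (abridged): each call carries a unique id.
assistant_msg = {
    "role": "assistant",
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "birthday_agent", "arguments": '{"name": "Doug"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "ice_cream_agent", "arguments": '{"name": "Doug"}'}},
    ],
}

# Each tool result echoes the id it answers -- note the order is reversed
# here, and it doesn't matter.
tool_msgs = [
    {"role": "tool", "tool_call_id": "call_2", "content": "salted caramel"},
    {"role": "tool", "tool_call_id": "call_1", "content": "07/04/1972"},
]

# Every result maps back to exactly one outstanding request.
request_ids = {c["id"] for c in assistant_msg["tool_calls"]}
result_ids = {m["tool_call_id"] for m in tool_msgs}
assert request_ids == result_ids
```

Stuffing results into a "memory state" string throws this pairing away and leans on the model to reconstruct it from prose.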
In any case, I wanted to share this with both the instructors and with fellow students. Maybe it will help someone else!