In Lab 3 of Evaluating AI Agents course:
query = SpanQuery().where(
# Filter for the `LLM` span kind.
# The filter condition is a string of valid Python boolean expression.
"span_kind == 'LLM'",
).select(
question="input.value",
tool_call="llm.tools"
)
# The Phoenix Client can take this query and return the dataframe.
tool_calls_df = px.Client().query_spans(query,
project_name=PROJECT_NAME,
timeout=None)
tool_calls_df = tool_calls_df.dropna(subset=["tool_call"])
tool_calls_df.head()
I think tool_call="llm.tools"
is incorrect, this only retrieves all the available tools given to the LLM, but not the chosen tool returned by the LLM.
Can you have a check?