Hello everyone,
We would like to know how many tokens are used in this lab.
Also, if we use larger models, what token usage should we expect when we run multi-agent workflows with images?
Hi @Abdulrahman_Mohammed_Albdwi,
Kindly describe your query in more detail, for example which part of the lab you are referring to. Most of the lab's data details should be available in the lab itself or under File ==> Open.
As for a large-model multi-agent system, token usage will depend on the task you are planning, the hardware available (CPU/GPU), and the data being used; you mentioned images as the data in focus. You can always try a quantization approach, but the overall strategy should be driven by the task at hand.
Thanks for your kind reply.
While practicing the lab (M5 Agentic AI - Market Research Team) of the Agentic AI course, I was wondering about the token count for the whole multi-agent pipeline. Since these are large queries with multiple agents, I think an application like this would consume a lot of tokens in the real world.
If you have access to the lab, you can see the queries. There are many queries for each agent. Could you please check this link: https://learn.deeplearning.ai/courses/agentic-ai/lesson/cnd59gh0b/ungraded-lab%3A-market-research-team
You raise a valid concern. In practice, multi-agent systems can consume a significant number of tokens because each agent has its own prompts, reasoning, tool calls, and outputs that are passed between agents.
However, it isn’t possible to accurately estimate the total token usage before running the workflow. The final count depends on several factors, including:
The user’s query and its complexity.
The amount of information each agent retrieves or generates.
The prompts and context provided to each agent.
The model being used, since different models tokenize text differently and may produce outputs of different lengths.
You can measure the total token usage after running the pipeline using the usage statistics returned by the API or your LLM framework. Those measurements are the best way to estimate real-world costs for your specific application.
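As a rough illustration of measuring usage after a run, here is a minimal Python sketch that sums per-call usage records by agent. It assumes each call yields a dict with `prompt_tokens` and `completion_tokens`, which is the shape many chat APIs report; your framework may use different field names. The agent names and the numbers are made up for the example and are not taken from the lab.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running token totals for one agent or for the whole pipeline."""
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def add(self, usage: dict) -> None:
        # `usage` is assumed to mirror the per-call usage record of a chat API.
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)


def summarize(usages_by_agent: dict) -> dict:
    """Sum usage per agent and add a pipeline-wide "ALL" entry."""
    report = {"ALL": UsageTotals()}
    for agent, usages in usages_by_agent.items():
        report[agent] = UsageTotals()
        for usage in usages:
            report[agent].add(usage)
            report["ALL"].add(usage)
    return report


# Hypothetical numbers standing in for values read from real API responses.
example = {
    "researcher": [{"prompt_tokens": 1200, "completion_tokens": 300}],
    "writer": [
        {"prompt_tokens": 2000, "completion_tokens": 800},
        {"prompt_tokens": 2500, "completion_tokens": 400},
    ],
}
report = summarize(example)
for name, totals in report.items():
    print(name, totals.prompt_tokens, totals.completion_tokens, totals.total_tokens)
```

Running the same pipeline on a few representative queries and comparing the per-agent totals usually shows which agent dominates the cost.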
So while larger models generally tend to incur higher costs because they often generate longer responses and have higher pricing, there isn’t a fixed token count that can be predicted in advance or directly transferred from one model to another.
I checked the lab you pointed to.
With a multi-agent system, the added advantage is that it can handle about 10 times more tokens than a single-agent system. However, a multi-agent system using a large model with a huge number of tokens can cause a context window explosion. This issue is addressed by the search tool, which already has smart context chunking. For example, the market research lab you are pointing to uses the Tavily search tool, which extracts short semantic content snippets (max 500 characters each) rather than raw HTML. This converts what would be a 300,000-character payload into a lean 1,000 to 3,000-character data signal, which addresses the concern about large token counts for lengthy prompts.
Another point to check is the lab metadata: the search results are also limited. For example, in the market research lab, the metadata for the get available tool sets max retries to 3 and max results to the top 5. This handles hardware limitations and keeps responses context-specific when using a large model like GPT-4o.
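To make the effect of those limits concrete, here is a small Python sketch of the idea, not the lab's or the search tool's actual code. The function names are hypothetical, the limits (top 5 results, 500 characters each) come from the description above, and the 4-characters-per-token figure is only a crude rule of thumb; real counts depend on the model's tokenizer.

```python
def trim_results(results, max_results=5, max_chars=500):
    """Keep the first `max_results` snippets, each cut to `max_chars` characters."""
    return [text[:max_chars] for text in results[:max_results]]


def rough_tokens(text, chars_per_token=4):
    # Crude heuristic only; use the model's tokenizer for real counts.
    return len(text) // chars_per_token


# Stand-in for five raw pages totalling 300,000 characters.
raw_pages = ["x" * 60_000 for _ in range(5)]
raw_text = "".join(raw_pages)

snippets = trim_results(raw_pages)
context = "\n".join(snippets)

print("raw characters:", len(raw_text), "approx tokens:", rough_tokens(raw_text))
print("trimmed characters:", len(context), "approx tokens:", rough_tokens(context))
```

With these caps the context handed to the model stays bounded no matter how long the source pages are.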
I hope this clears up your doubt. Feel free to ask if you have more questions.
Regards
Dr. Deepti