Question on MCP Server Orchestration at Scale

Dear DeepLearning.AI Community,

I recently completed the MCP: Build Rich-Content Apps… short course and was really impressed by the developments! I have a question/challenge I’d like to raise to the community:

As the number of MCP Servers/Clients scales and the Registry API is set up for autonomous MCP Server discovery, how do we avoid recreating the same problem MCP aims to solve? Specifically, with so many MCP servers in the Registry API, the context window of the orchestrating model (the one deciding which server to use) could become a limiting factor.

Is there a way to develop a standardized orchestration method, that is, a standard way for the model to choose which MCP server in the Registry API to use, rather than relying purely on inference?

I’m hesitant to think that layering multiple orchestrators is the solution, as there’s always a trade-off between the number of options an orchestrator can consider (limited by context) and the cumulative probability of error, which grows with each additional layer in the decision chain.
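To make that trade-off concrete, here is a rough back-of-the-envelope calculation. It assumes, as a simplification of mine, that every layer picks correctly with the same probability p and that the layers fail independently; real orchestrators will not behave this neatly.

```latex
P(\text{all } k \text{ layers correct}) = p^{k},
\qquad \text{e.g. } p = 0.95,\; k = 3 \;\Rightarrow\; 0.95^{3} \approx 0.857
```

So under those assumptions, three layers that are each 95% reliable would route correctly only about 86% of the time.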

One possible solution: instead of having the LLM interpret the tools, prompts, and resources available on every server in the Registry API, could we use RAG (or a similar approach) to first retrieve the most relevant servers, and then have the orchestrator choose from that subset?

I think it’s important to consider this before the Registry API is set up, as the solution may require adjustments to how MCP Servers are described and added to the registry. In the simplified RAG example, each MCP Server would need a description when added to the registry, or perhaps a concatenation of the descriptions of all its tools, resources, and prompts, which would then need to be embedded for RAG to work effectively.
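As a rough illustration of the "concatenation of descriptions" idea, here is a minimal Python sketch. The `ServerEntry` and `Capability` shapes, the `build_index_text` helper, and the example server are all hypothetical names I made up for this post; they are not the actual Registry API schema.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    """One tool, resource, or prompt exposed by a server (hypothetical shape)."""
    kind: str  # "tool", "resource", or "prompt"
    name: str
    description: str


@dataclass
class ServerEntry:
    """Hypothetical registry record; NOT the real Registry API schema."""
    name: str
    summary: str
    capabilities: list[Capability] = field(default_factory=list)


def build_index_text(entry: ServerEntry) -> str:
    """Concatenate the server summary with every capability description.

    The returned string is the text that would be handed to an embedding model.
    """
    lines = [f"{entry.name}: {entry.summary}"]
    for cap in entry.capabilities:
        lines.append(f"[{cap.kind}] {cap.name}: {cap.description}")
    return "\n".join(lines)


# Made-up example server, for illustration only.
weather = ServerEntry(
    name="weather-server",
    summary="Weather forecasts and alerts.",
    capabilities=[
        Capability("tool", "get_forecast", "Return the forecast for a city."),
        Capability("resource", "alerts", "Active severe weather alerts."),
    ],
)

index_text = build_index_text(weather)
print(index_text)
```

Whether the registry should store a hand-written summary, a generated concatenation like this, or both is exactly the kind of design decision I think is worth settling early.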

I’m still in the early stages of learning about MCP, so I welcome any corrections or insights if I’m missing something.

Thank you!

This is a very high-level and insightful question about the future of MCP scaling. You are touching on a real challenge: the discovery problem that appears as the ecosystem grows.

Your suggestion of using a RAG-based approach to filter MCP servers is actually a very solid architectural pattern. Instead of feeding every single server description into the prompt context, you first perform a semantic search to retrieve the top-n most relevant servers based on the user's intent.
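A minimal Python sketch of that retrieval step is below. To keep it self-contained, `embed` is a toy bag-of-words stand-in for a real embedding model, and the `REGISTRY` contents and the `top_n_servers` helper are hypothetical, not part of any MCP SDK or the Registry API.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n_servers(query: str, registry: dict[str, str], n: int = 2) -> list[str]:
    """Return the names of the n servers whose descriptions best match the query."""
    q = embed(query)
    ranked = sorted(
        registry,
        key=lambda name: cosine(q, embed(registry[name])),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical registry: server name -> description text that was indexed.
REGISTRY = {
    "weather-server": "weather forecast and severe weather alerts for a city",
    "github-server": "search repositories, open issues, and review pull requests",
    "calendar-server": "create and list calendar events and meetings",
}

shortlist = top_n_servers("what is the weather forecast for Paris", REGISTRY, n=2)
print(shortlist)
```

Only the descriptions of the servers in `shortlist` would then be placed in the orchestrator's context, so the prompt size depends on n rather than on the size of the registry.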

This avoids context overflow and sidesteps the compounding error risk you mentioned regarding multiple orchestration layers.

To make this work effectively, each MCP server would indeed need a clear, standardized metadata description or a summary of its capabilities at the registry level, so it can be indexed and embedded properly.

Layering multiple orchestrators can be messy, as you said, but a single smart retrieval layer (like RAG) before the final decision is a much cleaner way to handle thousands of servers.

Thank you for sharing these thoughts. It is exactly the kind of foresight needed while the Registry API is still in its early stages.

Keep up the great thinking!