Problem/Motivation
The challenge with MCP right now is that there seems to be a soft cap of around 40 tools, and because all tools are sent on every request, more noise means more hallucinations and more failures. AI agents show similar degradation as tool counts grow, and they also currently lack a reliable short-term memory layer for better context management, so chaining many tool calls together quickly hits request/token caps.
Proposed resolution
We can reduce some of the tool bloat by exposing only three function calls/MCP tools instead of every Tool API tool:
- tool_discovery - Searching for and finding the right tool
- tool_planning - Understanding what arguments a tool needs to run
- tool_execute - Calling the actual tool with said arguments
This gives MCP clients/agents the ability to search for tools dynamically as needed, fully understand what a tool requires, and then call it with greater success. We can also experiment with different discovery strategies (RAG search, tagging, toolsets, etc.) and potentially optimize some of the planning step depending on the specific use case.
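To make the three-tool flow concrete, here is a minimal sketch in Python. This is illustrative only, not the module's actual API: the registry contents, schema shape, and keyword-based discovery are assumptions standing in for whatever strategy (RAG search, tagging, toolsets) is ultimately chosen.

```python
# Hypothetical in-memory registry standing in for the full Tool API tool list.
TOOL_REGISTRY = {
    "weather_lookup": {
        "description": "Get the current weather for a city",
        "tags": ["weather", "forecast"],
        "input_schema": {"city": {"type": "string", "required": True}},
        "callable": lambda args: f"Sunny in {args['city']}",
    },
    "unit_convert": {
        "description": "Convert a value between metric and imperial units",
        "tags": ["math", "units"],
        "input_schema": {
            "value": {"type": "number", "required": True},
            "unit": {"type": "string", "required": True},
        },
        "callable": lambda args: f"converted {args['value']} {args['unit']}",
    },
}


def tool_discovery(query: str) -> list[dict]:
    """Search names, descriptions, and tags for a keyword.

    A real implementation might swap this for RAG search or toolsets.
    """
    q = query.lower()
    return [
        {"name": name, "description": meta["description"]}
        for name, meta in TOOL_REGISTRY.items()
        if q in name or q in meta["description"].lower() or q in meta["tags"]
    ]


def tool_planning(name: str) -> dict:
    """Return the argument schema the agent needs before calling the tool."""
    return {"name": name, "input_schema": TOOL_REGISTRY[name]["input_schema"]}


def tool_execute(name: str, arguments: dict) -> str:
    """Validate required arguments, then invoke the underlying tool."""
    meta = TOOL_REGISTRY[name]
    missing = [
        key for key, spec in meta["input_schema"].items()
        if spec.get("required") and key not in arguments
    ]
    if missing:
        raise ValueError(f"Missing required arguments: {missing}")
    return meta["callable"](arguments)
```

An agent would chain these: `tool_discovery("weather")` to find `weather_lookup`, `tool_planning("weather_lookup")` to learn it needs a `city`, then `tool_execute("weather_lookup", {"city": "Oslo"})` to run it.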
This approach also provides more future-proofing. Currently, large tool handling is being solved primarily on the server side. Over time we may see clients add support for larger tool counts, or middle layers that do some of the sorting automatically. If that occurs, we still have the ability to expose all individual tools in the future, while having a short-term solution as well.
One thing this does not solve is an improved short-term memory layer (see #3528730: Create ShortTermMemoryPlugin), but it does remove one part of the equation. This approach also helps to unlock #3545828: Introduce InputDefinitionRefinerInterface.
(This issue may split out into separate issues, particularly if the MCP part is integrated into the MCP module.)
Remaining tasks
TBD
User interface changes
TBD
API changes
TBD
Data model changes
TBD