TLDR: A new research paper introduces a knapsack-inspired framework for automated agent composition, allowing AI systems to dynamically select optimal tools and sub-agents based on real-time performance, cost, and compatibility. This “online knapsack composer” significantly outperforms traditional semantic retrieval methods, achieving higher success rates at lower costs in both single-agent and multi-agent setups by iteratively testing components.
In the rapidly evolving landscape of artificial intelligence, building sophisticated agentic systems is an active frontier of research. These systems are designed to be autonomous: they reason through complex problems, use a variety of tools, and collaborate to solve tasks. However, as the number of available AI components (such as models, APIs, and specialized agents) grows, developers face a significant challenge: how to select and combine these components effectively to build an optimal system.
Traditional methods for designing agentic systems often rely on static, semantic retrieval. This means they pick components based on their descriptions or metadata. The problem with this approach is threefold: component descriptions might not accurately reflect real-world performance, selection criteria often overlook the balance between cost and utility, and static architectures struggle to adapt as requirements change. This leads to what the researchers call the “paradox of choice,” where an abundance of options makes effective decision-making difficult.
A new research paper titled “Automated Composition of Agents: A Knapsack Approach for Agentic Component Selection” introduces a framework to tackle these challenges. Authored by Michelle Yuan, Khushbu Pahwa, Shuaichen Chang, Mustafa Kaba, Jiarong Jiang, Xiaofei Ma, Yi Zhang, and Monica Sunkara, the work proposes a structured, automated method inspired by the classic knapsack problem in optimization.
The core idea is to treat the selection of AI components—like tools or sub-agents—as a knapsack problem. Imagine you have a knapsack with a limited budget (capacity) and many items (AI components), each with a cost and a potential value (success rate). The goal is to fill the knapsack with items that maximize the total value without exceeding the budget. In this context, the “composer agent” is responsible for systematically identifying, selecting, and assembling the best set of components, considering performance, budget constraints, and compatibility.
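To make the analogy concrete, here is a minimal sketch of the classic 0/1 knapsack problem applied to component selection. It is an illustration only, not the paper's implementation: the component names, integer costs, and utility scores are all hypothetical, and the sketch assumes every cost and value is known in advance, which the online setting described in the paper does not.

```python
from typing import List, NamedTuple, Tuple


class Component(NamedTuple):
    name: str
    cost: int   # hypothetical cost in integer budget units
    value: int  # hypothetical utility score (higher is better)


def best_subset(components: List[Component], budget: int) -> Tuple[int, List[str]]:
    """Return (total value, chosen names) of the best subset within budget."""
    # best[b] holds the best (value, names) achievable with total cost <= b.
    best: List[Tuple[int, List[str]]] = [(0, [])] * (budget + 1)
    for comp in components:
        # Walk budgets downward so each component is used at most once.
        for b in range(budget, comp.cost - 1, -1):
            prev_value, prev_names = best[b - comp.cost]
            if prev_value + comp.value > best[b][0]:
                best[b] = (prev_value + comp.value, prev_names + [comp.name])
    return best[budget]


# Hypothetical inventory of tools.
inventory = [
    Component("web_search", cost=3, value=50),
    Component("sci_search", cost=4, value=60),
    Component("calculator", cost=2, value=30),
    Component("code_runner", cost=5, value=70),
]

total_value, chosen = best_subset(inventory, budget=7)
print(total_value, chosen)
```

With a budget of 7, the two search tools together beat the single most valuable tool paired with the cheapest one, which is the kind of cost and utility trade-off that description-based retrieval does not consider.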
What makes this approach particularly effective is its dynamic nature. Unlike static retrieval, the composer agent doesn’t just rely on descriptions. Instead, it probes and tests the actual capabilities of candidate components in a “sandbox” environment. These real-time trials measure how reliably a component performs under various conditions, allowing the system to estimate its true utility. For example, an information-seeking agent might initially consider a specialized scientific search tool but discover through testing that a more general search tool is more effective across diverse queries, or vice versa, depending on the task’s specific needs.
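The idea of estimating utility from sandbox trials can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: the function `estimate_success_rate`, the stub tools, and the trial queries with their checkers are all hypothetical stand-ins for real components and real evaluation.

```python
from typing import Callable, Iterable, Tuple

Trial = Tuple[str, Callable[[str], bool]]  # (query, checker for the output)


def estimate_success_rate(component: Callable[[str], str],
                          trials: Iterable[Trial]) -> float:
    """Run the component on each sandbox query and return the pass fraction."""
    passed = 0
    total = 0
    for query, is_correct in trials:
        total += 1
        try:
            if is_correct(component(query)):
                passed += 1
        except Exception:
            # A crash inside the sandbox counts as a failed trial.
            pass
    return passed / total if total else 0.0


# Hypothetical stub tools standing in for real components.
def general_search(query: str) -> str:
    return f"results for {query}"


def science_search(query: str) -> str:
    if "protein" not in query:
        raise ValueError("query is outside the tool's domain")
    return f"papers about {query}"


sandbox_trials = [
    ("protein folding", lambda out: "protein" in out),
    ("weather in Paris", lambda out: "Paris" in out),
    ("history of jazz", lambda out: "jazz" in out),
    ("protein synthesis", lambda out: "protein" in out),
]

general_rate = estimate_success_rate(general_search, sandbox_trials)
science_rate = estimate_success_rate(science_search, sandbox_trials)
print(general_rate, science_rate)
```

On this mixed set of queries the general tool scores higher, while a query set made up only of protein questions would rate both tools equally. That dependence on the task is why the measured rate, rather than the tool's description, serves as the value estimate.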
The framework formalizes agent composition as a constrained optimization problem, bridging the gap between modular AI design and operations research. It involves a workflow where the composer agent first breaks down task descriptions into required skills, then retrieves candidate components, and finally tests them. If a component’s value-to-cost ratio meets a dynamic threshold, it’s added to the agentic system. This iterative process ensures that the selected components are not only relevant but also cost-effective and truly capable.
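The threshold step can be sketched as below. The article does not spell out the threshold function, so this sketch borrows a form commonly used in the online knapsack literature, where the bar rises exponentially as the budget fills up. It assumes every value-to-cost ratio lies between known bounds `lower` and `upper`; those bounds, the `Candidate` type, and all numbers are hypothetical, and the paper's exact rule may differ.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    name: str
    cost: float
    value: float  # e.g. a success rate measured in sandbox trials


def threshold(fraction_used: float, lower: float, upper: float) -> float:
    """Minimum acceptable value-to-cost ratio given the budget fraction spent.

    Assumed form: (upper*e/lower)**z * (lower/e), floored at `lower`.
    """
    rising = (upper * math.e / lower) ** fraction_used * (lower / math.e)
    return max(lower, rising)


def compose(candidates: List[Candidate], budget: float,
            lower: float, upper: float) -> Tuple[List[str], float]:
    """Accept candidates one at a time as they arrive; decisions are final."""
    selected: List[str] = []
    spent = 0.0
    for cand in candidates:
        if spent + cand.cost > budget:
            continue  # would exceed the budget
        ratio = cand.value / cand.cost
        if ratio >= threshold(spent / budget, lower, upper):
            selected.append(cand.name)
            spent += cand.cost
    return selected, spent


# Hypothetical stream of tested candidates.
stream = [
    Candidate("web_search", cost=2, value=0.4),
    Candidate("sci_search", cost=4, value=0.3),
    Candidate("calculator", cost=3, value=0.2),
    Candidate("code_runner", cost=3, value=0.9),
    Candidate("translator", cost=2, value=0.9),
]

selected, spent = compose(stream, budget=10, lower=0.05, upper=0.5)
print(selected, spent)
```

In this run the calculator is rejected: by the time it arrives, 60% of the budget is spent and the bar has risen above its ratio. The code runner still clears the raised bar, and the translator is skipped only because it no longer fits in the remaining budget.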
The researchers conducted extensive empirical evaluations using Claude 3.5 Sonnet across five benchmarking datasets. The results were compelling: their online-knapsack-based composer consistently achieved higher success rates at significantly lower component costs compared to traditional retrieval baselines. In single-agent setups, the success rate improved by up to 31.6%. For multi-agent systems, where agents are selected from an inventory of over 100, the success rate jumped from 37% to 87%. This substantial performance gap highlights the method’s robust adaptability across different domains and budget constraints.
The paper also discusses the importance of “prompt optimization,” where the agent’s system prompt is refined based on feedback from tool sandboxing trials. This helps the agent better understand when to invoke specific tools, leading to further performance boosts, especially in tasks requiring precise query formulation and error recovery.
In conclusion, this research offers a powerful new paradigm for designing AI systems. By moving beyond static descriptions and embracing dynamic, real-time evaluation, the knapsack approach enables the creation of more efficient, adaptable, and high-performing agents. This is a significant step towards reducing “hidden technical debt” in machine learning systems and streamlining the assembly of complex AI architectures.


