
Agent Foundation Models: A Unified Approach to AI Problem Solving

TLDR: A new AI paradigm called Chain-of-Agents (CoA) allows large language models to perform complex multi-agent, multi-tool problem-solving within a single model. This approach, used to create Agent Foundation Models (AFMs), is trained through multi-agent distillation and reinforcement learning, leading to state-of-the-art performance in web and code tasks, while significantly improving computational efficiency and generalization compared to traditional multi-agent systems.

The world of artificial intelligence is constantly evolving, with recent advancements in large language models (LLMs) and multi-agent systems showcasing impressive capabilities in tackling complex challenges, from in-depth research to intricate coding and mathematical reasoning. However, many existing multi-agent systems face significant hurdles: they often rely on manual prompt engineering, leading to computational inefficiencies, limited adaptability, and an inability to benefit from continuous data-driven learning.

Addressing these limitations, the OPPO AI Agent Team has introduced a groundbreaking new paradigm called Chain-of-Agents (CoA). This innovative approach enables LLMs to perform complex, multi-turn problem-solving, much like a traditional multi-agent system, but entirely within a single model. Imagine a single AI brain that can dynamically activate different specialized ‘tool agents’ and ‘role-playing agents’ to simulate a collaborative team, all working together seamlessly and end-to-end.

In the CoA framework, the model intelligently orchestrates various agents. These include ‘Role-playing Agents’ such as a Thinking Agent to manage the reasoning flow, a Plan Agent to break down tasks, a Reflection Agent for self-critique, and a Verification Agent to ensure correctness. Alongside these are ‘Tool Agents’ like a Search Agent for optimized queries, a Crawl Agent for content extraction, and a Code Generate Agent for code execution in a sandbox environment. This dynamic coordination within one model eliminates the need for complex prompt and workflow engineering, significantly reducing the computational overhead typically associated with inter-agent communication in conventional multi-agent systems.
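To make the orchestration concrete, here is a minimal sketch of how a single model might emit tagged steps that a lightweight dispatcher routes to role-playing or tool agents. The agent names, step format, and stub model are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of Chain-of-Agents-style dispatch inside one model.
# Agent names and the step format are illustrative, not the paper's exact scheme.

def mock_model(history):
    """Stand-in for the single LLM; emits one (agent, content) step per call."""
    script = [
        ("plan", "1) search the topic 2) reflect 3) answer"),
        ("search", "agent foundation models"),
        ("reflection", "results look relevant; proceed"),
        ("answer", "CoA solves the task in one model"),
    ]
    return script[len(history)]

# Tool agents execute externally; role-playing agents are pure text steps.
TOOL_AGENTS = {
    "search": lambda q: f"top results for '{q}'",  # stub Search Agent
}

def run_chain_of_agents(model, max_steps=8):
    history = []
    for _ in range(max_steps):
        agent, content = model(history)
        if agent in TOOL_AGENTS:                    # tool agent: run and observe
            observation = TOOL_AGENTS[agent](content)
            history.append((agent, content, observation))
        elif agent == "answer":                     # terminal step
            history.append((agent, content, None))
            return content, history
        else:                                       # role-playing agent (plan, etc.)
            history.append((agent, content, None))
    return None, history

answer, trace = run_chain_of_agents(mock_model)
```

The key design point is that one model produces the entire trajectory; the "agents" are activations within a single generation loop, so no inter-agent messages need to be serialized and re-prompted.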

To instill these end-to-end Chain-of-Agents problem-solving abilities into LLMs, the researchers developed a multi-agent distillation framework. This process involves distilling the capabilities of state-of-the-art multi-agent systems into CoA-compatible trajectories, which are then used for ‘agentic supervised fine-tuning’. Following this, ‘agentic reinforcement learning’ is applied to further refine the models’ performance on verifiable agentic tasks. The resulting models are termed Agent Foundation Models (AFMs).
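The recipe above can be sketched in two pieces: filter successful teacher trajectories for supervised fine-tuning, then apply a verifiable outcome reward during reinforcement learning. The function names, the success filter, and the exact-match reward rule are assumptions for illustration, not the team's implementation.

```python
# Illustrative distill -> SFT -> RL recipe; details are assumed, not from the paper.

def distill_trajectories(teacher_runs, is_successful):
    """Multi-agent distillation: keep only successful teacher runs,
    which would then be rewritten as single-model CoA trajectories for SFT."""
    return [run for run in teacher_runs if is_successful(run)]

def outcome_reward(prediction, gold):
    """Binary verifiable reward of the kind used in agentic RL."""
    return 1.0 if prediction.strip() == gold.strip() else 0.0

# Toy teacher runs from a state-of-the-art multi-agent system.
teacher_runs = [
    {"traj": "...", "answer": "42"},
    {"traj": "...", "answer": "wrong"},
]
sft_data = distill_trajectories(teacher_runs, lambda r: r["answer"] == "42")
```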

Empirical studies have demonstrated that AFMs achieve new state-of-the-art performance across a wide array of benchmarks in both web agent and code agent settings. For instance, AFMs have shown superior success rates on challenging web agent benchmarks like GAIA, BrowseComp, and HLE, and impressive results in code generation and mathematical reasoning on LiveCodeBench and AIME2025. Beyond performance, AFMs also boast remarkable computational efficiency, reducing inference costs (in terms of token consumption) by a substantial 84.6% compared to traditional multi-agent systems, while maintaining competitive performance.

Furthermore, the research highlights AFM’s strong generalization capabilities, particularly its ability to handle unseen agents. For example, a code agent model trained only on code and math tasks could successfully orchestrate unseen web search and visual inspector tools when their descriptions were provided. This indicates a robust understanding of tool invocation formats and dynamic adaptation.
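Generalization of this kind typically rests on the model learning a uniform tool-invocation format, so an unseen tool can be exposed at inference time purely through its description. The prompt layout and tool names below are illustrative assumptions:

```python
# Hedged sketch: surfacing unseen tools to the model via descriptions only.
# The tag syntax and tool names are assumptions, not the paper's format.

def build_system_prompt(tools):
    lines = ["You may call tools using <tool name='...'>args</tool>:"]
    for name, desc in tools.items():
        lines.append(f"- {name}: {desc}")
    return "\n".join(lines)

known_tools = {"code": "run Python in a sandbox"}
# Tools the model never saw during training, described at inference time.
unseen_tools = {"web_search": "query the web and return snippets",
                "visual_inspector": "describe the contents of an image"}

prompt = build_system_prompt({**known_tools, **unseen_tools})
```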


The OPPO AI Agent Team has made their entire research, including model weights, training and evaluation code, and training data, fully open-sourced. This significant contribution provides a solid foundation for future research and development in agent models and agentic reinforcement learning. For more detailed information, you can refer to the full research paper here.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
