TLDR: A2FM is a new AI model that unifies reasoning and tool-using capabilities of large language models. It introduces an ‘instant’ mode for simple queries, alongside agentic (tool-aware) and reasoning modes. Through Adaptive Policy Optimization, A2FM intelligently routes tasks to the most appropriate mode, achieving state-of-the-art accuracy across various benchmarks while significantly cutting computational costs by adapting its ‘thinking’ process to task complexity.
Large language models (LLMs) have shown incredible capabilities, but they often fall into two distinct categories: those excellent at deep, internal reasoning (like chain-of-thought models) and those skilled at interacting with external tools and environments (known as agentic models). This division means that a single LLM often struggles to be both deeply thoughtful and highly practical, leading to inefficiencies, especially on simple tasks where models might “overthink” or unnecessarily call tools.
Introducing A2FM: The Adaptive Agent Foundation Model
A new framework called A2FM, or Adaptive Agent Foundation Model, aims to bridge this gap. Developed by the OPPO AI Agent Team, A2FM unifies these different strengths by following a “route-then-align” principle. This means the model first learns to understand the nature of a task and then aligns its approach based on that understanding, all while operating under a shared core system.
To tackle the problem of inefficiency, A2FM introduces a clever third mode: the “instant” mode. This mode is designed to handle simple queries directly, preventing the model from engaging in unnecessary complex reasoning or tool interactions. This complements the existing agentic (tool-using) and reasoning (deep thinking) modes, creating a more balanced and efficient system.
How A2FM Learns to Adapt
A2FM’s ability to jointly enhance accuracy and efficiency comes from a novel training method called Adaptive Policy Optimization (APO). APO uses a cost-regularized reward system and adaptive sampling across its three modes. This allows the model to learn when to use which mode, favoring quick, instant solutions for easy questions and escalating to more complex reasoning or tool-use when a task truly demands it.
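The paper's exact reward formula isn't reproduced here, but the idea of a cost-regularized reward can be sketched in a few lines. In this toy version, a correct answer earns a fixed reward and every token consumed subtracts a small penalty; the names `lambda_cost` and the specific values are illustrative assumptions, not the paper's actual hyperparameters:

```python
def cost_regularized_reward(correct: bool, tokens_used: int,
                            lambda_cost: float = 1e-4) -> float:
    """Hypothetical cost-regularized reward: reward correctness,
    penalize the compute spent producing the answer."""
    base = 1.0 if correct else 0.0
    return base - lambda_cost * tokens_used

# A correct instant answer (200 tokens) scores higher than a correct
# reasoning answer (4000 tokens), nudging the policy toward the
# cheapest mode that still succeeds.
instant = cost_regularized_reward(True, 200)     # 0.98
reasoning = cost_regularized_reward(True, 4000)  # 0.60
```

Under a reward of this shape, the model only "pays" for extra reasoning or tool calls when they actually flip an answer from wrong to right, which is what drives the escalation behavior described above.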
The model’s architecture includes a self-adaptive router that decides “what to do” for each query. For tasks requiring external information or code execution, it can activate the agentic mode, which uses tools like web search (via SerpAPI), web crawling (via Jina API and summarized by gpt-5-mini), and code execution (in an isolated environment using nsjail). For complex logical problems, it switches to the reasoning mode, generating detailed step-by-step thoughts. And for straightforward questions, the instant mode provides direct answers.
Impressive Performance and Cost Savings
Evaluated at the 32B scale, A2FM has achieved state-of-the-art results across a wide range of benchmarks: 13.4% on BrowseComp (agentic), 70.4% on AIME25 (reasoning), and 16.7% on HLE (general). These scores set new records among comparably sized models while remaining competitive with leading LLMs across agentic, reasoning, and general benchmarks.
One of A2FM’s most notable achievements is its significant cost efficiency. On the SuperGPQA benchmark, the adaptive execution achieved a “cost of pass” of only $0.00487 per correct answer. This represents a substantial reduction in cost—45.2% less than using only the reasoning mode and 33.5% less than using only the agentic mode—while maintaining comparable accuracy. This means A2FM delivers correct answers at roughly half the cost of traditional reasoning-based execution.
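"Cost of pass" is total spend divided by the number of correct answers. Working backward from the reported percentage savings gives a back-of-the-envelope check on the single-mode costs (these derived figures are my arithmetic, not numbers from the paper):

```python
adaptive = 0.00487  # reported cost per correct answer on SuperGPQA

# Reported savings: 45.2% vs reasoning-only, 33.5% vs agentic-only.
# If adaptive = single_mode * (1 - saving), then:
reasoning_only = adaptive / (1 - 0.452)  # ~$0.00889 per correct answer
agentic_only = adaptive / (1 - 0.335)    # ~$0.00732 per correct answer

print(f"reasoning-only ≈ ${reasoning_only:.5f}")
print(f"agentic-only  ≈ ${agentic_only:.5f}")
```

The implied reasoning-only cost of roughly $0.0089 per correct answer is consistent with the article's claim that adaptive execution delivers correct answers at about half the cost of reasoning-based execution.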
The model's efficiency is further highlighted by its adaptive routing. For instance, on easy questions in SuperGPQA, A2FM used the instant mode for 61.1% of queries, but this dropped to just 5.3% for difficult ones, demonstrating that it allocates resources according to task complexity. The accuracy of instant responses remained stable at around 55% across all difficulty levels, indicating that the routing is robust rather than simply defaulting to the cheapest mode.
In conclusion, A2FM represents a significant step forward in developing more versatile and efficient AI agents. By integrating instant, reasoning, and agentic modes under a single, adaptively routed backbone, it offers a scalable path towards LLMs that are both highly accurate and remarkably cost-effective. You can read the full research paper here.


