FM Agent: A New AI Framework for Autonomous Scientific Discovery and Optimization

TLDR: The FM Agent is a novel multi-agent AI framework that combines large language model reasoning with evolutionary search to autonomously solve complex real-world problems. It features cold-start initialization, adaptive sampling, domain-specific evaluators, and a distributed infrastructure. The system has achieved state-of-the-art results across diverse domains including machine learning, combinatorial optimization, GPU kernel generation, and mathematics, demonstrating its potential to accelerate scientific and engineering discovery without human intervention.

A new multi-agent framework, dubbed FM Agent, aims to change how complex scientific and engineering challenges are tackled. Developed by the FM Agent Team at Baidu AI Cloud, the system combines the reasoning capabilities of large language models (LLMs) with large-scale evolutionary search to autonomously discover and refine solutions for real-world problems.

The core of FM Agent integrates several key innovations designed to enhance its performance and scalability. Firstly, a ‘cold-start initialization’ phase incorporates expert guidance to generate a diverse and high-quality initial set of solutions, ensuring the evolutionary search begins from a strong foundation. Secondly, an ‘adaptive diversity-driven sampling’ strategy optimizes the search process by balancing exploration of new ideas with the exploitation of promising ones, using multiple parallel evolutionary ‘islands’. Thirdly, ‘domain-specific evaluators’ provide nuanced feedback by considering correctness, effectiveness, and LLM-supervised quality assessments, guiding the system’s iterative improvements. Finally, a ‘distributed, asynchronous execution infrastructure’ built on Ray allows for efficient, large-scale concurrent evaluation across many computing resources, making the system highly scalable.
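The balance between exploration and exploitation described above can be illustrated with a small sketch. It is not FM Agent's actual sampler: the `Candidate` type, the token-based distance, and the rule for adapting `explore_weight` are all assumptions chosen to keep the example short.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    code: str     # solution text (program, heuristic, kernel, ...)
    score: float  # evaluator score, higher is better


def token_distance(a: str, b: str) -> float:
    """Jaccard distance between whitespace-token sets (a crude diversity proxy)."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    if not union:
        return 0.0
    return 1.0 - len(ta & tb) / len(union)


def mean_diversity(population) -> float:
    """Average pairwise distance across the population."""
    pairs = [(a, b) for i, a in enumerate(population) for b in population[i + 1:]]
    if not pairs:
        return 0.0
    return sum(token_distance(a.code, b.code) for a, b in pairs) / len(pairs)


def adaptive_explore_weight(population, floor=0.1, ceiling=0.9) -> float:
    """Hypothetical rule: explore more when the population has collapsed."""
    return min(ceiling, max(floor, 1.0 - mean_diversity(population)))


def sample_parent(population, explore_weight, rng):
    """Pick one parent, weighting by quality (exploit) and novelty (explore)."""
    scores = [c.score for c in population]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    weights = []
    for c in population:
        quality = (c.score - lo) / span
        others = [o for o in population if o is not c]
        novelty = (
            sum(token_distance(c.code, o.code) for o in others) / len(others)
            if others
            else 0.0
        )
        # Small epsilon keeps every weight strictly positive.
        weights.append((1 - explore_weight) * quality + explore_weight * novelty + 1e-9)
    return rng.choices(population, weights=weights, k=1)[0]
```

A caller would recompute `adaptive_explore_weight` each generation and pass the result to `sample_parent`, so that a population of near-identical solutions pushes the sampler toward novelty while a diverse one lets it favor high scorers.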

FM Agent has demonstrated broad applicability across diverse fields, showcasing its potential to accelerate innovation and automate complex discovery processes. In machine learning, it automates workflows by performing feature mining, combining features, adaptively fusing models, and solving end-to-end ML tasks. This means it can independently design and refine machine learning solutions, reducing the need for extensive human expertise.

For combinatorial optimization, which involves finding optimal objects from vast sets of possibilities, FM Agent can autonomously design novel problem-solving strategies (heuristics), enhance existing optimization software, and directly generate high-quality solutions for problems that are traditionally computationally prohibitive. This is crucial for applications like production scheduling and resource management.

GPU kernel optimization is a notoriously difficult task because of complex hardware interactions. FM Agent reformulates it as an autonomous, data-driven process: it generates diverse code candidates, evaluates their performance, and uses that feedback to continuously optimize CUDA kernels, achieving significant speedups over traditional methods.
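The generate, evaluate, and select cycle for kernels can be sketched in a few lines. This is only an illustration under strong assumptions: plain Python functions stand in for CUDA kernels, and every name here (`reference_sum_squares`, `benchmark`, `select_best`) is hypothetical. A real pipeline would compile and launch each candidate on a GPU before timing it.

```python
import time


def reference_sum_squares(xs):
    """Baseline implementation that defines the expected output."""
    total = 0
    for x in xs:
        total += x * x
    return total


def candidate_builtin(xs):
    """A candidate rewrite that should produce the same result."""
    return sum(x * x for x in xs)


def candidate_wrong(xs):
    """A deliberately incorrect candidate, to exercise the correctness gate."""
    return sum(xs)


def benchmark(fn, data, repeats=5):
    """Best-of-N wall-clock time in seconds for one call of fn(data)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return max(best, 1e-12)  # guard against a zero reading


def select_best(reference, candidates, data):
    """Return {name: speedup over reference} for candidates that are correct."""
    expected = reference(data)
    baseline = benchmark(reference, data)
    speedups = {}
    for name, fn in candidates.items():
        if fn(data) != expected:
            continue  # correctness gate: wrong output is never timed
        speedups[name] = baseline / benchmark(fn, data)
    return speedups
```

The correctness check comes before any timing, so a fast but wrong candidate can never win. The resulting speedups are the kind of feedback that would be handed back to the generator for the next round.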

Even in mathematics, FM Agent proves its versatility. Tasks such as theorem proving, inequality tightening, and bound estimation can be reframed as search problems. The system combines symbolic reasoning, numerical experimentation, and evolutionary exploration to iteratively refine solutions, often uncovering unexpected theoretical insights.
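To make "reframed as search problems" concrete, here is a minimal sketch for circle packing. It assumes one common formulation, maximizing the sum of radii of non-overlapping circles inside the unit square; the article does not say which variant FM Agent solved. The functions `packing_score` and `hill_climb` are illustrative names, and simple random hill climbing stands in for the LLM-guided evolutionary search.

```python
import math
import random


def packing_score(circles, tol=1e-9):
    """Sum of radii if the packing is valid, otherwise None.

    circles is a list of (x, y, r) tuples inside the unit square.
    """
    for i, (x, y, r) in enumerate(circles):
        inside = (
            r > 0
            and x - r >= -tol
            and x + r <= 1 + tol
            and y - r >= -tol
            and y + r <= 1 + tol
        )
        if not inside:
            return None
        for x2, y2, r2 in circles[i + 1:]:
            if math.hypot(x - x2, y - y2) < r + r2 - tol:
                return None  # two circles overlap
    return sum(r for _, _, r in circles)


def hill_climb(circles, steps, rng, step_size=0.02):
    """Perturb one circle at a time and keep only valid improvements."""
    best = list(circles)
    best_score = packing_score(best)
    if best_score is None:
        raise ValueError("initial packing must be valid")
    for _ in range(steps):
        i = rng.randrange(len(best))
        x, y, r = best[i]
        trial = list(best)
        trial[i] = (
            x + rng.uniform(-step_size, step_size),
            y + rng.uniform(-step_size, step_size),
            r + rng.uniform(-step_size, step_size),
        )
        score = packing_score(trial)
        if score is not None and score > best_score:
            best, best_score = trial, score
    return best, best_score
```

The evaluator is the important part: once validity and the objective are expressed as a single function, any search procedure can be plugged in on top of it.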

The framework operates in two main stages: a Cold Start Stage to build an initial diverse population of solutions, and an Evolve Stage where these solutions are iteratively improved through mutation and crossover mechanisms within a multi-population ‘island model’. This entire process is supported by a scalable distributed infrastructure and can optionally incorporate human-interactive feedback for real-time monitoring and knowledge integration.
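The two stages can be sketched on a toy numeric problem. Everything below is an assumption made for illustration: the genome is a list of floats, `fitness` is an arbitrary objective, and random mutation and crossover stand in for the LLM-driven operators that the article describes only at a high level.

```python
import random


def fitness(genome):
    """Toy objective (assumption): best value 0 at genome == [1, 2, 3]."""
    target = [1.0, 2.0, 3.0]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


def cold_start(seeds, size, rng):
    """Cold Start Stage: build a diverse population around expert seeds."""
    return [[g + rng.gauss(0, 0.5) for g in rng.choice(seeds)] for _ in range(size)]


def mutate(genome, rng):
    """Perturb one randomly chosen gene."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0, 0.2)
    return child


def crossover(a, b, rng):
    """Take each gene from one of the two parents at random."""
    return [rng.choice(pair) for pair in zip(a, b)]


def evolve(seeds, islands=3, size=8, generations=40, migrate_every=10, seed=0):
    """Evolve Stage: independent islands with periodic ring migration."""
    rng = random.Random(seed)
    pops = [cold_start(seeds, size, rng) for _ in range(islands)]
    for gen in range(1, generations + 1):
        for pop in pops:
            a, b = rng.sample(pop, 2)
            child = mutate(crossover(a, b, rng), rng)
            worst = min(range(size), key=lambda i: fitness(pop[i]))
            if fitness(child) > fitness(pop[worst]):
                pop[worst] = child  # replace the weakest member
        if gen % migrate_every == 0:
            # Each island receives a copy of its neighbor's best solution.
            bests = [max(pop, key=fitness) for pop in pops]
            for k, pop in enumerate(pops):
                worst = min(range(size), key=lambda i: fitness(pop[i]))
                pop[worst] = list(bests[k - 1])
    return max((g for pop in pops for g in pop), key=fitness)
```

Calling `evolve([[0.0, 0.0, 0.0]])` runs both stages from a single seed. Keeping islands separate between migrations lets each one explore its own region, while the occasional exchange spreads good solutions without collapsing diversity.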

In experiments, FM Agent achieved state-of-the-art results autonomously, without human intervention or tuning. It scored 1976.3 on ALE-Bench (+5.2%) and 43.56% on MLE-Bench (+4.0pp), and delivered speedups of up to 20x on KernelBench. It also established new state-of-the-art results on several classical mathematical problems, outperforming previous methods such as AlphaEvolve on circle packing, uncertainty inequality, and ratio minimization in 2D space.

The FM Agent represents a significant step towards autonomous AI research agents, promising substantial engineering and scientific advances with broader societal impact. For more details, you can refer to the full research paper.
