TLDR: The FM Agent is a novel multi-agent AI framework that combines large language model reasoning with evolutionary search to autonomously solve complex real-world problems. It features cold-start initialization, adaptive sampling, domain-specific evaluators, and a distributed infrastructure. The system has achieved state-of-the-art results across diverse domains including machine learning, combinatorial optimization, GPU kernel generation, and mathematics, demonstrating its potential to accelerate scientific and engineering discovery without human intervention.
A new multi-agent framework, FM Agent, targets complex scientific and engineering problems. Developed by the FM Agent Team at Baidu AI Cloud, the system combines the reasoning capabilities of large language models (LLMs) with large-scale evolutionary search to autonomously discover and refine solutions for real-world problems.
The core of FM Agent integrates several key innovations designed to enhance its performance and scalability. Firstly, a ‘cold-start initialization’ phase incorporates expert guidance to generate a diverse and high-quality initial set of solutions, ensuring the evolutionary search begins from a strong foundation. Secondly, an ‘adaptive diversity-driven sampling’ strategy optimizes the search process by balancing exploration of new ideas with the exploitation of promising ones, using multiple parallel evolutionary ‘islands’. Thirdly, ‘domain-specific evaluators’ provide nuanced feedback by considering correctness, effectiveness, and LLM-supervised quality assessments, guiding the system’s iterative improvements. Finally, a ‘distributed, asynchronous execution infrastructure’ built on Ray allows for efficient, large-scale concurrent evaluation across many computing resources, making the system highly scalable.
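To make the exploration/exploitation trade-off concrete, here is a minimal Python sketch of diversity-aware parent sampling on a single island. It is an illustration only, not FM Agent's actual sampling rule: the `Candidate` type, the `diversity` proxy (fraction of distinct candidate texts), and the rule "explore more when diversity is low" are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one solution on an island."""
    code: str
    score: float  # higher is better


def diversity(population):
    """Crude diversity proxy (assumption): fraction of distinct candidate texts."""
    return len({c.code for c in population}) / len(population)


def sample_parent(population, rng):
    """Pick a parent, exploring more when the island has collapsed onto few ideas."""
    explore_prob = 1.0 - diversity(population)
    if rng.random() < explore_prob:
        # Exploration: any candidate may be chosen, regardless of score.
        return rng.choice(population)
    # Exploitation: reuse the best-scoring candidate found so far.
    return max(population, key=lambda c: c.score)
```

With a fully diverse island the sketch always exploits the current best; as duplicates accumulate, it increasingly picks parents uniformly at random. A real system would likely use a richer diversity measure than exact text equality.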
FM Agent has demonstrated broad applicability across diverse fields. In machine learning, it autonomously performs feature mining, feature combination, adaptive model fusion, and end-to-end ML tasks. In other words, it can design and refine machine learning solutions on its own, reducing the need for extensive human expertise.
For combinatorial optimization, which involves finding the best solution among a very large set of discrete candidates, FM Agent can autonomously design novel problem-solving strategies (heuristics), enhance existing optimization software, and directly generate high-quality solutions for problems that are traditionally computationally prohibitive. This matters for applications like production scheduling and resource management.
In the realm of GPU kernel optimization, a notoriously difficult task due to complex hardware interactions, FM Agent reformulates it as an autonomous, data-driven process. By generating diverse code candidates, evaluating their performance, and using feedback, it continuously optimizes CUDA kernels, achieving significant speedups over traditional methods.
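The generate, evaluate, and feed back loop can be sketched in a few lines. The example below is a Python stand-in, not CUDA and not FM Agent's evaluator: the candidate functions, the `evaluate` and `rank_candidates` helpers, and the "speedup over a reference" metric are assumptions chosen for illustration. It shows the one property any such evaluator needs: a candidate must be correct before its speed counts.

```python
import timeit


def reference(xs):
    """Baseline implementation that defines the expected output."""
    total = 0
    for x in xs:
        total += x * x
    return total


# Hypothetical candidates, standing in for generated kernel variants.
CANDIDATES = {
    "loop": reference,
    "generator": lambda xs: sum(x * x for x in xs),
    "wrong": lambda xs: sum(xs),  # fast but incorrect: must be rejected
}


def evaluate(fn, data, expected, repeats=5):
    """Return the speedup over the reference, or None if the output is wrong."""
    if fn(data) != expected:
        return None
    base = min(timeit.repeat(lambda: reference(data), number=20, repeat=repeats))
    cand = min(timeit.repeat(lambda: fn(data), number=20, repeat=repeats))
    return base / cand


def rank_candidates(candidates, data):
    """Keep only correct candidates, mapped to their measured speedup."""
    expected = reference(data)
    results = {}
    for name, fn in candidates.items():
        speedup = evaluate(fn, data, expected)
        if speedup is not None:
            results[name] = speedup
    return results
```

In a search loop, the surviving speedups would serve as the fitness signal for the next round of candidate generation. Measured timings vary from run to run, so only the correctness filter is deterministic here.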
Even in mathematics, FM Agent proves its versatility. Tasks such as theorem proving, inequality tightening, and bound estimation can be reframed as search problems. The system combines symbolic reasoning, numerical experimentation, and evolutionary exploration to iteratively refine solutions, often uncovering unexpected theoretical insights.
The framework operates in two main stages: a Cold Start Stage to build an initial diverse population of solutions, and an Evolve Stage where these solutions are iteratively improved through mutation and crossover mechanisms within a multi-population ‘island model’. This entire process is supported by a scalable distributed infrastructure and can optionally incorporate human-interactive feedback for real-time monitoring and knowledge integration.
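The two stages can be illustrated with a toy island-model evolutionary search. This is a generic sketch under stated assumptions, not FM Agent's implementation: candidates here are bit strings scored by the number of ones (the classic OneMax toy problem), the cold start is plain random initialization rather than expert-guided LLM generation, and the ring migration scheme, population sizes, and mutation rate are all arbitrary choices.

```python
import random


def fitness(bits):
    """Toy objective (assumption): count of ones in the bit string."""
    return sum(bits)


def cold_start(rng, n_islands, pop_size, length):
    """Stage 1: build an initial population for every island."""
    return [[[rng.randint(0, 1) for _ in range(length)]
             for _ in range(pop_size)]
            for _ in range(n_islands)]


def mutate(bits, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve_island(pop, rng):
    """One generation: keep the best individual, refill with mutated children."""
    ranked = sorted(pop, key=fitness, reverse=True)
    parents = ranked[: max(2, len(pop) // 2)]
    children = [ranked[0]]  # elitism: the island's best is never lost
    while len(children) < len(pop):
        a, b = rng.sample(parents, 2)
        children.append(mutate(crossover(a, b, rng), rng))
    return children


def migrate(islands):
    """Ring migration: each island's worst is replaced by its neighbor's best."""
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = list(bests[i - 1])


def run(seed=0, n_islands=4, pop_size=20, length=32,
        generations=40, migrate_every=5):
    """Stage 2: evolve all islands, exchanging solutions periodically."""
    rng = random.Random(seed)
    islands = cold_start(rng, n_islands, pop_size, length)
    start_best = max(fitness(ind) for pop in islands for ind in pop)
    for gen in range(1, generations + 1):
        islands = [evolve_island(pop, rng) for pop in islands]
        if gen % migrate_every == 0:
            migrate(islands)
    end_best = max(fitness(ind) for pop in islands for ind in pop)
    return start_best, end_best
```

Calling `run()` returns the best fitness before and after evolution. Because each island keeps its best individual, the final value can never be lower than the starting one. Islands evolve independently between migrations, which is what lets separate populations pursue different ideas before sharing them.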
In experiments, FM Agent achieved state-of-the-art results autonomously, without human intervention or tuning. It scored 1976.3 on ALE-Bench (+5.2%) and 43.56% on MLE-Bench (+4.0pp), and delivered speedups of up to 20x on KernelBench. It also set new state-of-the-art results on several classical mathematical problems, outperforming previous methods such as AlphaEvolve on circle packing, uncertainty inequality, and ratio minimization in 2D space.
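The two gains above are reported in different units, and the difference matters when comparing them. The short Python sketch below shows how each would be inverted to recover a prior score, assuming that "+5.2%" is a relative improvement and "+4.0pp" is an absolute difference in percentage points. The article does not state the baselines, so the helper names and any value these functions return are illustrative only.

```python
def implied_baseline_relative(score, rel_gain_pct):
    """Prior score if `score` is a relative gain of `rel_gain_pct` percent over it."""
    return score / (1 + rel_gain_pct / 100)


def implied_baseline_points(score_pct, gain_pp):
    """Prior score if `score_pct` is `gain_pp` percentage points above it."""
    return score_pct - gain_pp


# Inputs are the figures quoted in the article; outputs depend on the
# assumptions stated above and are not reported results.
ale_prior = implied_baseline_relative(1976.3, 5.2)
mle_prior = implied_baseline_points(43.56, 4.0)
```

A relative gain is undone by division, while a percentage-point gain is undone by subtraction; mixing the two up would misstate the size of either improvement.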
The FM Agent represents a significant step towards autonomous AI research agents, promising substantial engineering and scientific advances with broader societal impact. For more details, you can refer to the full research paper.


