TLDR: Tongyi DeepResearch is an open-source agentic large language model designed for complex, long-horizon information-seeking research tasks. It uses an end-to-end training framework combining agentic mid-training and post-training, powered by a scalable, automated data synthesis pipeline and customized environments. Equipped with tools such as Search and Google Scholar, it achieves state-of-the-art performance on various deep research benchmarks, and a “Heavy Mode” raises accuracy further through test-time scaling. The project aims to democratize agentic intelligence and evolve towards general-purpose AI agents.
Tongyi DeepResearch is a new open-source agentic large language model from the Tongyi DeepResearch Team. It is built to tackle complex, long-horizon information-seeking tasks, with the goal of enhancing human intellectual productivity.
What is Tongyi DeepResearch?
Tongyi DeepResearch is an AI agent designed to autonomously conduct multi-step reasoning and gather information from the internet for intricate research problems. Unlike many existing deep research systems that remain proprietary, Tongyi DeepResearch is open-source, making its framework and solutions accessible to the wider AI community. The model has 30.5 billion total parameters, of which only 3.3 billion are activated per token.
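To put those two figures in proportion, the short calculation below works out what share of the model is active for each token. It uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the article, in billions of parameters.
TOTAL_PARAMS_B = 30.5
ACTIVE_PARAMS_B = 3.3

# Share of the model's parameters used for any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active per token: {active_fraction:.1%} of all parameters")
```

In other words, roughly one parameter in nine takes part in processing a given token.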
How Does It Learn?
The model’s development is rooted in a unique, end-to-end training framework that combines “agentic mid-training” and “agentic post-training.” This approach allows the model to gradually develop from basic interaction skills to advanced autonomous research behaviors. Agentic mid-training provides the model with foundational agentic knowledge, bridging the gap between general pre-training and specialized agentic tasks. Agentic post-training further refines its capabilities through reinforcement learning, where the model interacts with environments and learns from reward signals based on the correctness of its answers.
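As a rough illustration of a correctness-based reward, the sketch below scores an answer 1.0 when it matches a reference and 0.0 otherwise. The article does not say how correctness is judged, so the normalized exact-match rule and the function names here are assumptions made for the example, not the team's method.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def correctness_reward(predicted: str, reference: str) -> float:
    """Assumed rule: 1.0 if the normalized answers are identical, else 0.0."""
    return 1.0 if normalize(predicted) == normalize(reference) else 0.0


# A formatting difference alone does not cost the reward.
print(correctness_reward("  Paris. ", "paris"))
print(correctness_reward("London", "paris"))
```

A real system would likely need a more forgiving judge, since open-ended research answers rarely match a reference string exactly.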
A key innovation in its training is a fully automated and highly scalable data synthesis pipeline. This pipeline generates diverse, high-quality agent trajectories without relying on costly human annotation. It creates research-level questions, plans actions, generates reasoning chains, and models decision-making processes, ensuring that the model is exposed to a rich variety of problem-solving scenarios. The training also utilizes customized environments, ranging from “Prior World” (simulated based on pre-trained knowledge) to “Simulated” (controlled replicas of real-world interactions) and “Real-world” environments, each offering a different balance of stability, fidelity, and cost.
Tools and Interaction
Tongyi DeepResearch is equipped with a versatile set of tools to interact with its environment, including Search, Visit (for web pages), Python Interpreter, Google Scholar, and File Parser. These tools enable it to perform comprehensive information gathering and analysis. The model’s interaction process is based on the ReAct framework, which interleaves thought and action, allowing for dynamic reasoning and execution. It also incorporates a context management paradigm to handle long-horizon tasks efficiently, preventing context overflow by focusing on essential information at each step.
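The thought-action-observation cycle described above can be sketched in a few lines. Everything in this example is hypothetical: the two tool functions return canned strings, a scripted policy stands in for the language model, and the `keep_last` window is only a crude stand-in for the context management the article mentions.

```python
from typing import Callable, Dict, List, Optional, Tuple


# Hypothetical tools that return canned text instead of calling real services.
def search(query: str) -> str:
    return f"results for '{query}': example.org/page"


def visit(url: str) -> str:
    return f"content of {url}: the answer is 42"


TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "visit": visit}

Step = Tuple[str, str, str]  # (thought, action name, action argument)


def scripted_policy(question: str, recent: List[str]) -> Step:
    """Toy replacement for the model: chooses a step from the latest observation."""
    if not recent:
        return ("I need sources, so I will search.", "search", question)
    last = recent[-1]
    if last.startswith("results for"):
        return ("I will open the top result.", "visit", last.rsplit(" ", 1)[-1])
    return ("The page states the answer.", "answer", last.rsplit(" ", 1)[-1])


def react_loop(
    question: str,
    policy: Callable[[str, List[str]], Step],
    max_steps: int = 5,
    keep_last: int = 2,
) -> Tuple[Optional[str], List[str]]:
    """Interleave thoughts and actions until the policy answers or steps run out."""
    observations: List[str] = []
    trace: List[str] = []
    for _ in range(max_steps):
        # The policy only sees the most recent observations, not the full history.
        recent = observations[-keep_last:]
        thought, action, arg = policy(question, recent)
        trace.append(f"Thought: {thought}")
        trace.append(f"Action: {action}({arg})")
        if action == "answer":
            return arg, trace
        observation = TOOLS[action](arg)
        observations.append(observation)
        trace.append(f"Observation: {observation}")
    return None, trace


answer, trace = react_loop("What is the answer?", scripted_policy)
print("\n".join(trace))
print("Final answer:", answer)
```

Capping what the policy sees at each step is what keeps the prompt from growing without bound on long tasks; a production system would summarize rather than simply drop older observations.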
Impressive Performance
Empirical evaluations show that Tongyi DeepResearch achieves state-of-the-art performance across a range of agentic deep research benchmarks. It outperforms strong baselines such as OpenAI o3 and DeepSeek-V3.1 on benchmarks including Humanity’s Last Exam, BrowseComp, BrowseComp-ZH, WebWalkerQA, GAIA, xbench-DeepSearch, and FRAMES. These results demonstrate its effectiveness on both English and Chinese tasks and its strong generalization capabilities.
The research also introduces a “Heavy Mode,” which further enhances performance by leveraging test-time scaling through a Research-Synthesis framework. This mode deploys multiple parallel agents to explore diverse solution paths, and then a synthesis model consolidates these findings to produce a final, more robust answer. This approach significantly improves accuracy on challenging benchmarks.
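The parallel-then-consolidate pattern can be sketched as follows. This is a minimal illustration under stated assumptions: the agents are toy functions with fixed outputs, and a simple majority vote stands in for the synthesis model, which in the real system is a model that reads and merges the agents' findings.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Agent = Callable[[str], str]


def toy_agent(seed: int) -> Agent:
    """Hypothetical research agent; different seeds mimic different solution paths."""
    def run(question: str) -> str:
        # Deterministic toy behavior: every third seed lands on a wrong answer.
        return "wrong" if seed % 3 == 0 else "right"
    return run


def synthesize(reports: List[str]) -> str:
    """Stand-in for the synthesis model: return the most common report."""
    return Counter(reports).most_common(1)[0][0]


def heavy_mode(question: str, agents: List[Agent]) -> str:
    """Run all agents in parallel, then consolidate their reports."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        reports = list(pool.map(lambda agent: agent(question), agents))
    return synthesize(reports)


agents = [toy_agent(seed) for seed in range(1, 6)]
final = heavy_mode("example question", agents)
print(final)
```

With five agents and one of them wrong, the consolidated answer still comes out correct, which is the intuition behind spending more compute at test time.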
Future Outlook
The Tongyi DeepResearch team is committed to advancing deep research agents. They envision evolving from domain-specific agents to general-purpose agents capable of autonomous reasoning, planning, and action across diverse domains with minimal human supervision. This work represents a significant step towards AI systems that can autonomously transform information into insight, empowering individuals and organizations with enhanced productivity and innovation. For more technical details, you can refer to the original research paper.


