TLDR: DecoupleSearch is a novel Agentic RAG framework that improves Large Language Model (LLM) performance by decoupling planning and search processes using dual value models. It employs Monte Carlo Tree Search (MCTS) for evaluating reasoning steps and Hierarchical Beam Search for efficient exploration and pruning of candidate plans and search results. This approach leads to more accurate and reliable answers, especially for complex, multi-step reasoning tasks, and allows smaller LLMs to achieve competitive performance.
Large Language Models (LLMs) have shown incredible capabilities across many tasks, but they sometimes struggle to stay factual, producing confident yet incorrect statements known as ‘hallucinations’. To combat this, Retrieval-Augmented Generation (RAG) systems integrate external knowledge, allowing LLMs to pull in verifiable information and improve accuracy.
A more advanced approach, Agentic RAG, introduces autonomous AI agents into this process. These agents plan their reasoning steps and then search for relevant information iteratively until they reach a final answer. However, Agentic RAG faces its own hurdles: the quality of each step depends on both sound planning and accurate searching; intermediate reasoning steps rarely receive direct feedback; and the sheer number of possible plans and searches creates an overwhelmingly large space to explore.
Introducing DecoupleSearch
To tackle these challenges, researchers have proposed a new framework called DecoupleSearch. This innovative approach separates the planning and search processes by using two distinct ‘value models’. This separation allows for independent optimization of how the agent thinks (planning) and how it finds information (searching).
DecoupleSearch builds a ‘reasoning tree’ in which each node represents a planning or search action. To evaluate the quality of each step, it uses a technique called Monte Carlo Tree Search (MCTS). At inference time, a method called Hierarchical Beam Search iteratively refines candidate plans and search results, guided by the dual value models, to find the best path to an answer.
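To make the structure concrete, here is a minimal sketch of such a reasoning tree, where plan and search nodes alternate along each path. The class and field names are illustrative assumptions, not the paper's actual code:

```python
# A minimal sketch of the alternating plan/search reasoning tree; the
# class and field names are illustrative, not the paper's actual code.
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                  # "plan" or "search"
    content: str               # a plan text, or a query plus retrieved documents
    value: float = 0.0         # value estimate refined during MCTS annotation
    visits: int = 0            # visit count used by the MCTS selection rule
    children: list["Node"] = field(default_factory=list)


# Plan and search nodes alternate along each path from the root:
root = Node(kind="plan", content="Decompose the question into sub-questions")
root.children.append(Node(kind="search", content="sub-question 1 -> retrieved docs"))
```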
How DecoupleSearch Works
The core idea is to enhance the probability of success at each reasoning step. DecoupleSearch introduces phases for ‘planning exploration’ and ‘search exploration’. The agent first generates several possible plans, which are then evaluated by a ‘planning value model’ to pick the most promising ones. Based on these selected plans, the agent generates multiple search queries to retrieve documents. These search results are then ranked by a ‘search value model’ to ensure the retrieved information is reliable.
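The sketch below shows what a single decoupled step might look like. Every callable here (`generate_plans`, `plan_value`, `generate_queries`, `retrieve`, `search_value`) is a hypothetical stand-in for the policy LLM, the two value models, and the retriever; the paper's actual interfaces may differ:

```python
# A hedged sketch of one decoupled reasoning step; all callables are
# hypothetical stand-ins, not the paper's actual interfaces.
def expand_step(state, generate_plans, plan_value, generate_queries,
                retrieve, search_value, k_plan=3, k_search=5, keep=2):
    # Planning exploration: sample candidate plans, score them with the
    # planning value model, and keep only the most promising ones.
    plans = sorted(generate_plans(state, n=k_plan),
                   key=plan_value, reverse=True)[:keep]

    # Search exploration: for each surviving plan, issue queries, retrieve
    # documents, and score the results with the search value model.
    continuations = []
    for plan in plans:
        for query in generate_queries(state, plan, n=k_search):
            docs = retrieve(query)
            new_state = state + [(plan, query, docs)]
            continuations.append((new_state, search_value(docs)))
    return continuations  # the caller keeps the best-scored continuations
```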
MCTS plays a crucial role in efficiently assessing the quality of each reasoning step. During simulations, the LLM itself acts as a judge, evaluating the quality of both planning and search results separately. Rewards from the correctness of the final answer are then used to refine the LLM’s internal scores, correcting any potential inaccuracies. To manage the vast number of possibilities, the planning and search spaces are pruned using these value models, which are trained on signals derived from the MCTS annotation process.
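Conceptually, the value target for each node can be thought of as a blend of the LLM's own judgment and the Monte Carlo estimate backed up from rollouts. The weighting below is an illustrative assumption, not the paper's exact formula:

```python
# Illustrative calibration of the LLM-judge score with rollout rewards.
# The paper combines the two during MCTS annotation; this particular
# weighting (alpha) is an assumption, not its exact formula.
def value_target(llm_judge_score, rollout_rewards, alpha=0.5):
    """Blend the LLM's self-assessed step quality (in [0, 1]) with the
    average final-answer correctness of rollouts through this node."""
    mc_estimate = sum(rollout_rewards) / max(len(rollout_rewards), 1)
    return alpha * llm_judge_score + (1 - alpha) * mc_estimate
```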
During inference, Hierarchical Beam Search ensures thorough exploration of the joint planning and search space. At each step, multiple plans are generated and filtered by the planning value model. Then, based on the best plans, search queries are created, and the retrieved documents are evaluated by the search value model so that only the most valuable ones are kept. This iterative process continues until a final answer is reached.
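Putting it together, here is a hedged sketch of that inference loop, reusing the `expand_step` idea above. The helper names (`is_final`, `extract_answer`) and the beam width are assumptions:

```python
# Sketch of the Hierarchical Beam Search inference loop; helper names
# and beam width are illustrative assumptions.
def hierarchical_beam_search(question, step_fn, is_final, extract_answer,
                             beam_width=2, max_steps=8):
    beam = [[question]]  # each entry is a partial trajectory (list of steps)
    for _ in range(max_steps):
        candidates = []  # (extended_trajectory, search-value score) pairs
        for state in beam:
            if is_final(state):
                return extract_answer(state)
            candidates.extend(step_fn(state))  # e.g., expand_step(state, ...)
        # Keep the highest-scoring trajectories for the next iteration.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = [state for state, _ in candidates[:beam_width]]
    return extract_answer(beam[0])
```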
Also Read:
- FAIR-RAG: Enhancing LLM Accuracy with Evidence-Driven Iterative Refinement
- RaCoT: Enhancing LLM Reasoning Reliability with Pre-Retrieval Contrastive Thinking
Key Advantages and Findings
The DecoupleSearch framework offers several significant contributions. It decouples planning and search with dual value models, allowing for independent optimization. It also improves the success rate of each step by fully exploring planning and search spaces, using MCTS for accurate assessment and Hierarchical Beam Search for efficient pruning.
Extensive experiments across various question-answering datasets and different LLM sizes (such as Qwen2.5-7B-Instruct and Qwen2.5-14B-Instruct) demonstrate the method's effectiveness. DecoupleSearch consistently outperformed existing baselines by a notable average margin. Interestingly, the 7B model with Hierarchical Beam Search became comparable to the larger 14B model, suggesting that inference-time scaling techniques can help smaller models achieve competitive results.
An ablation study revealed that both planning and search exploration are vital, with planning exploration having a more significant impact. This is because a good plan sets the stage for effective searching. The study also explored the impact of ‘planning expansion size’ and ‘search expansion size’ hyperparameters. While a planning expansion size of around 3 seemed optimal, larger search expansion sizes generally led to better performance, as evaluating search results is often more straightforward than evaluating abstract plans.
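In configuration terms, the ablation suggests something like the following shape. The exact values below are assumptions for illustration, not the paper's reported settings:

```python
# Illustrative hyperparameter shape suggested by the ablation: a modest
# planning expansion size (around 3) and a larger search expansion size.
config = {
    "planning_expansion_size": 3,  # candidate plans sampled per step
    "search_expansion_size": 8,    # queries/documents explored per plan
    "beam_width": 2,               # trajectories kept by Hierarchical Beam Search
}
```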
The effectiveness of the value models was also confirmed; using them to rank plans and search results consistently outperformed random selection. This highlights their ability to accurately gauge the quality of intermediate steps.
For a deeper dive into the technical details, you can read the full research paper here.