PRISM: A New Agentic Approach to Smarter Information Retrieval for Complex Questions

TLDR: PRISM is an agentic retrieval framework that uses Large Language Models (LLMs) to improve multi-hop question answering. It features three specialized agents: a Question Analyzer to break down complex questions, a Selector to filter for precision, and an Adder to ensure recall by recovering missing evidence. This iterative process creates compact and comprehensive evidence sets, leading to significantly higher retrieval accuracy and improved end-to-end QA performance across various benchmarks, outperforming existing methods and mitigating LLM limitations like ‘lost-in-the-middle’ and hallucination.

Answering complex questions accurately remains a significant challenge for AI systems, especially when the answer depends on evidence gathered from multiple sources, a task known as multi-hop question answering (QA). A new research paper introduces PRISM, which stands for Precision–Recall Iterative Selection Mechanism. The framework aims to improve how large language models (LLMs) retrieve information, making retrieval both more precise and more comprehensive.

The core idea behind PRISM is to use a system of specialized AI agents that work together in a structured loop. This agentic retrieval system is designed to overcome common limitations of LLMs, such as the ‘lost-in-the-middle’ phenomenon, where crucial information in long texts is overlooked, and the tendency to ‘hallucinate’ or generate incorrect information when context is incomplete or noisy.

How PRISM Works: The Three Agents

PRISM employs three distinct LLM-based agents, each with a specific role:

1. Question Analyzer Agent: This agent runs first. It takes a complex multi-hop question and breaks it down into smaller, more manageable sub-questions. For example, if asked, “Which painter who shared a house with Vincent van Gogh was married to a Danish ceramist?”, the Analyzer would decompose it into sub-questions like “Who shared a house with van Gogh?” and “Who was that person married to?”. This decomposition focuses the search and makes it less likely that a required piece of evidence is missed.

2. Selector Agent: After initial retrieval, many passages might seem relevant but are actually distractors. The Selector agent acts as a precision-focused filter. Its job is to meticulously review the candidate evidence and remove any passages that are definitely irrelevant to the sub-questions. This ensures that the downstream QA model receives a clean, compact set of highly relevant information, reducing noise and mitigating the risk of hallucinations.

3. Adder Agent: While the Selector focuses on precision, overly strict filtering can sometimes lead to missing crucial, complementary facts. The Adder agent is designed to address this by prioritizing recall. It re-examines the evidence that the Selector left behind and adds any missing pieces that are essential for completing the reasoning chain. This could include bridging facts that connect entities across different documents or filling logical gaps. The Selector and Adder agents work in an iterative loop, refining the evidence set until it is both compact and complete.

This iterative refinement loop, where the Selector prunes for precision and the Adder expands for recall, is a key innovation. It ensures that the final set of supporting passages is not only highly relevant but also comprehensive enough to answer multi-hop questions accurately.
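The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `max_rounds` cap, and the "stop when nothing changes" rule are all hypothetical, and the three toy agents are keyword stand-ins for what would be LLM calls in PRISM. The passages are made-up placeholders.

```python
from typing import Callable, List

# Hypothetical agent signatures. In PRISM each agent is an LLM call; the
# toy functions below are keyword stand-ins so the sketch runs offline.
Analyzer = Callable[[str], List[str]]
Selector = Callable[[List[str], List[str]], List[str]]
Adder = Callable[[List[str], List[str], List[str]], List[str]]


def refine_evidence(question: str, candidates: List[str], analyze: Analyzer,
                    select: Selector, add: Adder, max_rounds: int = 3) -> List[str]:
    """Alternate precision (select) and recall (add) passes over the evidence."""
    sub_questions = analyze(question)
    evidence = list(candidates)
    for _ in range(max_rounds):  # max_rounds is an assumed safety cap
        kept = select(sub_questions, evidence)
        left_behind = [p for p in candidates if p not in kept]
        recovered = add(sub_questions, kept, left_behind)
        updated = kept + [p for p in recovered if p not in kept]
        if updated == evidence:  # assumed stopping rule: nothing changed
            break
        evidence = updated
    return evidence


# Made-up placeholder passages, not real retrieval output.
CANDIDATES = [
    "Painter A shared a house with Vincent van Gogh.",
    "Painter A was married to Person B, a ceramist.",
    "Painter C was married to Person D.",
    "An unrelated passage about a museum.",
]


def toy_analyze(question: str) -> List[str]:
    return ["Who shared a house with van Gogh?", "Who was that person married to?"]


def toy_select(sub_questions: List[str], passages: List[str]) -> List[str]:
    # Deliberately over-strict: keeps only passages matching the first hop.
    return [p for p in passages if "shared a house" in p]


def toy_add(sub_questions: List[str], kept: List[str], left_behind: List[str]) -> List[str]:
    # Recovers marriage passages about an entity already present in kept evidence.
    entities = [e for e in ("Painter A", "Painter C") if any(e in k for k in kept)]
    return [p for p in left_behind if "married" in p and any(e in p for e in entities)]


final_evidence = refine_evidence(
    "Which painter who shared a house with Vincent van Gogh was married to a Danish ceramist?",
    CANDIDATES, toy_analyze, toy_select, toy_add,
)
```

In this toy run the over-strict selector keeps only the first-hop passage, and the adder restores the bridging passage about the same painter while leaving the two distractors out, which is the precision-then-recall behavior the article describes.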

Answering the Question: The Answer Generator

Once the Question Analyzer, Selector, and Adder agents have collaboratively constructed a compact and comprehensive set of supporting evidence, this refined context is passed to an Answer Generator agent. This agent, also an LLM, then uses the provided evidence to generate the final answer to the original complex question. The researchers implemented this in a zero-shot setting, meaning the LLM was not specifically fine-tuned for the task, allowing for a direct assessment of how improved retrieval quality impacts the final answer accuracy.
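As a rough illustration of what passing the refined context to a zero-shot answer generator might look like, the sketch below assembles a prompt from the question and the selected passages. The template wording and the function name `build_zero_shot_prompt` are assumptions made for this example; the article does not give the paper's actual prompt. No worked examples are included in the prompt, in keeping with the zero-shot setting.

```python
from typing import List


def build_zero_shot_prompt(question: str, evidence: List[str]) -> str:
    """Assemble a prompt with numbered evidence and no in-context examples."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(evidence, start=1))
    return (
        "Answer the question using only the evidence below.\n\n"
        f"Evidence:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder passages, standing in for the refined evidence set.
prompt = build_zero_shot_prompt(
    "Who was that person married to?",
    ["Painter A shared a house with Vincent van Gogh.",
     "Painter A was married to Person B."],
)
```

The resulting string would then be sent to whichever LLM serves as the Answer Generator. Because the evidence set is compact, the prompt stays short, which is the property the article ties to avoiding the 'lost-in-the-middle' effect.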

Performance and Impact

Experiments conducted on several multi-hop QA benchmarks, including HotpotQA, 2WikiMultiHopQA, MuSiQue, and MultiHopRAG, demonstrated that PRISM consistently outperforms strong baseline methods. For instance, on HotpotQA, PRISM achieved a recall of 90.9% compared to 61.5% for a single-pass retriever and 72.8% for IRCoT, another advanced retrieval method. Similar significant gains were observed across other datasets, particularly on the challenging MuSiQue benchmark.
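For readers unfamiliar with how retrieval recall is typically scored, the sketch below computes passage-level precision and recall as set overlap between retrieved passages and the gold supporting passages. This is an assumed, common formulation; the paper's exact evaluation protocol may differ, and the passage IDs are invented for the example.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], gold: Iterable[str]) -> Tuple[float, float]:
    """Passage-level precision and recall against gold supporting passages."""
    retrieved_set, gold_set = set(retrieved), set(gold)
    hits = len(retrieved_set & gold_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    return precision, recall


# Invented passage IDs: 2 of 3 retrieved passages are gold, 2 of 4 gold are found.
precision, recall = precision_recall(["d1", "d2", "d5"], ["d1", "d2", "d3", "d4"])
```

Here precision is 2/3 and recall is 0.5. A Selector-style filter raises the first number by dropping `d5`, while an Adder-style pass raises the second by recovering `d3` and `d4`.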

The improved retrieval quality directly translated into stronger end-to-end question answering performance. PRISM achieved state-of-the-art accuracy on HotpotQA, MuSiQue, and MultiHopRAG, and remained highly competitive on 2WikiMultiHopQA. This suggests that giving LLMs compact, comprehensive, low-noise evidence matters for multi-hop reasoning.

The framework also showed robustness across different LLMs, including GPT-4o, Gemini-2.5-Flash-Lite, and DeepSeek. While absolute scores varied, the precision-recall balancing mechanism consistently delivered high recall and competitive QA accuracy, indicating that PRISM’s design is not tied to a specific LLM architecture.

In conclusion, PRISM represents a significant step forward in multi-hop question answering. By treating retrieval as an active, agent-driven process that collaborates with the QA model, it provides a principled way to build more reliable and reasoning-centric retrieval systems. For more technical details, you can refer to the original research paper.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
