TLDR: CORE is a novel method that uses reinforcement learning to achieve lossless context compression for Retrieval-Augmented Generation (RAG) in Large Language Models (LLMs). By optimizing the compression process based on end-task performance, CORE significantly reduces input length and computational costs while not only preventing performance degradation but also improving answer accuracy, demonstrating strong generalization across various datasets and LLMs.
Large Language Models (LLMs) have transformed how we interact with AI, demonstrating impressive capabilities in understanding and generating human-like text. However, these powerful models often struggle with staying up-to-date with the latest information and maintaining factual accuracy. This is where Retrieval-Augmented Generation (RAG) comes into play, a technique that enhances LLMs by allowing them to retrieve relevant documents from vast knowledge bases and use this information to inform their responses.
While RAG significantly boosts the performance of LLMs on knowledge-intensive tasks, it introduces a new challenge: the sheer volume of retrieved documents can make the input context excessively long. This leads to higher computational costs and can even make it difficult for the LLM to effectively utilize all the information, sometimes overlooking crucial details buried within the lengthy text.
Previous attempts to address this issue have focused on compressing these retrieved documents into shorter texts before feeding them to the LLM. However, many of these methods often compromise the accuracy of the final output. The main hurdle has been the lack of clear targets for what constitutes an “ideal” compressed summary, forcing many approaches to rely on fixed rules that don’t guarantee the compressed content will truly support the LLM’s task.
Introducing CORE: Lossless Compression with Reinforcement Learning
To overcome these limitations, researchers have developed CORE (COmpression via REinforcement learning), a novel method designed to achieve “lossless” context compression for RAG. Lossless here means that the compression doesn’t degrade the end-task performance of the LLM; in fact, CORE often improves it. This innovative approach leverages reinforcement learning (RL) to optimize the compression process without needing predefined compression labels.
At its heart, CORE uses the LLM’s end-task performance, specifically the accuracy of its answers, as the reward signal that guides training of a dedicated compressor model. Training is implemented with Group Relative Policy Optimization (GRPO), which lets the compressor learn to generate summaries that maximize the accuracy of the answers produced by the LLM. This end-to-end framework keeps the compressor goal-oriented: it learns to produce exactly the summaries that best help the LLM answer correctly.
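The sketch below (not the authors' code) illustrates this idea in plain Python: the compressor samples a small group of candidate summaries for a query, the downstream LLM answers from each, exact-match correctness becomes the reward, and GRPO-style group-relative advantages are computed from the group's reward statistics. `compressor_generate` and `reader_answer` are hypothetical stand-ins for the actual models.

```python
# Minimal sketch of the reward signal described above: the frozen reader LLM's
# exact-match accuracy on each candidate summary is the reward, and GRPO uses
# group-relative advantages, so no separate value network is needed.
from typing import Callable, List
import re
import string


def normalize(text: str) -> str:
    """Standard QA normalization: lowercase, drop punctuation/articles, squeeze spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_reward(prediction: str, gold_answers: List[str]) -> float:
    """Reward = 1.0 if the reader's answer matches any gold answer, else 0.0."""
    return float(any(normalize(prediction) == normalize(g) for g in gold_answers))


def group_relative_advantages(rewards: List[float]) -> List[float]:
    """GRPO-style advantage: center and scale each reward by the group's statistics."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    return [(r - mean) / (var ** 0.5 + 1e-6) for r in rewards]


def score_candidate_summaries(
    question: str,
    documents: List[str],
    gold_answers: List[str],
    compressor_generate: Callable[[str, List[str], int], List[str]],  # hypothetical
    reader_answer: Callable[[str, str], str],                          # hypothetical
    group_size: int = 8,
):
    """Sample a group of summaries, reward each by reader exact match, return advantages."""
    summaries = compressor_generate(question, documents, group_size)
    rewards = [
        exact_match_reward(reader_answer(question, s), gold_answers) for s in summaries
    ]
    return summaries, rewards, group_relative_advantages(rewards)
```

Only the compressor is updated with these advantages; the answering LLM stays frozen, so the reward directly measures how useful each summary is to the model that has to answer from it.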
The CORE framework is designed to be efficient. The compressor model itself is intentionally much smaller than the main LLM, ensuring that the computational benefits of compression are not offset by a large, complex compressor. The training process also includes a “distillation warm-up” phase, where a very large language model acts as a teacher to provide an initial strong policy for the smaller compressor, ensuring stable and effective reinforcement learning.
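As a rough illustration of that warm-up, the following sketch fine-tunes the small compressor with ordinary next-token cross-entropy on summaries written by a large teacher model before RL begins. It assumes a Hugging Face-style tokenizer and causal LM interface; `teacher_summarize`, `compressor`, and `tokenizer` are hypothetical placeholders, not the authors' implementation.

```python
# Distillation warm-up sketch: supervise the small compressor on teacher-written
# summaries so that RL starts from a reasonable policy.
from typing import Callable, Iterable, List, Tuple


def build_warmup_examples(
    samples: Iterable[Tuple[str, List[str]]],
    teacher_summarize: Callable[[str, List[str]], str],  # hypothetical teacher LLM call
    tokenizer,                                            # Hugging Face-style tokenizer (assumed)
    max_len: int = 1024,
):
    """Turn (question, documents) pairs into (prompt + teacher summary) training examples."""
    examples = []
    for question, documents in samples:
        prompt = "Question: " + question + "\nDocuments:\n" + "\n".join(documents) + "\nSummary:"
        target = teacher_summarize(question, documents)  # the teacher's reference summary
        encoded = tokenizer(prompt + " " + target, truncation=True,
                            max_length=max_len, return_tensors="pt")
        examples.append(encoded)
    return examples


def warmup_step(compressor, batch, optimizer) -> float:
    """One supervised step: next-token cross-entropy on the teacher-written summary."""
    outputs = compressor(input_ids=batch["input_ids"],
                         attention_mask=batch["attention_mask"],
                         labels=batch["input_ids"])  # causal-LM loss over the full sequence
    # A real setup would typically mask the prompt tokens in the labels so the loss
    # only covers the summary; this sketch keeps things simple.
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```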
Impressive Results and Generalization
Extensive experiments on four benchmark datasets (Natural Questions, TriviaQA, HotpotQA, and 2WikiMultihopQA) demonstrate CORE's effectiveness. While compressing the retrieved context to as little as 3% of its original length, CORE avoids any performance degradation relative to using the full, uncompressed documents and improves the average Exact Match (EM) score by 3.3 points across the four datasets. On Natural Questions, for instance, CORE compresses the input to 3.6% of the original tokens while improving Exact Match by 3.2 points compared to prepending ten full documents.
Furthermore, CORE exhibits strong generalization abilities. The framework is not dependent on a specific compressor architecture, meaning different models can be used to train the compressor with similar success. Crucially, a compressor trained with CORE can also be effectively transferred to different large language models (e.g., from Qwen2.5-14B-Instruct to LLaMA-3.1-8B-Instruct) without retraining, consistently outperforming baselines that use full documents. This indicates that the summaries generated by CORE are inherently high-quality and contain the essential information needed for accurate answering.
In essence, CORE represents a significant step forward in making RAG systems more efficient and effective. By intelligently compressing retrieved information, it allows LLMs to leverage vast knowledge bases without being overwhelmed by long contexts, ultimately leading to more accurate and timely responses. For more technical details, you can refer to the full research paper here.


