Giving LLMs a 'Silent Reading' Phase for Better Reasoning

TLDR: A new paper introduces READQ and READQBUDDY, two techniques that let Large Language Models (LLMs) “read quietly” and comprehend their input before generating a response. READQ masks the training loss on the first few tokens of a sequence, so the model can internalize context without penalty, while READQBUDDY adds an auxiliary module that supplies a representation of the context throughout generation. By decoupling comprehension from response generation, much as humans read before they speak, the methods improve accuracy on several reasoning benchmarks.

Large Language Models (LLMs) are remarkably good at understanding text and producing high-quality responses. Unlike humans, however, they typically have no separate internal “reading” or thinking phase before they start generating text. Humans often read silently to understand the context and form thoughts before speaking.

A new research paper, titled “Read Quietly, Think Aloud: Decoupling Comprehension and Reasoning in LLMs,” explores methods to give LLMs a similar capacity for internal processing. The authors, Yuanxin Wang and Ganesh Venkatesh from AppliedML, Cerebras, highlight that while much recent work focuses on improving how LLMs “think aloud” (like Chain-of-Thought prompting), less attention has been given to the crucial initial step of comprehending the input.

Introducing READQ: Silent Reading for LLMs

The paper introduces a straightforward technique called Read Quietly (READQ). It modifies training by creating a “silent reading” window at the beginning of each sequence: for the first few tokens, the standard next-token prediction loss is not calculated, so the model is not penalized for its predictions there.

This approach offers two main benefits. First, it avoids training the model on high-variance, context-poor initial tokens, which are inherently difficult to predict and can lead to noisy learning. Second, and more importantly, this pressure-free window gives the model an opportunity to develop an ability to “read quietly.” It allows the LLM to build a more robust internal understanding of the context before it starts generating a response.
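The loss masking described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `readq_loss`, the `silent_window` parameter, and the choice to average the loss over the remaining positions are assumptions made for illustration, and the article does not say how long the window is.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over one list of vocabulary scores."""
    peak = max(scores)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_z for s in scores]

def readq_loss(logits, targets, silent_window):
    """Mean next-token cross-entropy that skips the first `silent_window` positions.

    logits:  one list of vocabulary scores per sequence position
    targets: the correct token id for each position
    silent_window: hypothetical hyperparameter, number of unpenalized positions
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total, counted = 0.0, 0
    for position, (scores, target) in enumerate(zip(logits, targets)):
        if position < silent_window:
            continue  # silent reading: this prediction contributes no loss
        total += -log_softmax(scores)[target]
        counted += 1
    # Assumption: average over the penalized positions only.
    return total / counted if counted else 0.0

# Toy sequence with a 2-token vocabulary; the first prediction is badly wrong.
toy_logits = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
toy_targets = [1, 1, 0]
standard = readq_loss(toy_logits, toy_targets, silent_window=0)
masked = readq_loss(toy_logits, toy_targets, silent_window=1)
```

In this toy case the masked loss is lower than the standard one, because the poorly predicted first token no longer contributes to the training signal.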

Enhancing Comprehension with READQBUDDY

While READQ primarily helps with the initial phase, long and complex inputs can still benefit from continuous understanding. To address this, the researchers propose READQBUDDY, an architectural enhancement. This involves an auxiliary “buddy” module that reads the entire input context in parallel. This “buddy” processes the information and provides a refined semantic representation of the context to the primary generation model at each step.

This ensures that the core insights gained during the initial “silent reading” phase are not lost and can inform the entire reasoning and response-generation process, from the very first token to the last. In their implementation, the “buddy” is itself an LLM, and its output embeddings are combined with the main model’s input.
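As a rough illustration of combining the buddy's output embeddings with the main model's input, the sketch below uses plain Python lists. The article does not say how the two are combined, so the element-wise weighted addition, the function name `combine_with_buddy`, and the `buddy_weight` parameter are all assumptions rather than the paper's method.

```python
def combine_with_buddy(main_inputs, buddy_outputs, buddy_weight=1.0):
    """Add the buddy's embedding to the main model's input embedding, per position.

    main_inputs:   one embedding (list of floats) per token position
    buddy_outputs: buddy embeddings with the same shape as main_inputs
    buddy_weight:  hypothetical scaling factor for the buddy's contribution
    """
    if len(main_inputs) != len(buddy_outputs):
        raise ValueError("both inputs must cover the same number of positions")
    combined = []
    for main_vec, buddy_vec in zip(main_inputs, buddy_outputs):
        if len(main_vec) != len(buddy_vec):
            raise ValueError("embedding dimensions must match")
        # Assumed combination rule: weighted element-wise sum.
        combined.append([m + buddy_weight * b for m, b in zip(main_vec, buddy_vec)])
    return combined

# Two token positions with 2-dimensional embeddings.
main_inputs = [[1.0, 2.0], [3.0, 4.0]]
buddy_outputs = [[0.5, 0.5], [1.0, -1.0]]
fused = combine_with_buddy(main_inputs, buddy_outputs, buddy_weight=0.5)
```

Other combination rules, such as concatenation followed by a projection, would fit the article's description equally well; addition is used here only because it keeps the embedding dimension unchanged.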


Promising Results Across Benchmarks

The empirical validation of READQ and READQBUDDY shows consistent performance improvements. Experiments on the Llama 3.2 3B Instruct model demonstrated significant gains across various benchmarks, including ARC Challenge, HellaSwag, OpenBookQA, PubMedQA, and Winogrande. For instance, READQ boosted accuracy on ARC Challenge from 37.2 to 45.82, and READQBUDDY further improved it to 49.06.
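To put the ARC Challenge figures in perspective, the short calculation below derives the absolute and relative gains from the three numbers quoted above. It assumes the figures are accuracies in percent and that 37.2 is the baseline for the same model; the helper `accuracy_gain` is introduced here for illustration only.

```python
def accuracy_gain(baseline, improved):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 2), round(relative, 1)

arc_baseline = 37.2     # ARC Challenge accuracy before READQ, as quoted above
arc_readq = 45.82       # with READQ
arc_readqbuddy = 49.06  # with READQBUDDY

readq_gain = accuracy_gain(arc_baseline, arc_readq)       # (8.62, 23.2)
buddy_gain = accuracy_gain(arc_baseline, arc_readqbuddy)  # (11.86, 31.9)
```

Under those assumptions, READQ adds 8.62 points (about 23% relative to the baseline) and READQBUDDY adds 11.86 points (about 32%).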

The benefits also scaled to larger models. When evaluated on a Llama 3.1 70B model trained on scientific domain data, READQ again showed performance gains. Notably, it achieved an 8-percentage-point jump in accuracy on the MedQA task, highlighting its potential in complex reasoning scenarios.

The researchers believe that combining this foundational “reading” phase with the advanced “thinking” capabilities of state-of-the-art models offers a promising path for the future of artificial intelligence.


