TLDR: HAVE (Head-Adaptive Gating and ValuE Calibration) is a new, parameter-free decoding framework designed to mitigate hallucinations in Large Language Models (LLMs). It addresses two weaknesses of attention-based evidence: attention heads are treated as static, and raw attention weights poorly reflect how much each token actually contributes. HAVE introduces head-adaptive gating to dynamically reweight attention heads and value calibration to augment attention with value-vector magnitudes, producing more accurate token-level evidence. This evidence is then fused with the LLM’s output distribution, with more weight given to the evidence when the model is uncertain. Experiments show HAVE consistently reduces hallucinations and outperforms strong baselines across multiple QA benchmarks and LLM families, all without finetuning.
Large Language Models (LLMs) have become incredibly powerful, but they often suffer from a significant problem: hallucinations. They can generate information that sounds plausible but isn’t factual, especially when drawing on external sources or handling long texts. Part of the difficulty is that methods reading the model’s attention for evidence typically treat every attention head as equally important, and raw attention weights don’t accurately reflect how much each piece of context truly contributes to the output.
Introducing HAVE: A New Approach to Combat Hallucinations
A new research paper introduces HAVE, which stands for Head-Adaptive Gating and ValuE Calibration. This is a clever, parameter-free framework designed to directly tackle the problem of hallucinations. The beauty of HAVE is that it doesn’t require any additional training or fine-tuning of the LLM, and it works efficiently in a single pass, making it easy to integrate with existing models.
How HAVE Works Its Magic
HAVE operates through two main components:
Head-Adaptive Gating (HAG): Imagine an LLM’s attention mechanism as having many different ‘heads,’ each focusing on different aspects of the input. Traditionally, these heads might be given a fixed level of importance. HAG changes this by dynamically adjusting the importance of each attention head based on the specific input it’s processing. This means that heads that are more relevant to the current context get more weight, while noisy or less relevant ones are downplayed, but never completely ignored, ensuring stability.
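The paper’s exact gating function isn’t spelled out in this summary, but the idea can be pictured in a few lines of PyTorch. In this sketch, the per-head relevance score is a placeholder assumption (attention entropy, so sharply focused heads score higher), and `floor` and `tau` are illustrative knobs rather than values from the paper:

```python
import torch

def head_adaptive_gates(attn: torch.Tensor, floor: float = 0.05, tau: float = 1.0) -> torch.Tensor:
    """Compute input-dependent gates over attention heads.

    attn: (num_heads, seq_len) attention of the current decoding step
          over the context tokens, one row per head.
    """
    probs = attn.clamp_min(1e-9)
    entropy = -(probs * probs.log()).sum(dim=-1)   # (num_heads,) per-head entropy
    gates = torch.softmax(-entropy / tau, dim=0)   # focused heads get larger gates
    # Mix with a uniform floor so noisy heads are down-weighted
    # but never fully zeroed out, keeping the signal stable.
    return (1.0 - floor) * gates + floor / gates.numel()
```

The resulting gates would then weight each head’s attention map when the token-level evidence is aggregated across heads.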
Value Calibration (VC): The raw attention weights in an LLM don’t always tell the full story of how much a piece of information truly influences the model’s output. VC addresses this by enhancing attention signals with information from ‘value vectors,’ which are crucial for how the model updates its internal state. It also helps to filter out ‘sink tokens’—like special characters or whitespace—that might otherwise disproportionately influence the attention. By doing this, VC creates a more accurate picture of which tokens are genuinely contributing to the model’s predictions.
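Again as a hedged sketch rather than the paper’s implementation: calibration can be pictured as multiplying each token’s attention weight by the norm of its value vector, then masking out sink tokens. How sinks are detected is an assumption here; the sketch simply takes a precomputed mask:

```python
import torch

def value_calibrated_evidence(attn: torch.Tensor,
                              values: torch.Tensor,
                              sink_mask: torch.Tensor | None = None) -> torch.Tensor:
    """Calibrate attention with value-vector magnitudes.

    attn:      (num_heads, seq_len) attention over context tokens.
    values:    (num_heads, seq_len, head_dim) per-token value vectors.
    sink_mask: (seq_len,) bool, True at sink tokens (e.g. BOS, punctuation,
               whitespace).
    """
    # A token's real influence scales with both its attention weight and
    # the magnitude of the value vector it injects into the residual stream.
    contribution = attn * values.norm(dim=-1)        # (num_heads, seq_len)
    if sink_mask is not None:
        contribution = contribution.masked_fill(sink_mask, 0.0)
    # Renormalize each head so the calibrated weights form a distribution.
    return contribution / contribution.sum(dim=-1, keepdim=True).clamp_min(1e-9)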
These two modules work together to build a reliable ‘evidence’ signal from the model’s internal workings. This evidence is then blended with the LLM’s own prediction distribution using an uncertainty-aware policy: the more uncertain the model is, the stronger the influence of HAVE’s evidence, guiding generation towards more factual outputs.
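Here is a minimal sketch of such an uncertainty-gated fusion, assuming normalized entropy as the uncertainty measure and a precomputed projection of the token-level evidence onto the vocabulary (how that projection is built is assumed, not taken from the paper):

```python
import torch

def fuse_with_evidence(model_logits: torch.Tensor,
                       evidence_probs: torch.Tensor) -> torch.Tensor:
    """Blend the model's next-token distribution with the evidence signal.

    model_logits:   (vocab_size,) raw logits for the next token.
    evidence_probs: (vocab_size,) evidence mass projected onto the vocabulary
                    (e.g. spread over context tokens the model could copy).
    """
    p_model = torch.softmax(model_logits, dim=-1)
    # Normalized entropy in [0, 1]: 0 = fully confident, 1 = uniform.
    entropy = -(p_model * p_model.clamp_min(1e-9).log()).sum()
    alpha = entropy / torch.log(torch.tensor(float(p_model.numel())))
    # The more uncertain the model, the more the evidence steers the output.
    fused = (1.0 - alpha) * p_model + alpha * evidence_probs
    return fused / fused.sum()
```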
Impressive Results Across the Board
The researchers put HAVE to the test across various question-answering benchmarks, including HotpotQA, SearchQA, SQuAD, Natural Questions, and NQ-Swap, using popular LLMs like LLaMA2-7B-Chat, LLaMA2-13B-Chat, and Mistral-7B-Instruct. The results were consistently positive. HAVE significantly reduced hallucinations and outperformed other strong methods, including DAGCD, in most scenarios. It showed particular strength in tasks requiring complex reasoning over long and diverse contexts, and even demonstrated robustness when faced with conflicting information.
An in-depth ablation confirmed that both Head-Adaptive Gating and Value Calibration are essential and complementary. The framework also proved robust across different settings of its few hyperparameters (it learns no parameters of its own), making it practical for real-world use.
A Step Towards More Trustworthy LLMs
HAVE represents a significant advancement in making LLMs more reliable and trustworthy. By offering a transparent, reproducible, and adaptable solution that doesn’t require extensive retraining, it paves the way for more accurate and factual generation in applications like retrieval-augmented systems and long-context understanding. For more details, you can read the full research paper here.


