TLDR: A new research paper introduces a reinforcement learning framework that uses an “Entity Hallucination Index” (EHI) to reduce factual errors in AI-generated summaries. EHI quantifies the correctness and grounding of named entities, allowing models to be fine-tuned without human annotations. Experiments show this method significantly reduces entity-level hallucinations, improving summary reliability and factual accuracy.
Abstractive summarization models, powered by large language models (LLMs), have achieved impressive results in various fields. However, a persistent challenge known as “hallucination” remains: generated summaries can include incorrect or fabricated information that is not present in the source input. Such inaccuracies, especially those involving named entities, can significantly undermine the trustworthiness and utility of summaries in critical applications like meeting summarization, medical reporting, or financial documentation.
Existing methods for detecting hallucinations often rely on coarse-grained factuality metrics or require reference summaries, which limits their scalability. While some recent efforts have explored lightweight automatic metrics, directly integrating these evaluations into model training remains largely unexplored.
A new research paper, titled “Reducing Hallucinations in Summarization via Reinforcement Learning with Entity Hallucination Index”, introduces a novel approach to tackle this problem. The authors, Praveenkumar Katwe, Rakesh Chandra Balabantaray, and Kali Prasad Vittala, propose a reward-driven fine-tuning framework that explicitly optimizes for an “Entity Hallucination Index” (EHI).
Understanding the Entity Hallucination Index (EHI)
The EHI is a metric designed to quantify the presence, correctness, and grounding of named entities within generated summaries. Unlike traditional metrics, EHI does not rely on human-written factuality annotations, making the fine-tuning process scalable. The index is formulated to reward desirable behaviors and penalize undesirable ones:
- Positive Hallucination (PH): Measures newly introduced entities that are factually correct and beneficial.
- Extractiveness Factor (EF): Measures entities accurately extracted from the input document into the summary.
- Negative Hallucination (NH): Captures hallucinated entities that are incorrect or not grounded in the input.
- Overfocused Relations (OF): Penalizes summaries that overly focus on a narrow subset of entities, missing diversity.
- Lost Focus (LF): Penalizes summaries that omit important entities present in the input.
A higher EHI score indicates better entity faithfulness and fewer harmful hallucinations. The paper clarifies that, although the name might suggest otherwise, EHI functions as a precision-weighted reward: the higher the score, the more helpful the entity alignment.
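The set bookkeeping behind these components can be illustrated with a minimal Python sketch. It is not the paper's formula, which this article does not reproduce. The function name `entity_overlap` and the bucket names are hypothetical, and the sketch only separates grounded, ungrounded, and omitted entities. Telling Positive from Negative Hallucination would need an additional correctness check, and Overfocused Relations would need relation information, neither of which is modeled here.

```python
def entity_overlap(source_entities, summary_entities):
    """Partition entities by case-insensitive string match.

    Hypothetical helper: the buckets only approximate the inputs to
    EF, PH/NH, and LF; they are not the paper's definitions.
    """
    src = {e.strip().lower() for e in source_entities if e.strip()}
    summ = {e.strip().lower() for e in summary_entities if e.strip()}
    return {
        "grounded": summ & src,    # in both: input to an EF-like term
        "ungrounded": summ - src,  # summary only: candidates for PH or NH
        "omitted": src - summ,     # source only: candidates for LF
    }


# Made-up entity lists, for illustration only.
source = ["Acme Corp", "Alice", "Q3 budget"]
summary = ["alice", "Acme Corp", "Bob"]
parts = entity_overlap(source, summary)
print(parts)
```

In this toy input, "Bob" lands in the ungrounded bucket and "Q3 budget" in the omitted bucket; how each bucket is weighted in the final index is left to the paper.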
The Fine-Tuning Approach
The methodology involves several steps. First, baseline summaries are generated using a pre-trained language model, such as Flan-T5-Large. Then, EHI scores are computed via automatic entity extraction and matching. Finally, reinforcement learning is applied to fine-tune the model parameters, using the EHI as a direct reward signal. This process biases the model toward generating summaries that are more faithful to the entities in the original text.
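To make the idea of "a metric as a direct reward signal" concrete, here is a toy policy-gradient sketch in Python. It rests on several assumptions: the article does not name the RL algorithm, so REINFORCE with a moving-average baseline is swapped in for illustration; the policy is a softmax over three fixed candidate summaries rather than a language model such as Flan-T5-Large; and the reward values are made-up stand-ins for EHI scores.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(rewards, steps=2000, lr=0.5, seed=0):
    """Toy REINFORCE over a fixed set of candidate summaries.

    rewards[i] is a stand-in for the EHI-style reward of candidate i.
    Returns the final selection probabilities of the policy.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0  # moving average of observed rewards
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # d/d logit_i of log pi(action) is one_hot(action)_i - probs_i
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
    return softmax(logits)


# Hypothetical rewards: candidate 1 is the most entity-faithful summary.
probs = reinforce([0.2, 0.9, 0.4])
print(probs)
```

The policy shifts probability mass toward the candidate with the highest reward, which is the same bias the article describes at the scale of a full summarization model.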
The researchers used meeting transcript datasets for their experiments, which included multi-turn conversational dialogues and abstractive gold summaries. Entity extraction was performed using a named entity recognition (NER) model from spaCy, with case-insensitive matching at the entity string level.
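A minimal sketch of this extraction step might look like the following Python. The helper name `extract_entities` is hypothetical, and the model name in the commented usage is an assumption, since the article does not say which spaCy model was used. The function relies only on the `doc.ents` and `ent.text` attributes that a spaCy pipeline exposes.

```python
def extract_entities(text, nlp):
    """Return the set of lower-cased entity strings found in `text`.

    `nlp` is any callable returning an object with an `.ents` iterable
    whose items expose `.text`, which is the shape of a spaCy pipeline.
    Lower-casing gives case-insensitive matching at the string level.
    """
    doc = nlp(text)
    return {ent.text.strip().lower() for ent in doc.ents if ent.text.strip()}


# Usage with spaCy (requires the package and a downloaded model):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")  # model choice is an assumption
#   extract_entities("Alice from Acme Corp joined the call.", nlp)
```

Passing the pipeline in as an argument keeps the helper independent of any particular model, so a different NER backend could be substituted without changing the matching logic.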
Key Findings and Improvements
Experiments demonstrated consistent improvements in EHI across datasets. Qualitative analysis revealed a significant reduction in entity-level hallucinations without degrading the fluency or informativeness of the summaries. Before fine-tuning, EHI scores were volatile and often low, indicating frequent hallucinations. After fine-tuning, EHI scores became more consistent, largely stabilizing between 0.3 and 0.6, suggesting improved control over hallucinated entities.
Entity F1 scores, which combine the precision and recall of entity prediction, also improved markedly. Initial F1 scores were often below 0.5, but fine-tuning led to many samples achieving values close to 1.0, reflecting high accuracy and consistency in entity prediction. The study also observed a stronger inverse correlation between EHI and Entity F1 after fine-tuning, which it interprets as entity prediction accuracy rising as hallucinations fall.
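For readers unfamiliar with the metric, an entity-level F1 can be sketched in Python as below. This is a generic set-based definition under stated assumptions: the article does not say whether the reference entities come from the source transcript or the gold summary, so the parameter name `reference_entities` is deliberately neutral, and the example lists are made up.

```python
def entity_f1(reference_entities, predicted_entities):
    """Set-based precision, recall, and F1 with case-insensitive matching."""
    ref = {e.strip().lower() for e in reference_entities if e.strip()}
    pred = {e.strip().lower() for e in predicted_entities if e.strip()}
    matched = len(ref & pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Two of three predicted entities match two of three reference entities.
p, r, f1 = entity_f1(
    ["Alice", "Acme Corp", "Q3 budget"],
    ["alice", "Acme Corp", "Bob"],
)
print(p, r, f1)
```

Here precision and recall are both 2/3, so F1 is also 2/3; a summary that reproduced exactly the reference entities would score 1.0.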
The fine-tuned models showed improved entity grounding, with summaries better aligning with the input and correctly preserving mentioned organizations, speaker names, and events. Entity mentions became more precise and contextually appropriate. The hallucination behavior, which was previously erratic, stabilized substantially after training with EHI rewards.
Limitations and Future Work
Despite the significant improvements, occasional errors persisted, especially for rare or ambiguous entity mentions. In some cases, the model overly prioritized exact entity copying, potentially at the expense of paraphrasing or abstraction quality. The authors note that EHI currently lacks a mechanism for detecting and handling “Relation Hallucination,” where relationships between entities might be incorrect. This suggests a trade-off between strict entity grounding and higher-level semantic fluency that warrants further investigation.
The researchers have released a reproducible Colab pipeline to facilitate further research on hallucination-aware model fine-tuning using lightweight metrics like EHI. Further details are available in the full paper.
In future work, the team plans to extend EHI-based fine-tuning to multi-document summarization and explore its integration with controllable generation frameworks, further enhancing the reliability of AI-generated content.


