TLDR: SI-FACT is a novel framework that addresses knowledge conflict in Large Language Models (LLMs), where models often prioritize internal knowledge over provided context, leading to unfaithful responses. It uses a self-instruct mechanism for LLMs to automatically generate high-quality contrastive learning data (faithful and unfaithful examples). Through contrastive tuning, SI-FACT trains the model to distinguish and prioritize contextual information. Experiments show it significantly improves contextual recall and reduces reliance on internal memory with high data efficiency and minimal impact on general capabilities, offering a practical way to build more trustworthy LLMs.
Large Language Models (LLMs) have become incredibly powerful tools, driving innovation in many knowledge-intensive tasks. However, their application in critical areas like financial decision-making or medical diagnosis faces a significant hurdle: unfaithful generation. This occurs when an LLM provides information that contradicts the context it was given, often preferring its own internal, pre-trained knowledge over new, immediate information. This “knowledge conflict,” particularly the clash between the provided context and the model’s parametric memory, can have serious consequences: the model behaves stubbornly, failing to adapt dynamically to external knowledge.
Existing methods to tackle this problem have limitations. Inference-time interventions, such as prompt engineering, offer temporary fixes but don’t fundamentally address the model’s inherent biases. Traditional supervised fine-tuning, while capable of correcting unfaithful generation, is expensive, prone to overfitting, and can even lead to “catastrophic forgetting” of previously learned knowledge.
To overcome these challenges, a new training framework called Self-Improving Faithfulness-Aware Contrastive Tuning, or SI-FACT, has been proposed. Instead of viewing the LLM as a passive recipient needing external correction, SI-FACT reshapes it into an active, self-improving learner. The core idea is to enable the model to autonomously generate its own high-quality training data to enhance its contextual faithfulness, significantly reducing the need for costly manual annotation.
How SI-FACT Works: A Self-Improvement Loop
The SI-FACT framework operates through a self-improvement loop where the LLM acts as both a teacher and a student. This process translates the abstract concept of “faithfulness to context” into concrete signals that the model can learn and optimize within its internal representation space.
The process begins with Anchor Selection, where raw data triplets (Context, Question, Answer) are extracted from standard QA datasets like SQuAD. These serve as the foundation for learning.
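As a rough illustration, anchor selection could be as simple as sampling triplets from SQuAD via the Hugging Face datasets library; the sample size and helper name below are illustrative, not taken from the paper:

```python
from datasets import load_dataset

def select_anchors(n=1000, seed=42):
    """Sample (Context, Question, Answer) anchor triplets from SQuAD."""
    squad = load_dataset("squad", split="train").shuffle(seed=seed)
    return [
        {
            "context": row["context"],
            "question": row["question"],
            "answer": row["answers"]["text"][0],  # the golden answer span
        }
        for row in squad.select(range(n))
    ]
```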
Next is the Self-Instruct Data Generation Engine, the driving force behind SI-FACT. Using the base LLM itself as a teacher, specially designed prompts are used to automatically create a contrastive set for each anchor. This set includes:
- Positive Samples: Rewritten versions of the golden answer that maintain factual accuracy but use different wording, helping the model learn semantic robustness.
- Negative Samples: These are crucial and are designed to simulate unfaithful scenarios. There are three types:
  - Answers injected with external, unmentioned information (simulating hallucination).
  - Answers that directly conflict with the provided context.
  - Irrelevant answers that, while potentially based on some context, do not directly address the question.
By generating these structured negative samples, the model is forced to learn fine-grained discriminative abilities.
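The article does not reproduce the paper’s actual prompts, but the generation engine can be pictured roughly as follows; the template wording is invented for illustration, and `llm_generate` stands in for whatever decoding call the base model exposes:

```python
# Illustrative prompt templates; the paper's exact wording is not shown in
# this article. `llm_generate` is a placeholder for the base model's own
# decoding call (the "teacher" role).
TEMPLATE = ("Context: {context}\nQuestion: {question}\n"
            "Original answer: {answer}\nInstruction: {instruction}\nOutput:")

INSTRUCTIONS = {
    "positive": ("Rewrite the answer with different wording while keeping it "
                 "factually faithful to the context."),
    "hallucination": ("Rewrite the answer, injecting a plausible detail that "
                      "the context never mentions."),
    "conflict": "Rewrite the answer so that it directly contradicts the context.",
    "irrelevant": ("Write an answer loosely grounded in the context that does "
                   "not actually address the question."),
}

def build_contrastive_set(llm_generate, anchor):
    """Expand one anchor triplet into a contrastive set using the model itself."""
    out = {kind: llm_generate(TEMPLATE.format(**anchor, instruction=instr))
           for kind, instr in INSTRUCTIONS.items()}
    return {"anchor": anchor,
            "positive": out.pop("positive"),
            "negatives": out}  # the three structured negative types
```

One design point worth noting: all four candidates share the same anchor, so the negatives differ from the positive only in their faithfulness, giving the model a clean contrastive signal.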
Following data generation, Contrastive Learning takes place. Here, the LLM becomes the student, trained using the self-generated contrastive dataset. The objective is to pull the representations of faithful responses (anchors and positive samples) closer together in the model’s internal space, while simultaneously pushing the representations of unfaithful responses (negative samples) farther apart. This is achieved using an InfoNCE loss function, which helps the model learn to distinguish between faithful and unfaithful answers at a fundamental level.
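Concretely, a minimal InfoNCE sketch over pooled answer representations might look like this (PyTorch; the temperature and pooling scheme are standard defaults, not the paper’s reported hyperparameters):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE over answer representations.

    anchor, positive: (d,) pooled embeddings of the faithful answers.
    negatives:        (k, d) embeddings of the k unfaithful answers.
    """
    anchor = F.normalize(anchor, dim=-1)
    candidates = F.normalize(
        torch.vstack([positive.unsqueeze(0), negatives]), dim=-1)
    logits = candidates @ anchor / temperature   # (1 + k,) scaled cosine sims
    target = torch.zeros(1, dtype=torch.long)    # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)
```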
Finally, through Capability Internalization, the model integrates this enhanced contextual faithfulness, completing one cycle of self-improvement. In theory, an improved model can then generate even higher-quality data for subsequent cycles, creating a virtuous learning loop.
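Tying the earlier sketches together, the outer loop is conceptually just generate-then-tune, repeated. Here `contrastive_tune` is a hypothetical training wrapper, not an API from the paper, and `select_anchors`, `build_contrastive_set`, and `info_nce_loss` refer to the sketches above:

```python
def self_improve(model, rounds=2, n_anchors=1000):
    """Sketch of SI-FACT's outer loop: the model teaches, then studies."""
    for _ in range(rounds):
        anchors = select_anchors(n_anchors)                # anchor selection
        data = [build_contrastive_set(model.generate, a)   # self-instruct step
                for a in anchors]
        model = contrastive_tune(model, data,              # hypothetical trainer
                                 loss_fn=info_nce_loss)    # contrastive tuning
    return model
```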
Impressive Results and High Efficiency
Experiments conducted on challenging knowledge conflict benchmarks, ECARE_KRE and COSE_KRE, demonstrated SI-FACT’s superior performance. The framework, based on Llama3-8B-Instruct, significantly improved the Contextual Recall Rate (CRR) – a measure of how often the model’s answers align with the given context. On the ECARE_KRE dataset, SI-FACT achieved a CRR of 75.97%, surpassing the best baseline by over 6.2%. It also achieved the lowest Parametric Recall Rate (PRR), indicating a strong suppression of the model’s tendency to rely on conflicting internal knowledge.
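The article describes CRR and PRR only informally. Under the simplifying assumption that each benchmark item carries both a context-supported answer and a conflicting memorized answer, the two rates could be scored roughly as:

```python
def crr_prr(predictions, context_answers, memory_answers):
    """Contextual vs. Parametric Recall Rate over a benchmark.

    Substring matching is a simplification; the benchmarks may score
    agreement differently.
    """
    n = len(predictions)
    crr = sum(c.lower() in p.lower()
              for p, c in zip(predictions, context_answers)) / n
    prr = sum(m.lower() in p.lower()
              for p, m in zip(predictions, memory_answers)) / n
    return crr, prr
```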
A key finding was SI-FACT’s remarkable data efficiency. The model’s performance peaked with only 1000 self-generated training samples, highlighting that targeted, actively generated data is far more effective than random or large-scale but less focused datasets. This makes SI-FACT a practical and scalable solution for enhancing LLM reliability even under resource constraints.
Furthermore, the SI-FACT framework was shown to largely preserve the model’s general capabilities across benchmarks for knowledge question answering, mathematical reasoning, commonsense reasoning, and complex question answering. Any minor performance trade-offs were deemed acceptable, suggesting a targeted optimization without significantly damaging the model’s existing knowledge system.
A Path Towards Trustworthy LLMs
The SI-FACT framework offers a promising solution to the critical problem of knowledge conflict in LLMs. By enabling models to self-generate high-quality contrastive data and learn to distinguish faithful from unfaithful responses, it significantly enhances contextual faithfulness. This approach provides a practical and scalable pathway toward building more proactive and trustworthy language models, especially for high-stakes applications. Future research aims to apply SI-FACT to larger models, integrate it with Retrieval-Augmented Generation (RAG) systems, and extend the self-improvement paradigm to other crucial capabilities like safety and bias mitigation. You can read the full research paper here: SI-FACT: Mitigating Knowledge Conflict via Self-Improving Faithfulness-Aware Contrastive Tuning.