TLDR: SI-FACT is a novel framework that addresses knowledge conflict in Large Language Models (LLMs), where models often prioritize internal knowledge over provided context, leading to unfaithful responses. It uses a self-instruct mechanism for LLMs to automatically generate high-quality contrastive learning data (faithful and unfaithful examples). Through contrastive tuning, SI-FACT trains the model to distinguish and prioritize contextual information. Experiments show it significantly improves contextual recall and reduces reliance on internal memory with high data efficiency and minimal impact on general capabilities, offering a practical way to build more trustworthy LLMs.
Large Language Models (LLMs) have become incredibly powerful tools, driving innovation in many knowledge-intensive tasks. However, their application in critical areas like financial decision-making or medical diagnosis faces a significant hurdle: unfaithful generation. This occurs when an LLM provides information that contradicts the context it was given, often preferring its own internal, pre-trained knowledge over new, immediate information. This “knowledge conflict,” particularly the clash between the provided context and the model’s parametric memory, can have serious consequences: the model behaves stubbornly, failing to adapt dynamically to external knowledge.
Existing methods to tackle this problem have limitations. Inference-time interventions, such as prompt engineering, offer temporary fixes but don’t fundamentally address the model’s inherent biases. Traditional supervised fine-tuning, while capable of correcting unfaithful generation, is expensive, prone to overfitting, and can even lead to “catastrophic forgetting” of previously learned knowledge.
To overcome these challenges, a new training framework called Self-Improving Faithfulness-Aware Contrastive Tuning, or SI-FACT, has been proposed. Instead of viewing the LLM as a passive recipient needing external correction, SI-FACT reshapes it into an active, self-improving learner. The core idea is to enable the model to autonomously generate its own high-quality training data to enhance its contextual faithfulness, significantly reducing the need for costly manual annotation.
How SI-FACT Works: A Self-Improvement Loop
The SI-FACT framework operates through a self-improvement loop where the LLM acts as both a teacher and a student. This process translates the abstract concept of “faithfulness to context” into concrete signals that the model can learn and optimize within its internal representation space.
The process begins with Anchor Selection, where raw data triplets (Context, Question, Answer) are extracted from standard QA datasets like SQuAD. These serve as the foundation for learning.
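As a rough illustration, anchor selection could be as simple as sampling triplets from SQuAD via the Hugging Face datasets library; the sample size and helper name below are illustrative, not taken from the paper:

```python
from datasets import load_dataset

def select_anchors(n=1000, seed=42):
    """Sample (Context, Question, Answer) anchor triplets from SQuAD."""
    squad = load_dataset("squad", split="train").shuffle(seed=seed)
    return [
        {
            "context": row["context"],
            "question": row["question"],
            "answer": row["answers"]["text"][0],  # the golden answer span
        }
        for row in squad.select(range(n))
    ]
```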
Next is the Self-Instruct Data Generation Engine, the driving force behind SI-FACT. Using the base LLM itself as a teacher, specially designed prompts are used to automatically create a contrastive set for each anchor. This set includes:
- Positive Samples: Rewritten versions of the golden answer that maintain factual accuracy but use different wording, helping the model learn semantic robustness.
- Negative Samples: These are crucial and are designed to simulate unfaithful scenarios. There are three types:
  - Answers injected with external, unmentioned information (simulating hallucination).
  - Answers that directly conflict with the provided context.
  - Irrelevant answers that, while potentially based on some context, do not directly address the question.
By generating these structured negative samples, the model is forced to learn fine-grained discriminative abilities.
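The article does not reproduce the paper’s actual prompts, but the generation engine can be pictured roughly as follows; the template wording is invented for illustration, and `llm_generate` stands in for whatever decoding call the base model exposes:

```python
# Illustrative prompt templates; the paper's exact wording is not shown in
# this article. `llm_generate` is a placeholder for the base model's own
# decoding call (the "teacher" role).
TEMPLATE = ("Context: {context}\nQuestion: {question}\n"
            "Original answer: {answer}\nInstruction: {instruction}\nOutput:")

INSTRUCTIONS = {
    "positive": ("Rewrite the answer with different wording while keeping it "
                 "factually faithful to the context."),
    "hallucination": ("Rewrite the answer, injecting a plausible detail that "
                      "the context never mentions."),
    "conflict": "Rewrite the answer so that it directly contradicts the context.",
    "irrelevant": ("Write an answer loosely grounded in the context that does "
                   "not actually address the question."),
}

def build_contrastive_set(llm_generate, anchor):
    """Expand one anchor triplet into a contrastive set using the model itself."""
    out = {kind: llm_generate(TEMPLATE.format(**anchor, instruction=instr))
           for kind, instr in INSTRUCTIONS.items()}
    return {"anchor": anchor,
            "positive": out.pop("positive"),
            "negatives": out}  # the three structured negative types
```

One design point worth noting: all four candidates share the same anchor, so the negatives differ from the positive only in their faithfulness, giving the model a clean contrastive signal.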
Following data generation, Contrastive Learning takes place. Here, the LLM becomes the student, trained using the self-generated contrastive dataset. The objective is to pull the representations of faithful responses (anchors and positive samples) closer together in the model’s internal space, while simultaneously pushing the representations of unfaithful responses (negative samples) farther apart. This is achieved using an InfoNCE loss function, which helps the model learn to distinguish between faithful and unfaithful answers at a fundamental level.
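Concretely, a minimal InfoNCE sketch over pooled answer representations might look like this (PyTorch; the temperature and pooling scheme are standard defaults, not the paper’s reported hyperparameters):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE over answer representations.

    anchor, positive: (d,) pooled embeddings of the faithful answers.
    negatives:        (k, d) embeddings of the k unfaithful answers.
    """
    anchor = F.normalize(anchor, dim=-1)
    candidates = F.normalize(
        torch.vstack([positive.unsqueeze(0), negatives]), dim=-1)
    logits = candidates @ anchor / temperature   # (1 + k,) scaled cosine sims
    target = torch.zeros(1, dtype=torch.long)    # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)
```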
Finally, through Capability Internalization, the model integrates this enhanced contextual faithfulness, completing one cycle of self-improvement. In theory, an improved model can then generate even higher-quality data for subsequent cycles, creating a virtuous learning loop.
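Tying the earlier sketches together, the outer loop is conceptually just generate-then-tune, repeated. Here `contrastive_tune` is a hypothetical training wrapper, not an API from the paper, and `select_anchors`, `build_contrastive_set`, and `info_nce_loss` refer to the sketches above:

```python
def self_improve(model, rounds=2, n_anchors=1000):
    """Sketch of SI-FACT's outer loop: the model teaches, then studies."""
    for _ in range(rounds):
        anchors = select_anchors(n_anchors)                # anchor selection
        data = [build_contrastive_set(model.generate, a)   # self-instruct step
                for a in anchors]
        model = contrastive_tune(model, data,              # hypothetical trainer
                                 loss_fn=info_nce_loss)    # contrastive tuning
    return model
```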
Impressive Results and High Efficiency
Experiments conducted on challenging knowledge conflict benchmarks, ECARE_KRE and COSE_KRE, demonstrated SI-FACT’s superior performance. The framework, based on Llama3-8B-Instruct, significantly improved the Contextual Recall Rate (CRR) – a measure of how often the model’s answers align with the given context. On the ECARE_KRE dataset, SI-FACT achieved a CRR of 75.97%, surpassing the best baseline by over 6.2%. It also achieved the lowest Parametric Recall Rate (PRR), indicating a strong suppression of the model’s tendency to rely on conflicting internal knowledge.
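The article describes CRR and PRR only informally. Under the simplifying assumption that each benchmark item carries both a context-supported answer and a conflicting memorized answer, the two rates could be scored roughly as:

```python
def crr_prr(predictions, context_answers, memory_answers):
    """Contextual vs. Parametric Recall Rate over a benchmark.

    Substring matching is a simplification; the benchmarks may score
    agreement differently.
    """
    n = len(predictions)
    crr = sum(c.lower() in p.lower()
              for p, c in zip(predictions, context_answers)) / n
    prr = sum(m.lower() in p.lower()
              for p, m in zip(predictions, memory_answers)) / n
    return crr, prr
```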
A key finding was SI-FACT’s remarkable data efficiency. The model’s performance peaked with only 1000 self-generated training samples, highlighting that targeted, actively generated data is far more effective than random or large-scale but less focused datasets. This makes SI-FACT a practical and scalable solution for enhancing LLM reliability even under resource constraints.
Furthermore, the SI-FACT framework was shown to largely preserve the model’s general capabilities across benchmarks for knowledge question answering, mathematical reasoning, commonsense reasoning, and complex question answering. Any minor performance trade-offs were deemed acceptable, suggesting a targeted optimization without significantly damaging the model’s existing knowledge system.
A Path Towards Trustworthy LLMs
The SI-FACT framework offers a promising solution to the critical problem of knowledge conflict in LLMs. By enabling models to self-generate high-quality contrastive data and learn to distinguish faithful from unfaithful responses, it significantly enhances contextual faithfulness. This approach provides a practical and scalable pathway toward building more proactive and trustworthy language models, especially for high-stakes applications. Future research aims to apply SI-FACT to larger models, integrate it with Retrieval-Augmented Generation (RAG) systems, and extend the self-improvement paradigm to other crucial capabilities like safety and bias mitigation. You can read the full research paper here: SI-FACT: Mitigating Knowledge Conflict via Self-Improving Faithfulness-Aware Contrastive Tuning.