TL;DR: EAGLE-PC is a novel framework for machine unlearning in Large Language Models (LLMs) that addresses the problem of ‘excessive forgetting’ (over-forgetting). It uses two main components: Entanglement-Awareness Guided Loss Reweighting (EAGLE) to adaptively adjust forgetting effort based on how entangled forget data is with retain data, and a Proxy Constraint that uses In-Context Learning (ICL)-generated test data to softly regularize the unlearning process, preventing models from losing too much useful knowledge. The framework is compatible with existing unlearning methods and shows significant improvements in balancing forgetting quality and model utility on benchmarks like TOFU and MUSE.
Large language models, or LLMs, are incredibly powerful, capable of absorbing and retaining vast amounts of information from the internet. While this memorization is key to their intelligence, it also brings significant concerns, particularly regarding privacy and data ownership. Imagine an LLM trained on data that includes your personal information or copyrighted material. As data owners increasingly request the removal of their data from these models, a field called ‘machine unlearning’ has emerged as a practical solution.
Machine unlearning aims to remove the influence of specific data from a trained model without the costly and time-consuming process of retraining the entire model from scratch. However, existing unlearning methods often face a critical challenge: ‘excessive forgetting,’ also known as over-forgetting. This happens when the model not only forgets the targeted information but also inadvertently loses useful knowledge, leading to degraded performance on other tasks or even weakening the model’s safety guidelines. Some data might be ‘under-forgotten,’ leaving residual privacy risks, while other, unrelated data might be ‘over-forgotten,’ harming the model’s overall utility.
The core problem stems from two main factors: the diverse nature of the data to be forgotten (some are more deeply memorized or entangled with other knowledge than others), and the lack of clear stopping points in the unlearning process. Current methods often treat all data to be forgotten uniformly or rely on metrics that don’t fully capture the complex interactions between the data to be forgotten and the data to be retained.
To address these limitations, researchers have proposed a novel unlearning framework called EAGLE-PC, which stands for Entanglement-Awareness Guided Loss Reweighting with Proxy Constraint. This framework introduces two key components designed to make unlearning more precise and effective.
Entanglement-Awareness Guided Loss Reweighting (EAGLE)
The first component, EAGLE, tackles the problem of heterogeneous memorization. Instead of assuming all pieces of information to be forgotten are equally difficult to erase, EAGLE measures how ‘entangled’ each piece of forget data is with the data the model is supposed to retain. It does this by calculating the similarity between the embedding (a numerical representation) of a forget sample and the aggregated embedding of the entire retain dataset. Think of it like this: if a piece of information you want to forget is very similar to many things you still want to remember, the model needs to be more careful when erasing it to avoid collateral damage. EAGLE dynamically adjusts the ‘forgetting effort’ for each sample based on this entanglement. Less entangled samples, which are more unique and less connected to the retained knowledge, can be forgotten more aggressively. This approach is computationally efficient because it only requires the average embedding of the retain dataset, not access to every single retain sample.
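To make this concrete, here is a minimal sketch of entanglement-aware reweighting. The cosine-similarity-to-mean-retain-embedding idea follows the description above, but the exact mapping from similarity to weight (an exponential here, with a `temperature` parameter) and the normalization are illustrative assumptions, not the paper's formula:

```python
import numpy as np

def entanglement_weights(forget_embeddings, retain_mean_embedding, temperature=1.0):
    """Map each forget sample's similarity to the aggregated retain embedding
    into a per-sample forgetting weight.

    Illustrative sketch: the exponential mapping and normalization are
    assumptions, not the paper's exact formulation.
    """
    # Cosine similarity between each forget embedding and the mean retain embedding
    f = forget_embeddings / np.linalg.norm(forget_embeddings, axis=1, keepdims=True)
    r = retain_mean_embedding / np.linalg.norm(retain_mean_embedding)
    sims = f @ r  # in [-1, 1]; higher = more entangled with retained knowledge

    # Less entangled samples get larger weights, so they are forgotten more aggressively
    weights = np.exp(-sims / temperature)
    return weights / weights.mean()  # keep the average forgetting effort unchanged

# Usage: a forget sample aligned with the retain centroid gets a smaller weight
forget_emb = np.array([[1.0, 0.0],   # highly entangled with retain data
                       [0.0, 1.0]])  # orthogonal to retain data
retain_mean = np.array([1.0, 0.0])
w = entanglement_weights(forget_emb, retain_mean)
```

Note that the only statistic of the retain set used here is its mean embedding, which is what makes the approach cheap relative to methods needing per-sample retain access.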
Proxy Constraint
The second component, the Proxy Constraint, addresses the issue of unbounded unlearning and prevents over-forgetting. Gradient-based unlearning methods, while effective, often lack a natural stopping point, meaning they can keep ‘forgetting’ indefinitely, potentially causing the model’s predictions to diverge excessively. The Proxy Constraint introduces a soft regularization mechanism using ‘proxy data’ generated through In-Context Learning (ICL) with another large language model. These proxy samples simulate how the model would naturally respond to the forget data if it had never been exposed to it in the first place. By comparing the model’s current forgetting performance against these proxy responses, the framework establishes an adaptive boundary. If the model starts to forget too aggressively, going beyond what a naturally ‘ignorant’ model would do, a penalty is applied. This mechanism guides the unlearning process, ensuring that knowledge removal remains controlled and preserves the model’s ability to generalize on retained information.
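The adaptive boundary described above can be sketched as a hinge-style penalty: once the model's loss on a forget sample rises past the loss a never-exposed proxy model would incur, a penalty kicks in. The hinge form and the `margin` parameter are assumptions for illustration; the paper's exact soft-regularization term may differ:

```python
import numpy as np

def proxy_penalty(model_forget_loss, proxy_loss, margin=0.0):
    """Penalize forgetting beyond the proxy boundary.

    model_forget_loss: per-sample loss of the model being unlearned on forget data
                       (gradient-ascent unlearning drives this upward).
    proxy_loss:        per-sample loss that an 'ignorant' model, simulated via
                       ICL-generated proxy responses, would incur on the same data.

    Illustrative hinge penalty (an assumption, not the paper's exact form):
    zero while the model stays within the boundary, growing once it over-forgets.
    """
    gap = model_forget_loss - proxy_loss - margin  # positive => over-forgetting
    return np.maximum(gap, 0.0).mean()
```

The key design choice is that the constraint is soft: instead of hard-stopping the optimizer, it adds a penalty proportional to how far past the "ignorant model" boundary the unlearning has gone.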
EAGLE-PC is designed as a ‘plug-and-play’ enhancement, meaning it’s compatible with existing gradient-based unlearning objectives like Gradient Ascent (GA), Gradient Difference (GD), and Negative Preference Optimization (NPO). The researchers evaluated EAGLE-PC on two widely used benchmarks, TOFU and MUSE, demonstrating consistent improvements in the crucial trade-off between forgetting quality and model utility across multiple LLMs, including Phi-1.5 and LLaMA2-7B. In some cases, EAGLE-PC combined with the “NPO+GD” optimizer even approached the performance of a model fully retrained from scratch, which is the ideal but most expensive scenario.
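As a rough sketch of the plug-and-play composition, the pieces above could wrap a Gradient Difference (GD)-style objective as follows. The function name and the coefficients `lam` and `beta` are hypothetical; the paper's actual objective and hyperparameters may differ:

```python
import numpy as np

def eagle_pc_objective(forget_losses, weights, retain_loss, penalty,
                       lam=1.0, beta=0.1):
    """Hypothetical composition of the EAGLE-PC components on a GD-style base.

    forget_losses: per-sample losses on the forget set
    weights:       entanglement-aware per-sample weights (EAGLE)
    retain_loss:   standard loss on retain data (the GD term)
    penalty:       proxy-constraint penalty for over-forgetting
    lam, beta:     assumed trade-off coefficients, not values from the paper
    """
    # Negate the weighted forget loss so that minimizing this objective
    # performs (reweighted) gradient ascent on the forget data
    forget_term = -(weights * forget_losses).mean()
    return forget_term + lam * retain_loss + beta * penalty

# Usage with toy values
total = eagle_pc_objective(np.array([1.0, 2.0]), np.array([1.0, 1.0]),
                           retain_loss=0.5, penalty=0.2)
```

Because the reweighting and the penalty only modify the loss, the same wrapper applies unchanged whether the base forget term comes from GA, GD, or NPO.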
Remarkably, the framework’s entanglement-aware guidance, even when using only the average retain embedding, allowed weaker optimizers like EAGLE-PC(GA) to outperform stronger baselines that had access to the full retain dataset. This highlights the efficiency and effectiveness of the entanglement-awareness approach.
This work represents a significant step towards more trustworthy, scalable, and robust machine unlearning for real-world LLM deployments, especially in response to ‘right-to-be-forgotten’ demands. For more technical details, you can refer to the full research paper here.


