TL;DR: EAGLE-PC is a novel framework for machine unlearning in Large Language Models (LLMs) that addresses the problem of ‘excessive forgetting’ (over-forgetting). It uses two main components: Entanglement-Awareness Guided Loss Reweighting (EAGLE) to adaptively adjust forgetting effort based on how entangled forget data is with retain data, and a Proxy Constraint that uses In-Context Learning (ICL)-generated test data to softly regularize the unlearning process, preventing models from losing too much useful knowledge. The framework is compatible with existing unlearning methods and shows significant improvements in balancing forgetting quality and model utility on benchmarks like TOFU and MUSE.
Large language models, or LLMs, are incredibly powerful, capable of absorbing and retaining vast amounts of information from the internet. While this memorization is key to their intelligence, it also brings significant concerns, particularly regarding privacy and data ownership. Imagine an LLM trained on data that includes your personal information or copyrighted material. As data owners increasingly request the removal of their data from these models, a field called ‘machine unlearning’ has emerged as a practical solution.
Machine unlearning aims to remove the influence of specific data from a trained model without the costly and time-consuming process of retraining the entire model from scratch. However, existing unlearning methods often face a critical challenge: ‘excessive forgetting,’ also known as over-forgetting. This happens when the model not only forgets the targeted information but also inadvertently loses useful knowledge, leading to degraded performance on other tasks or even weakening the model’s safety guidelines. Some data might be ‘under-forgotten,’ leaving residual privacy risks, while other, unrelated data might be ‘over-forgotten,’ harming the model’s overall utility.
The core problem stems from two main factors: the diverse nature of the data to be forgotten (some are more deeply memorized or entangled with other knowledge than others), and the lack of clear stopping points in the unlearning process. Current methods often treat all data to be forgotten uniformly or rely on metrics that don’t fully capture the complex interactions between the data to be forgotten and the data to be retained.
To address these limitations, researchers have proposed a novel unlearning framework called EAGLE-PC, which stands for Entanglement-Awareness Guided Loss Reweighting with Proxy Constraint. This framework introduces two key components designed to make unlearning more precise and effective.
Entanglement-Awareness Guided Loss Reweighting (EAGLE)
The first component, EAGLE, tackles the problem of heterogeneous memorization. Instead of assuming all pieces of information to be forgotten are equally difficult to erase, EAGLE measures how ‘entangled’ each piece of forget data is with the data the model is supposed to retain. It does this by calculating the similarity between the embedding (a numerical representation) of a forget sample and the aggregated embedding of the entire retain dataset. Think of it like this: if a piece of information you want to forget is very similar to many things you still want to remember, the model needs to be more careful when erasing it to avoid collateral damage. EAGLE dynamically adjusts the ‘forgetting effort’ for each sample based on this entanglement. Less entangled samples, which are more unique and less connected to the retained knowledge, can be forgotten more aggressively. This approach is computationally efficient because it only requires the average embedding of the retain dataset, not access to every single retain sample.
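To make this concrete, here is a minimal sketch of entanglement-aware reweighting. The cosine-similarity-to-mean-retain-embedding idea follows the description above, but the exact mapping from similarity to weight (an exponential here, with a `temperature` parameter) and the normalization are illustrative assumptions, not the paper's formula:

```python
import numpy as np

def entanglement_weights(forget_embeddings, retain_mean_embedding, temperature=1.0):
    """Map each forget sample's similarity to the aggregated retain embedding
    into a per-sample forgetting weight.

    Illustrative sketch: the exponential mapping and normalization are
    assumptions, not the paper's exact formulation.
    """
    # Cosine similarity between each forget embedding and the mean retain embedding
    f = forget_embeddings / np.linalg.norm(forget_embeddings, axis=1, keepdims=True)
    r = retain_mean_embedding / np.linalg.norm(retain_mean_embedding)
    sims = f @ r  # in [-1, 1]; higher = more entangled with retained knowledge

    # Less entangled samples get larger weights, so they are forgotten more aggressively
    weights = np.exp(-sims / temperature)
    return weights / weights.mean()  # keep the average forgetting effort unchanged

# Usage: a forget sample aligned with the retain centroid gets a smaller weight
forget_emb = np.array([[1.0, 0.0],   # highly entangled with retain data
                       [0.0, 1.0]])  # orthogonal to retain data
retain_mean = np.array([1.0, 0.0])
w = entanglement_weights(forget_emb, retain_mean)
```

Note that the only statistic of the retain set used here is its mean embedding, which is what makes the approach cheap relative to methods needing per-sample retain access.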
Proxy Constraint
The second component, the Proxy Constraint, addresses the issue of unbounded unlearning and prevents over-forgetting. Gradient-based unlearning methods, while effective, often lack a natural stopping point, meaning they can keep ‘forgetting’ indefinitely, potentially causing the model’s predictions to diverge excessively. The Proxy Constraint introduces a soft regularization mechanism using ‘proxy data’ generated through In-Context Learning (ICL) with another large language model. These proxy samples simulate how the model would naturally respond to the forget data if it had never been exposed to it in the first place. By comparing the model’s current forgetting performance against these proxy responses, the framework establishes an adaptive boundary. If the model starts to forget too aggressively, going beyond what a naturally ‘ignorant’ model would do, a penalty is applied. This mechanism guides the unlearning process, ensuring that knowledge removal remains controlled and preserves the model’s ability to generalize on retained information.
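The adaptive boundary described above can be sketched as a hinge-style penalty: once the model's loss on a forget sample rises past the loss a never-exposed proxy model would incur, a penalty kicks in. The hinge form and the `margin` parameter are assumptions for illustration; the paper's exact soft-regularization term may differ:

```python
import numpy as np

def proxy_penalty(model_forget_loss, proxy_loss, margin=0.0):
    """Penalize forgetting beyond the proxy boundary.

    model_forget_loss: per-sample loss of the model being unlearned on forget data
                       (gradient-ascent unlearning drives this upward).
    proxy_loss:        per-sample loss that an 'ignorant' model, simulated via
                       ICL-generated proxy responses, would incur on the same data.

    Illustrative hinge penalty (an assumption, not the paper's exact form):
    zero while the model stays within the boundary, growing once it over-forgets.
    """
    gap = model_forget_loss - proxy_loss - margin  # positive => over-forgetting
    return np.maximum(gap, 0.0).mean()
```

The key design choice is that the constraint is soft: instead of hard-stopping the optimizer, it adds a penalty proportional to how far past the "ignorant model" boundary the unlearning has gone.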
EAGLE-PC is designed as a ‘plug-and-play’ enhancement, meaning it’s compatible with existing gradient-based unlearning objectives like Gradient Ascent (GA), Gradient Difference (GD), and Negative Preference Optimization (NPO). The researchers evaluated EAGLE-PC on two widely used benchmarks, TOFU and MUSE, demonstrating consistent improvements in the crucial trade-off between forgetting quality and model utility across multiple LLMs, including Phi-1.5 and LLaMA2-7B. In some cases, EAGLE-PC combined with the “NPO+GD” optimizer even approached the performance of a model fully retrained from scratch, which is the ideal but most expensive scenario.
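As a rough sketch of the plug-and-play composition, the pieces above could wrap a Gradient Difference (GD)-style objective as follows. The function name and the coefficients `lam` and `beta` are hypothetical; the paper's actual objective and hyperparameters may differ:

```python
import numpy as np

def eagle_pc_objective(forget_losses, weights, retain_loss, penalty,
                       lam=1.0, beta=0.1):
    """Hypothetical composition of the EAGLE-PC components on a GD-style base.

    forget_losses: per-sample losses on the forget set
    weights:       entanglement-aware per-sample weights (EAGLE)
    retain_loss:   standard loss on retain data (the GD term)
    penalty:       proxy-constraint penalty for over-forgetting
    lam, beta:     assumed trade-off coefficients, not values from the paper
    """
    # Negate the weighted forget loss so that minimizing this objective
    # performs (reweighted) gradient ascent on the forget data
    forget_term = -(weights * forget_losses).mean()
    return forget_term + lam * retain_loss + beta * penalty

# Usage with toy values
total = eagle_pc_objective(np.array([1.0, 2.0]), np.array([1.0, 1.0]),
                           retain_loss=0.5, penalty=0.2)
```

Because the reweighting and the penalty only modify the loss, the same wrapper applies unchanged whether the base forget term comes from GA, GD, or NPO.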
Remarkably, the framework’s entanglement-aware guidance, even when using only the average retain embedding, allowed weaker optimizers like EAGLE-PC(GA) to outperform stronger baselines that had access to the full retain dataset. This highlights the efficiency and effectiveness of the entanglement-awareness approach.
This work represents a significant step towards more trustworthy, scalable, and robust machine unlearning for real-world LLM deployments, especially in response to ‘right-to-be-forgotten’ demands. For more technical details, you can refer to the full research paper here.


