TLDR: This paper introduces new criteria and an efficient tool (A-LiRA) to audit approximate machine unlearning methods, especially for differentially private AI models. It reveals that current unlearning techniques often fail to adequately protect the privacy of removed data and can inadvertently increase the privacy risks for the data that remains in the model, highlighting a critical gap in current privacy-preserving AI practices.
The “right to be forgotten” is a fundamental privacy principle gaining traction globally, empowering individuals to request the deletion of their personal data from organizations’ records. In the realm of artificial intelligence and machine learning, this right translates into “machine unlearning” – the process of removing the influence of specific data points from a trained model without having to retrain the entire model from scratch. This is particularly crucial for large, complex AI models where full retraining is prohibitively expensive and time-consuming.
While machine unlearning sounds like a straightforward solution to data deletion requests, a recent research paper titled “Auditing Approximate Machine Unlearning for Differentially Private Models” by Yuechun Gu, Jiajie He, and Keke Chen from the University of Maryland, Baltimore County, sheds light on a critical, often overlooked aspect of this process: the privacy of the data that remains in the model.
The Unseen Privacy Challenge
Existing machine unlearning methods primarily focus on ensuring that the removed data’s influence is erased. The common assumption has been that the privacy of the retained data remains unaffected. However, this paper challenges that assumption, especially when dealing with models designed to be “differentially private.” Differential privacy (DP) is a strong mathematical guarantee that ensures individual data points do not significantly alter the output of an algorithm, thereby protecting privacy. When a model is differentially private, there is an agreed-upon privacy budget (denoted ϵ) that defines the maximum acceptable privacy risk for any data point.
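For reference, the standard definition: a randomized mechanism M satisfies (ϵ, δ)-differential privacy if, for any two datasets D and D′ that differ in a single record and any set of possible outputs S, Pr[M(D) ∈ S] ≤ e^ϵ · Pr[M(D′) ∈ S] + δ. The smaller ϵ is, the less any single record can sway the model’s behavior, and the less it can leak.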
The researchers highlight a phenomenon known as the “privacy onion effect,” where removing some data points can inadvertently increase the privacy risks of other, retained data points. This is akin to thinning a crowd – the fewer people there are, the easier it is to identify individuals. This paper is the first to explore how this effect impacts differentially private models under existing approximate machine unlearning methods.
New Criteria for Comprehensive Privacy Auditing
To address this gap, the authors propose a holistic approach to auditing machine unlearning algorithms. They introduce two new privacy criteria for differentially private models:
- Criterion 1 (Unlearned Samples): The privacy risk of the unlearned samples should be significantly reduced to a safe, pre-defined level after the unlearning process. Ideally, their risk should approximate what it would be if the model had never seen them.
- Criterion 2 (Retained Samples): The privacy risk of the retained samples must remain below the original differential privacy budget (ϵ) agreed upon for the model. Unlearning should not inadvertently expose these samples to higher privacy risks.
These criteria are measured using Membership Inference Attacks (MIAs), which determine whether a specific data point was part of the training dataset. A higher True Positive Rate (TPR) at a given False Positive Rate (FPR) indicates a higher privacy risk.
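As a rough sketch of how such an audit can be scored in practice (illustrative code, not the paper’s exact procedure; the function names and the target FPR are assumptions), one can compute the attack’s TPR at a fixed low FPR and translate it into an empirical lower bound on ϵ via the standard hypothesis-testing relation TPR ≤ e^ϵ · FPR + δ:

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.001):
    """TPR of a membership inference attack at a fixed FPR.

    member_scores / nonmember_scores: attack scores where higher means
    "more likely a training member".
    """
    # Pick the threshold so that only target_fpr of non-members score above it.
    threshold = np.quantile(nonmember_scores, 1.0 - target_fpr)
    return float(np.mean(member_scores > threshold))

def empirical_epsilon(tpr, fpr, delta=1e-5):
    """Lower bound on epsilon implied by the DP constraint TPR <= e^eps * FPR + delta."""
    if fpr <= 0.0 or tpr <= delta:
        return 0.0
    return max(0.0, float(np.log((tpr - delta) / fpr)))

# Criterion 2 (illustrative): a retained sample breaks the agreement if its
# empirical epsilon after unlearning exceeds the budget the model was trained with.
# eps_budget = 8.0
# ok = empirical_epsilon(tpr_at_fpr(member_scores, nonmember_scores), 0.001) <= eps_budget
```

Criterion 1 is checked the same way on the unlearned samples, whose post-unlearning risk should fall to the agreed safe level.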
Introducing A-LiRA: An Efficient Auditing Tool
To make this auditing process practical and efficient, the researchers developed a novel Membership Inference Attack called A-LiRA (Augmentation-based Likelihood Ratio Attack). Traditional powerful MIAs, like Online-LiRA, are computationally very expensive, often requiring many GPU hours per sample. A-LiRA significantly reduces this cost by utilizing data augmentation and approximating probability distributions, achieving comparable accuracy to Online-LiRA with an 88.3% reduction in time. This efficiency makes it feasible to audit machine unlearning methods on a larger scale.
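The paper’s exact construction is not reproduced here, but the flavor of a LiRA-style likelihood-ratio score is easy to sketch: fit Gaussians to a sample’s (augmentation-averaged) confidence under shadow models trained with and without it, and compare the two likelihoods. Everything below, including the names, the Gaussian approximation, and the augmentation averaging, is an illustrative assumption rather than A-LiRA’s actual code:

```python
import numpy as np
from scipy.stats import norm

def logit_confidence(p_true, eps=1e-8):
    """Stable logit of a model's confidence on the true label (the LiRA statistic)."""
    p_true = np.clip(p_true, eps, 1.0 - eps)
    return np.log(p_true) - np.log(1.0 - p_true)

def lira_score(target_stat, in_stats, out_stats):
    """Likelihood-ratio membership score for one example.

    target_stat : logit confidence of the audited model on the example
                  (in an augmentation-based variant, averaged over augmented views)
    in_stats    : the same statistic from shadow models trained WITH the example
    out_stats   : the statistic from shadow models trained WITHOUT the example
    """
    mu_in, sd_in = np.mean(in_stats), np.std(in_stats) + 1e-8
    mu_out, sd_out = np.mean(out_stats), np.std(out_stats) + 1e-8
    # Positive score: the observation looks more like "member" than "non-member".
    return norm.logpdf(target_stat, mu_in, sd_in) - norm.logpdf(target_stat, mu_out, sd_out)
```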
Concerning Findings: Current Methods Fall Short
The experimental findings, using A-LiRA, revealed some concerning truths about current approximate machine unlearning algorithms (such as SUNSHINE, SSD, and SalUn):
- Unlearned Samples: While approximate unlearning generally reduced the privacy risk of unlearned samples in differentially private models, in non-DP models a small fraction of unlearned samples paradoxically became more sensitive after removal. This suggests that unlearning is not always a perfect privacy shield, even for the data it targets.
- Retained Samples: More critically, the study found that existing approximate machine unlearning algorithms often inadvertently compromise the privacy of retained samples in differentially private models. After unlearning, some retained data points exceeded the original privacy budget (ϵ), effectively breaking the differential privacy guarantee and the user agreements built on it. This “privacy onion effect” was observed consistently: unlearning more sensitive samples led to more retained samples whose privacy was breached.
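A minimal sketch of how the retained-sample check behind these findings could be automated, assuming the hypothetical tpr_at_fpr and empirical_epsilon helpers from the earlier snippet and per-sample attack scores collected after unlearning:

```python
def count_retained_violations(per_sample_scores, eps_budget, target_fpr=0.001, delta=1e-5):
    """Count retained samples whose post-unlearning privacy risk exceeds the DP budget.

    per_sample_scores: {sample_id: (member_scores, nonmember_scores)} collected
    with the MIA on models after unlearning. Reuses the hypothetical
    tpr_at_fpr / empirical_epsilon helpers sketched earlier.
    """
    violations = []
    for sample_id, (member_scores, nonmember_scores) in per_sample_scores.items():
        tpr = tpr_at_fpr(member_scores, nonmember_scores, target_fpr)
        if empirical_epsilon(tpr, target_fpr, delta) > eps_budget:
            violations.append(sample_id)
    return violations
```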
The Path Forward
The research paper, available at arXiv:2508.18671, concludes that the current landscape of approximate machine unlearning methods is not sufficient for comprehensively protecting privacy, especially in the context of differentially private models. The findings underscore an urgent need for the development of new “differentially private unlearning algorithms” that can simultaneously reduce the privacy risk of unlearned samples and maintain the privacy guarantees for all retained data. The proposed auditing criteria and the efficient A-LiRA tool offer a robust framework for evaluating and guiding the development of such next-generation unlearning techniques.


