Protecting Sensitive Information in Large Language Models: Introducing DP-FUSION

TLDR: DP-FUSION is a new method for Large Language Models (LLMs) that protects sensitive information (like personal data) during text generation. Unlike previous approaches that either severely degrade text quality or offer weak privacy, DP-FUSION provides strong, provable privacy guarantees at the token level while maintaining high text utility. It works by processing sensitive data in privacy groups and blending LLM outputs, offering a controllable balance between privacy and text quality, though it requires more computational resources.

Large Language Models (LLMs) are powerful tools, but their widespread use brings a significant challenge: how to prevent them from accidentally or intentionally revealing sensitive information present in the data they process. Imagine a hospital using an LLM to help patients with medical records; if the LLM could reveal a patient’s disease history or unique treatment plan, it would raise serious privacy concerns. Existing methods to protect this sensitive data during the LLM’s inference (when it generates text) often fall short, either by lacking formal privacy guarantees or by making the generated text unusable.

The Privacy Problem with LLMs

When LLMs generate text, they operate on a “context”: the input data, which may contain private details such as Personally Identifiable Information (PII) like names or addresses. Current approaches to protecting this data include removing PII outright (typically detected with Named Entity Recognition, or NER) or instructing the model to paraphrase without leaking sensitive details. However, removing PII can severely damage the quality and usefulness of the text, especially when many sensitive pieces of information have to be removed. Even advanced NER systems make mistakes and tend to catch only the most obvious kinds of sensitive data. Prompt engineering, while seemingly helpful, has been shown to be vulnerable to attacks and offers no formal privacy guarantees.

Introducing DP-FUSION: A New Approach to Private Inference

To address these limitations, researchers have developed DP-FUSION, a novel mechanism for Differentially Private Inference (DPI). This method provides a provable way to limit how much an LLM’s output reveals about sensitive tokens in its input context. The core idea is to ensure that observing the LLM’s output doesn’t allow an attacker to reliably infer sensitive information, even if they try to adaptively query the model.

DP-FUSION manages the privacy-utility trade-off through a parameter called epsilon (ϵ). A value of ϵ=0 means sensitive information is completely hidden, while higher values allow better text quality in exchange for a weaker privacy guarantee. The mechanism operates in a few key steps:

Sensitive tokens in the document are first divided into distinct “privacy groups” (e.g., names, dates, codes).
The LLM is then run multiple times, once for each privacy group.
Finally, the probability distributions of the LLM’s outputs from these different runs are blended together. This blending ensures that the final generated text remains statistically close to what would be produced if no sensitive information were revealed, thereby bounding the potential leakage.

This approach requires multiple “forward passes” through the LLM (one model evaluation per run), so it uses more computational resources than ordinary generation, but recent advancements in parallel processing on GPUs make it practical.
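The three steps above can be illustrated with a toy sketch of the blending step for a single next-token prediction. This is not the paper's algorithm: the closeness measure (largest absolute log-ratio between two distributions), the bisection search for a mixing weight, the final averaging across groups, and all function names are assumptions made for illustration. It only shows the general idea of pulling each group's distribution toward a "public" one (produced with no sensitive tokens revealed) until it fits a budget.

```python
import math


def max_log_ratio(p, q):
    """Largest |log(p[x] / q[x])| over the vocabulary (q must be strictly positive)."""
    worst = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            return math.inf
        worst = max(worst, abs(math.log(px / qx)))
    return worst


def mix(p_private, p_public, lam):
    """Convex combination: lam * private + (1 - lam) * public."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_private, p_public)]


def largest_weight(p_private, p_public, budget, iters=40):
    """Bisection for the largest lam in [0, 1] that keeps the mixture within budget."""
    if max_log_ratio(p_private, p_public) <= budget:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if max_log_ratio(mix(p_private, p_public, mid), p_public) <= budget:
            lo = mid  # still within budget, try revealing more
        else:
            hi = mid  # too far from the public distribution
    return lo


def fuse(p_public, p_groups, budget):
    """Blend one distribution per privacy group with the public distribution."""
    if not p_groups:
        return list(p_public)
    mixed = [mix(p, p_public, largest_weight(p, p_public, budget)) for p in p_groups]
    return [sum(column) / len(mixed) for column in zip(*mixed)]


# Made-up next-token distributions over a three-token vocabulary.
p_public = [0.7, 0.2, 0.1]   # all sensitive tokens hidden
p_names = [0.1, 0.8, 0.1]    # hypothetical run with the "names" group revealed
p_dates = [0.6, 0.1, 0.3]    # hypothetical run with the "dates" group revealed

for budget in (0.0, 0.5, 10.0):
    print(budget, [round(x, 3) for x in fuse(p_public, [p_names, p_dates], budget)])
```

With a budget of 0 the blended distribution equals the public one, which mirrors the ϵ=0 case described above; a very large budget simply averages the per-group distributions. Each call to the model in a real system would be one of the extra forward passes.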

How DP-FUSION Compares to Other Methods

The researchers tested DP-FUSION against existing DPI methods like DP-Decoding and DP-Prompt, as well as simpler baselines like direct PII removal. They used a dataset of legal documents (TAB-ECHR) annotated with various types of personal information. The evaluation focused on two main aspects: utility (how good the generated text is) and privacy (how hard it is for an attacker to infer sensitive information).

In terms of utility, measured by “perplexity” (a measure of how well a language model predicts a sample of text) and an “LLM-as-a-judge” evaluation (where another LLM assesses the quality of the paraphrase), DP-FUSION significantly outperformed existing DPI mechanisms. For instance, DP-FUSION maintained high text quality while other methods produced heavily degraded or even unusable outputs.
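For readers unfamiliar with the metric, here is a minimal sketch of the standard perplexity formula: the exponential of the average negative log-probability a model assigns to each token. Lower values mean the text is more predictable to the scoring model. The function name and the example probabilities are made up for illustration; which model scored the text in the study, and on what data, is not covered here.

```python
import math


def perplexity(token_probs):
    """exp(average negative log-probability) over the tokens of one text."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Made-up per-token probabilities from a hypothetical scoring model.
fluent = [0.5, 0.4, 0.6, 0.5]
degraded = [0.05, 0.02, 0.1, 0.01]
print(perplexity(fluent) < perplexity(degraded))
```

A text whose tokens each receive probability 0.25 has a perplexity of 4, as if the model were choosing uniformly among four options at every step.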

Regarding privacy, assessed by “Attack Success Rate” (ASR) in a “token-recovery game” where an attacker tries to guess sensitive tokens, DP-FUSION achieved privacy levels comparable to simply removing all sensitive information, but with the added benefit of formal privacy guarantees and a controllable trade-off. This means it can offer a strong privacy shield without sacrificing the usefulness of the generated text.
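To make the metric concrete, the sketch below computes an attack success rate as the fraction of trials in which an attacker's guess matches the hidden token. The attacker shown here (a simple substring match), the trial format, and the example sentences are all hypothetical; the study's actual attacker and game setup are not described in this article.

```python
def substring_attacker(output_text, candidates):
    """Hypothetical attacker: pick the first candidate that appears verbatim
    in the model output; fall back to the first candidate otherwise."""
    lowered = output_text.lower()
    for candidate in candidates:
        if candidate.lower() in lowered:
            return candidate
    return candidates[0]


def attack_success_rate(trials, attacker):
    """Fraction of trials where the attacker's guess equals the true token.

    Each trial is a tuple (output_text, candidates, secret)."""
    if not trials:
        raise ValueError("need at least one trial")
    hits = sum(attacker(out, cands) == secret for out, cands, secret in trials)
    return hits / len(trials)


# Made-up trials for illustration only.
CANDIDATES = ["Jones", "Smith", "Brown", "Clark"]
trials = [
    ("The applicant, Mr Smith, lodged a complaint.", CANDIDATES, "Smith"),  # name leaked
    ("The applicant lodged a complaint.", CANDIDATES, "Brown"),             # name hidden
    ("The applicant lodged a complaint.", CANDIDATES, "Jones"),             # hidden, lucky guess
    ("A person lodged a complaint.", CANDIDATES, "Clark"),                  # name hidden
]
asr = attack_success_rate(trials, substring_attacker)
chance = 1 / len(CANDIDATES)
print(f"ASR = {asr:.2f}, chance level = {chance:.2f}")
```

Comparing the measured rate against the chance level of guessing among the candidates gives a rough sense of how much the output helps the attacker.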

Looking Ahead

DP-FUSION represents a significant step forward in making LLMs safer for handling sensitive data. While it offers a much-improved balance between privacy and utility, the researchers acknowledge some limitations. Its effectiveness relies on the accuracy of the system used to identify sensitive tokens, and the privacy parameters can be complex for non-experts. Future work aims to provide more intuitive ways for users to control privacy and explore how different sensitive groups might interact. Despite requiring more computational power, the benefits of DP-FUSION in enabling more widespread and secure use of LLMs for sensitive applications are substantial. For more technical details, you can refer to the full research paper: DP-FUSION: Token-Level Differentially Private Inference for Large Language Models.
