TLDR: GraphDPO is a novel framework for efficiently unlearning outdated or incorrect information from Knowledge Graph Embedding (KGE) models. It uses direct preference optimization with an out-boundary sampling strategy to effectively remove targeted knowledge and a boundary recall mechanism to preserve surrounding information. Experiments show GraphDPO outperforms existing methods in both forgetting and retention, while being significantly more time-efficient and scalable across different KGE models.
Knowledge graphs, which organize information about entities and their relationships, are vital for many AI applications like question answering and semantic search. However, these vast repositories of information are constantly evolving, meaning they can quickly accumulate outdated or incorrect facts. Imagine a knowledge graph storing information about a person’s job; if that person changes careers, the old information becomes inaccurate and needs to be removed.
The challenge lies in “unlearning” this specific knowledge from the models built upon these graphs without disrupting the vast amount of correct and relevant information. Traditional methods for unlearning often fall into two categories: “exact unlearning,” which is very costly as it requires retraining the entire model from scratch, and “approximate unlearning,” which is faster but struggles to completely erase targeted information (because related facts might still allow the forgotten information to be inferred) and can inadvertently damage surrounding, correct knowledge.
A new research paper, “Unlearning of Knowledge Graph Embedding via Preference Optimization,” introduces an innovative framework called GraphDPO to tackle these issues. Developed by Jiajun Liu, Wenjun Ke, Peng Wang, Yao He, Ziyu Shang, Guozheng Li, Zijie Xu, and Ke Ji, GraphDPO offers a more efficient and effective way to remove unwanted knowledge while preserving the integrity of the remaining data. You can read the full paper here: Research Paper.
GraphDPO frames knowledge unlearning as a direct preference optimization (DPO) problem: the model is trained to “prefer” alternative, reconstructed triples over the original facts slated for removal, actively discouraging it from relying on any knowledge that needs to be forgotten. To make this process even more effective, GraphDPO uses an “out-boundary sampling” strategy, which selects replacement information that is semantically distant from the forgotten data, ensuring the unwanted knowledge cannot be indirectly inferred from connections retained in the graph.
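To make these two ideas concrete, here is a minimal sketch in PyTorch, assuming a generic KGE scorer that maps batches of (head, relation, tail) index triples to plausibility scores. The function names, signatures, neighborhood index, and rejection-sampling rule are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def sample_out_boundary(forgotten, num_entities, neighborhood):
    """Out-boundary sampling (illustrative): replace each forgotten tail
    with an entity drawn from OUTSIDE the triple's local neighborhood,
    so the substitute is semantically distant from the erased fact.
    `neighborhood[e]` is assumed to be a precomputed set of entities
    within a few hops of entity e (a hypothetical index)."""
    heads, rels, tails = forgotten.unbind(dim=1)
    new_tails = []
    for h, t in zip(heads.tolist(), tails.tolist()):
        excluded = neighborhood[h] | neighborhood[t]
        candidate = torch.randint(num_entities, (1,)).item()
        while candidate in excluded:  # rejection-sample until out of boundary
            candidate = torch.randint(num_entities, (1,)).item()
        new_tails.append(candidate)
    return torch.stack([heads, rels, torch.tensor(new_tails)], dim=1)

def dpo_unlearning_loss(score_fn, ref_score_fn, forgotten, reconstructed, beta=0.1):
    """DPO-style objective: make the model prefer reconstructed triples
    (chosen) over the triples slated for forgetting (rejected), measured
    against a frozen copy of the pre-unlearning model."""
    chosen_adv = score_fn(reconstructed) - ref_score_fn(reconstructed)
    rejected_adv = score_fn(forgotten) - ref_score_fn(forgotten)
    return -F.logsigmoid(beta * (chosen_adv - rejected_adv)).mean()
```

Here `beta` controls how strongly the updated model's preferences are pushed away from the frozen reference; the paper's actual loss and sampling strategy are graph-aware in ways this sketch simplifies.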
Furthermore, to prevent the accidental loss of valuable information near the “forgetting boundary” (the edges of the knowledge being removed), GraphDPO incorporates a “boundary recall mechanism.” This mechanism intelligently reviews and reinforces relevant knowledge, both within immediate connections and across different time points, ensuring that the model retains its understanding of the surrounding context.
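One plausible reading of that mechanism, again as a hedged sketch: treat the boundary as the retained triples that share an entity with a forgotten one, and replay them with a simple margin loss so their scores stay high. The selection rule and loss below are stand-ins, and the paper's mechanism also recalls knowledge across time points, which this sketch omits:

```python
import torch
import torch.nn.functional as F

def boundary_triples(retain_set, forgotten):
    """Collect retained triples that share an entity with a forgotten
    triple -- one plausible way to draw the 'forgetting boundary'.
    Inputs are iterables of (head, relation, tail) id tuples."""
    touched = {e for h, _, t in forgotten for e in (h, t)}
    return [trip for trip in retain_set
            if trip[0] in touched or trip[2] in touched]

def boundary_recall_loss(score_fn, boundary, margin=1.0):
    """Replay boundary triples with a margin objective so their scores
    stay high while forgotten triples are pushed down (a stand-in for
    the paper's boundary recall term)."""
    batch = torch.tensor(boundary)  # shape (B, 3)
    return F.relu(margin - score_fn(batch)).mean()
```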
The researchers rigorously tested GraphDPO on eight datasets, simulating unlearning scenarios in which 10% or 20% of the knowledge had to be forgotten. The results were highly promising: GraphDPO consistently outperformed existing approximate unlearning methods, improving both forgetting of the target knowledge and preservation of the remaining information. For instance, it achieved up to 10.1% higher average mean reciprocal rank (MRR_Avg) and a 14.0% higher MRR-based F1 score (MRR_F1) than other leading methods.
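For readers unfamiliar with the underlying metric, mean reciprocal rank (MRR) averages the reciprocal of the correct answer's rank across test queries; a minimal illustration:

```python
def mean_reciprocal_rank(ranks):
    """MRR: the average of 1/rank, where `rank` is the position of the
    correct entity in the model's sorted predictions. Higher is better;
    a perfect model scores 1.0."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Correct answers ranked 1st, 3rd, and 10th across three test queries:
print(mean_reciprocal_rank([1, 3, 10]))  # (1 + 1/3 + 1/10) / 3 ≈ 0.478
```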
Beyond its effectiveness, GraphDPO also demonstrated remarkable efficiency. It significantly reduced training time, saving between 69% and 80% compared to exact unlearning methods, and was up to 88% faster than other approximate unlearning techniques. This makes it a practical solution for real-world applications where computational resources might be limited. The framework also proved to be adaptable, working effectively across various types of knowledge graph embedding models, indicating its broad applicability.
In essence, GraphDPO represents a significant step forward in machine unlearning for knowledge graphs. By intelligently combining preference optimization with graph-aware sampling and boundary preservation, it offers a robust and efficient solution for keeping knowledge graphs accurate and up-to-date, ensuring that AI systems always operate on the most relevant and correct information.


