spot_img
HomeResearch & DevelopmentStreamlining Knowledge Graph Completion with Probabilistic Circuits

Streamlining Knowledge Graph Completion with Probabilistic Circuits

TLDR: A new research paper introduces a framework that uses probabilistic circuits to significantly reduce the number of rules needed for knowledge graph completion, while maintaining or improving performance. This approach addresses the ‘rule set explosion’ problem in traditional rule-based methods, enhancing explainability and computational efficiency. The framework achieves a 70-96% reduction in rules and up to 31x performance improvement over baselines, preserving high accuracy without requiring rule independence assumptions.

Knowledge graphs are fundamental to artificial intelligence, serving as structured repositories of information that power everything from search engines to recommendation systems. A key challenge in this field is “knowledge graph completion,” which involves inferring missing facts to make these graphs more comprehensive. Traditionally, two main approaches exist: embedding-based methods, which are highly accurate but operate like a “black box” without providing clear explanations for their predictions, and rule-based methods, which offer valuable explainability through logical inference chains.

While rule-based methods, such as AnyBURL and AMIE, provide transparent predictions, they face a significant hurdle: the “rule set explosion.” To achieve competitive performance, these systems often need to learn and utilize an overwhelmingly large number of rules, sometimes tens of thousands. This vast quantity of rules undermines the very advantage of explainability, as humans struggle to comprehend thousands of logical inferences for a single prediction. Furthermore, managing such large rule sets dramatically increases computational complexity for reasoning tasks.

A new research paper, titled “Probabilistic Circuits for Knowledge Graph Completion with Reduced Rule Sets,” introduces an innovative framework designed to overcome this challenge. The core idea is to achieve high performance in knowledge graph completion using a significantly smaller, more manageable set of rules, thereby preserving the crucial aspect of explainability. You can read the full paper here.

A Novel Approach with Probabilistic Circuits

The proposed framework leverages probabilistic circuits (PCs) to learn a probability distribution over “rule contexts.” Rule contexts are defined as meaningful subsets of rules that naturally work together. Unlike traditional methods, this approach does not require assumptions about the independence of rules, allowing for a more nuanced understanding of how rules interact.

By learning this distribution, the framework can identify and select highly effective rule sets. For inference, it uses the learned probabilities to assign confidence to predictions. The paper explores three distinct inference methods: PC1, which approximates query probability using singleton rule sets; PC2, which computes the exact query probability; and PC3, which approximates query probability using rule sets generated from a greedy walk approach.

Impressive Results and Efficiency Gains

The researchers validated their framework through empirical studies on eight standard benchmark datasets, including FB15K-237, WN18RR, and UMLS. The results are compelling: the PC-guided framework achieved a remarkable 70-96% reduction in the number of rules used, representing an average 12-fold decrease. Despite this drastic reduction, the framework not only maintained competitive performance but often outperformed the baseline by up to 31 times when comparing equivalent minimal rule sets.

For instance, on the UMLS dataset, PC2 inference achieved the baseline’s peak Hits@10 performance (a metric indicating how often the correct answer is among the top 10 predictions) using only 1,000 rules, whereas the baseline required 25,000 rules—a 25-fold reduction in rule usage. Similarly, for CODEX-S, PC2 achieved nearly 100% of the baseline’s highest Hits@10 with only 5% of the rules. Across all datasets, the PC-guided inferences preserved an average of 91% of the highest baseline performance, even when comparing their minimal rule sets against the baseline’s full rule sets. This demonstrates that improved explainability does not come at the cost of accuracy.

The framework also showed consistent superiority in Mean Rank Reciprocal (MRR) performance, which measures the average reciprocal rank of correct predictions. This indicates that the system not only makes correct predictions but also ranks them highly, reflecting high-quality probabilistic inference.

Also Read:

Scalability and Future Directions

A significant advantage of this new framework is its linear scalability with the number of input rules, particularly for the PC1 and PC2 inference methods. This addresses a critical computational challenge faced by traditional rule-based systems. While PC3 (greedy walks) is more computationally intensive, PC1 and PC2 offer efficient and effective solutions.

The researchers highlight that their framework is agnostic to the underlying rule sources, meaning it can be applied beyond AnyBURL to other rule learning systems and even other domains of symbolic AI that struggle with rule explosion and computational complexity, such as proof search in theorem proving systems. This work opens doors for more interpretable and efficient AI systems.

Meera Iyer
Meera Iyerhttps://blogs.edgentiq.com
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist in a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She's particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her out at: [email protected]

- Advertisement -

spot_img

Gen AI News and Updates

spot_img

- Advertisement -