
Enhancing AI Retrieval: How Knowledge Graphs and Ontologies Boost RAG Performance

TLDR: This research compares different methods of building Knowledge Graphs (KGs) for Retrieval-Augmented Generation (RAG) systems. It finds that KGs guided by ontologies, especially when incorporating text chunks, significantly improve RAG performance, matching state-of-the-art systems. Notably, ontologies derived from stable relational databases offer similar performance to text-derived ones but with substantial cost and maintenance advantages, making them a scalable and efficient solution for integrating structured knowledge into RAG pipelines.

In the rapidly evolving landscape of Artificial Intelligence, Large Language Models (LLMs) have shown remarkable abilities in generating text and reasoning. However, their knowledge is often limited to their training data, making it challenging for them to access new or domain-specific information. This is where Retrieval-Augmented Generation (RAG) systems come into play, grounding LLM outputs in external knowledge to reduce inaccuracies and ‘hallucinations’.

Traditionally, RAG systems rely on vector databases, which store text as numerical embeddings for semantic similarity searches. While efficient, this approach often overlooks the crucial relational structure between pieces of information, potentially leading to less precise retrieval and limited reasoning capabilities. To address this, Graph-based RAG approaches leverage Knowledge Graphs (KGs) – structured representations of knowledge that encode explicit relationships between entities.
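To make the contrast concrete, here is a minimal, self-contained sketch (toy embeddings and a toy triple store, not any real system from the paper): vector retrieval ranks chunks purely by similarity, while a graph step follows explicit edges that embeddings alone would miss.

```python
import math

# Toy document "embeddings" (in practice these come from an embedding model).
docs = {
    "doc1": ([0.9, 0.1, 0.0], "Acme Corp was founded in 2001."),
    "doc2": ([0.1, 0.8, 0.2], "Acme Corp acquired BetaSoft in 2019."),
    "doc3": ([0.0, 0.2, 0.9], "BetaSoft builds compilers."),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def vector_retrieve(query_vec, k=1):
    """Plain vector RAG: rank chunks by embedding similarity only."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d][0]), reverse=True)
    return ranked[:k]

# A tiny knowledge graph encoding relations the embeddings do not capture.
kg = {
    ("Acme Corp", "acquired", "BetaSoft"),
    ("BetaSoft", "builds", "compilers"),
}

def graph_expand(entity):
    """Graph-based RAG step: follow explicit edges from a matched entity."""
    return [t for t in kg if t[0] == entity or t[2] == entity]

# Vector search finds the closest chunk; graph expansion adds related facts.
top = vector_retrieve([0.2, 0.7, 0.1])
related = graph_expand("BetaSoft")
```

The point of the sketch is the second step: once an entity is matched, the graph yields connected facts regardless of how far apart their source texts sit in embedding space.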

The Challenge of Knowledge Graph Construction

Building and maintaining high-quality Knowledge Graphs is a complex task. Recent advancements in ontology learning, which involves formally capturing domain concepts and relations, offer new ways to automate KG construction. LLMs themselves can be instrumental in extracting ontologies from both structured (like databases) and unstructured (like text) data. However, the full potential of ontology-guided KGs, especially those derived from relational databases, in the context of RAG systems has remained largely unexplored.

A New Study Illuminates the Path

A recent study by Tiago da Cruz, Bernardo Tavares, and Francisco Belo from Granter.ai delves into this gap, systematically comparing how different KG construction strategies influence RAG performance. The researchers evaluated several approaches using a real-world dataset from a grant application:

  • Standard Vector-based RAG (a common baseline)
  • GraphRAG (Microsoft Research’s state-of-the-art system)
  • Retrieval over KGs built from ontologies derived from relational databases
  • Retrieval over KGs built from ontologies extracted directly from text

The core of their investigation was to understand the impact of the ontology source and structure on retrieval accuracy and reasoning quality.

Key Findings: Ontology-Guided KGs Shine

The study yielded significant insights:

The most accurate results were achieved by GraphRAG and both ontology-guided Knowledge Graphs that incorporated ‘chunk information’ – meaning textual segments were integrated directly into the graph structure. These methods correctly answered 18 out of 20 questions (90%), demonstrating that ontology-constrained KGs, when enriched with text chunks, can match the performance of leading frameworks.

Conversely, ontology-based graphs without chunk information performed poorly, highlighting the critical role of integrating text segments into graph nodes for factual grounding and completeness.
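One plausible reading of "chunk information" can be sketched as follows (entity names, field names, and figures here are all invented for illustration; the paper's actual graph layout may differ): each entity node carries both its relations and the source passages that mention it, so retrieval returns structured facts and grounding text together.

```python
# Hypothetical chunk-enriched KG: each entity node stores the text chunks
# (source passages) that mention it, alongside its outgoing relations.
graph = {
    "BetaSoft": {
        "relations": [("acquired_by", "Acme Corp"), ("builds", "compilers")],
        "chunks": ["Acme Corp acquired BetaSoft in 2019 for $40M."],
    },
    "Acme Corp": {
        "relations": [("acquired", "BetaSoft")],
        "chunks": ["Acme Corp was founded in 2001."],
    },
}

def retrieve_context(entity):
    """Return structured relations plus grounding text for an entity.

    Without the 'chunks' field, the LLM would see only bare triples and
    lose the factual detail (dates, amounts) needed to answer precisely.
    """
    node = graph.get(entity, {"relations": [], "chunks": []})
    facts = [f"{entity} {rel} {obj}" for rel, obj in node["relations"]]
    return facts + node["chunks"]

context = retrieve_context("BetaSoft")
```

This mirrors the study's finding: the triples supply relational structure, while the attached chunks supply the factual grounding that graphs stripped of text lack.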

The baseline Vector RAG achieved 60% accuracy, confirming its usefulness as a simple, robust baseline but also its limitations compared to graph-based methods in tasks requiring relational reasoning.

The Dual Advantage of Database-Derived Ontologies

A particularly compelling finding was that aligning a Knowledge Graph with an ontology extracted from a static relational database performed comparably to ontologies derived from dynamic text corpora. This approach offers two major practical advantages:

  1. Cost Efficiency: Ontology learning from relational databases typically needs to be performed only once, as database schemas tend to be stable. This significantly reduces the computational costs associated with repeated LLM inference, which is often required in text-based approaches.
  2. Simplified Maintenance: This method eliminates the need for complex ontology-merging frameworks. In text-based ontology learning, new documents can introduce redundant or conflicting entities, necessitating sophisticated merging and alignment. Database-derived ontologies, however, provide a stable schema that simplifies maintenance and updates.
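The "one-time" nature of schema-based extraction can be illustrated with a toy sketch (an in-memory SQLite schema invented for this example; the paper's pipeline uses LLMs, whereas the walk below is a purely mechanical mapping): tables become ontology classes, and foreign keys become relations.

```python
import sqlite3

# Toy relational schema: a company table and a grant_application table
# that references it via a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grant_application (
        id INTEGER PRIMARY KEY,
        title TEXT,
        company_id INTEGER REFERENCES company(id)
    );
""")

def schema_to_ontology(conn):
    """Derive (class, relation, class) triples from foreign-key constraints."""
    cur = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    tables = [row[0] for row in cur.fetchall()]
    triples = []
    for table in tables:
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            # Row layout: (id, seq, referenced_table, from_col, to_col, ...)
            triples.append((table, f"has_{fk[3]}", fk[2]))
    return triples

ontology = schema_to_ontology(conn)
```

Because the schema is explicit and stable, this extraction runs once and yields a consistent vocabulary, which is exactly the maintenance advantage the study attributes to database-derived ontologies over continually re-merged text-derived ones.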

These findings suggest that extracting ontologies from relational databases provides a scalable and cost-efficient solution for integrating structured knowledge into RAG pipelines, especially beneficial for industries with stable and structured data sources.

The research paper, titled “Ontology Learning and Knowledge Graph Construction: A Comparison of Approaches and Their Impact on RAG Performance,” offers practical strategies for building scalable and interpretable Graph-based RAG systems. You can read the full paper here.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
