TLDR: ReaLM is a novel framework that significantly improves Knowledge Graph Completion (KGC) by seamlessly integrating structured knowledge graph embeddings with large language models (LLMs). It achieves this by transforming continuous KG embeddings into compact, discrete code sequences using residual vector quantization, which are then learned as new tokens by the LLM. The framework also incorporates ontology-guided class constraints to ensure semantic consistency in predictions. Experiments demonstrate that ReaLM achieves state-of-the-art performance in link prediction and triple classification tasks, effectively bridging the gap between symbolic and contextual knowledge.
Large Language Models (LLMs) have shown immense potential in understanding and generating human-like text, but they often struggle when it comes to integrating and reasoning with highly structured information found in Knowledge Graphs (KGs). KGs, which organize facts as triples of entities and relations (like “Iron Man has wife Pepper Potts”), are crucial for many applications, from search engines to recommender systems. However, KGs are often incomplete, and filling in these missing links, a task known as Knowledge Graph Completion (KGC), is a significant challenge.
The core problem is a representational mismatch: KG embedding models encode entities and relations as vectors in a continuous space, while LLMs operate on discrete tokens (words or sub-words). This discrepancy makes it hard for LLMs to exploit the structured semantics of KGs, which leads to inconsistent predictions and limits their performance on KGC tasks.
Introducing ReaLM: Bridging the Gap
To address this, researchers have introduced ReaLM (Residual Quantization Bridging Knowledge Graph Embeddings and Large Language Models), a framework that connects KG embeddings to LLM tokenization. ReaLM tackles the continuous-to-discrete challenge with residual vector quantization, which encodes a continuous vector as a short sequence of discrete codes.
How ReaLM Works
The ReaLM framework operates in several key stages:
First, ReaLM extracts semantic embeddings from the Knowledge Graph. These embeddings are continuous numerical representations that capture the meaning of entities and the relationships between them. The RotatE model is used for this step because it is effective at capturing complex relational patterns.
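To make the embedding step concrete, here is a minimal sketch of the RotatE scoring idea: each relation acts as an element-wise rotation in complex space, and a triple scores well when the rotated head lands close to the tail. The dimensions and random vectors are illustrative assumptions, and this is not ReaLM's training code.

```python
import numpy as np

def rotate_score(head, relation_phase, tail):
    """RotatE-style score: negative L1 distance between rotated head and tail.

    head, tail: complex vectors of shape (d,)
    relation_phase: real vector of shape (d,), rotation angles in radians
    """
    rotation = np.exp(1j * relation_phase)  # unit-modulus complex numbers
    return -np.linalg.norm(head * rotation - tail, ord=1)

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension (assumption)
head = rng.normal(size=d) + 1j * rng.normal(size=d)
phase = rng.uniform(-np.pi, np.pi, size=d)

true_tail = head * np.exp(1j * phase)  # tail that satisfies the relation exactly
random_tail = rng.normal(size=d) + 1j * rng.normal(size=d)

score_true = rotate_score(head, phase, true_tail)
score_random = rotate_score(head, phase, random_tail)
print(score_true, score_random)
```

A tail that matches the rotation exactly scores zero, the best possible value, while an unrelated tail scores lower.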
Next, these continuous KG embeddings undergo residual vector quantization. Imagine taking a detailed photograph (the continuous embedding) and converting it into a sequence of compact digital codes (the discrete representation) without losing too much detail. ReaLM approximates each entity’s embedding through a sequence of refinements: at each stage it selects the “codeword” from a learned codebook that best matches what the previous stages have not yet captured (the residual). The result is a compact sequence of code indices for each entity, one index per stage, which effectively digitizes the KG knowledge.
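The stage-by-stage refinement can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the codebooks here are random stand-ins for learned ones, and the sizes `d`, `K`, and `M` are assumed values.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize x in stages; each stage encodes the residual left by the previous one."""
    residual = x.copy()
    codes = []
    for codebook in codebooks:  # each codebook has shape (K, d)
        distances = np.linalg.norm(codebook - residual, axis=1)
        index = int(np.argmin(distances))  # nearest codeword to the current residual
        codes.append(index)
        residual = residual - codebook[index]
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruct the embedding as the sum of the selected codewords."""
    return sum(codebook[i] for codebook, i in zip(codebooks, codes))

rng = np.random.default_rng(0)
d, K, M = 16, 32, 4  # embedding dim, codebook size, number of stages (assumptions)
# Random codebooks with shrinking scale stand in for learned ones.
codebooks = [rng.normal(scale=0.5 ** m, size=(K, d)) for m in range(M)]
entity_embedding = rng.normal(size=d)

codes = rvq_encode(entity_embedding, codebooks)
reconstruction = rvq_decode(codes, codebooks)
print(codes, np.linalg.norm(entity_embedding - reconstruction))
```

Each entity ends up as `M` small integers. With learned codebooks the reconstruction error is typically much lower than with the random ones used here.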
These newly generated code sequences are then integrated into the LLM. ReaLM expands the LLM’s vocabulary to include the code tokens and initializes their embeddings from the learned codebooks, so that they carry the semantic information from the KG. During fine-tuning, the LLM learns to interpret and generate these discrete KG representations alongside natural language. Fine-tuning uses Low-Rank Adaptation (LoRA), which trains small low-rank update matrices while keeping the original weights frozen, so the model can be adapted without extensive computational resources.
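The vocabulary expansion can be pictured as appending rows to the LLM's embedding matrix. The sketch below makes several assumptions the article does not confirm: the `<kg_m_k>` token naming is invented for illustration, each stage gets its own set of tokens, and a projection matrix (random here, presumably learned in practice) maps codewords to the LLM's hidden size.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_dim = 100, 32  # toy LLM sizes (assumptions)
K, M, d = 8, 3, 16                # codebook size, stages, KG embedding dim (assumptions)

llm_embeddings = rng.normal(scale=0.02, size=(vocab_size, hidden_dim))
codebooks = [rng.normal(size=(K, d)) for _ in range(M)]  # stand-ins for learned codebooks

# Hypothetical projection from KG space (d) to the LLM hidden size.
projection = rng.normal(scale=d ** -0.5, size=(d, hidden_dim))

# One new token per (stage, codeword) pair, using a made-up naming scheme.
new_tokens = [f"<kg_{m}_{k}>" for m in range(M) for k in range(K)]
new_rows = np.concatenate([cb @ projection for cb in codebooks], axis=0)
expanded = np.concatenate([llm_embeddings, new_rows], axis=0)
token_to_id = {tok: vocab_size + i for i, tok in enumerate(new_tokens)}

def entity_to_token_ids(codes):
    """Map an entity's code sequence (one index per stage) to new token ids."""
    return [token_to_id[f"<kg_{m}_{k}>"] for m, k in enumerate(codes)]

ids = entity_to_token_ids([5, 0, 7])
print(expanded.shape, ids)
```

In a Hugging Face `transformers` setup, the analogous steps would typically be `tokenizer.add_tokens` followed by `model.resize_token_embeddings`, with the new rows then overwritten from the codebooks.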
Finally, ReaLM incorporates ontology-guided class constraints. Beyond predicting an entity, the model also considers the entity's class: for example, when the answer should be a person, the predicted entity must actually belong to the person class. This mechanism enforces semantic consistency, which refines entity predictions and improves accuracy and reliability.
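One simple way to picture a class constraint is as a filter over ranked candidates. The article does not say whether ReaLM applies the constraint during training, during decoding, or afterwards, so the post-hoc filter below is only an assumed illustration with made-up entities and classes.

```python
def filter_by_class(ranked_candidates, entity_class, expected_class):
    """Keep candidates whose class matches; fall back to the original ranking if none do."""
    consistent = [e for e in ranked_candidates if entity_class.get(e) == expected_class]
    return consistent or ranked_candidates

# Hypothetical ontology lookup: entity -> class.
entity_class = {
    "Pepper Potts": "Person",
    "Happy Hogan": "Person",
    "Stark Industries": "Organization",
    "New York": "Location",
}

# Hypothetical model ranking for the query (Iron Man, has wife, ?).
ranked = ["Stark Industries", "Pepper Potts", "New York", "Happy Hogan"]
filtered = filter_by_class(ranked, entity_class, expected_class="Person")
print(filtered)
```

The fallback to the unfiltered ranking is a design choice of this sketch: it avoids returning nothing when the ontology has no entry for any candidate.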
Performance and Impact
Extensive experiments on widely used benchmark datasets, FB15k-237 and WN18RR, demonstrate that ReaLM achieves state-of-the-art performance in both link prediction (inferring missing relationships) and triple classification (determining if a fact is true or false). The results highlight that the integration of ontology knowledge is particularly crucial for achieving high accuracy, especially for top-rank predictions.
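Link prediction is commonly scored with mean reciprocal rank (MRR) and Hits@k, where Hits@1 measures how often the correct entity is ranked first. The article does not name the metrics used, so treating these as the relevant ones is an assumption, and the ranks below are made up purely to show the arithmetic.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank of the correct entity over all test queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of queries whose correct entity is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

# Illustrative ranks of the correct entity for five queries (not real results).
ranks = [1, 2, 1, 10, 4]

mrr = mean_reciprocal_rank(ranks)
hits1 = hits_at_k(ranks, 1)
hits3 = hits_at_k(ranks, 3)
print(mrr, hits1, hits3)
```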
The research shows that carefully tuning the residual vector quantization parameters, such as the codebook size and the number of quantization stages, is vital for balancing reconstruction fidelity and the compactness of the token sequences. This ensures that the quantized codes are both semantically rich and compatible with the LLM’s token space.
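The trade-off can be made concrete with some arithmetic. Assuming a separate codebook of size K for each of M stages (the article does not say whether codebooks are shared), each entity costs M tokens, the vocabulary grows by K × M tokens, and K^M distinct code sequences are available. The configurations below are illustrative and not taken from the paper.

```python
import math

def rvq_budget(codebook_size, num_stages):
    """Summarize the cost and capacity of an RVQ configuration."""
    return {
        "tokens_per_entity": num_stages,
        # Assumes a separate codebook (and token set) per stage.
        "new_vocab_tokens": codebook_size * num_stages,
        "bits_per_entity": num_stages * math.log2(codebook_size),
        "distinct_sequences": codebook_size ** num_stages,
    }

small = rvq_budget(codebook_size=256, num_stages=4)
large = rvq_budget(codebook_size=1024, num_stages=8)
print(small)
print(large)
```

More stages or larger codebooks give finer reconstructions, at the price of longer token sequences per entity and more new embeddings for the LLM to learn.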
By effectively bridging the gap between continuous KG embeddings and discrete LLM tokens, ReaLM offers a powerful new way to enhance LLMs with structured knowledge, leading to more accurate and semantically consistent reasoning in knowledge-intensive tasks.


