TL;DR: GenOM is a novel ontology matching framework that uses large language models (LLMs) to generate detailed textual definitions for concepts, an embedding model to retrieve alignment candidates, and an LLM to make binary equivalence judgements, fusing these results with exact matching techniques. Tested on biomedical datasets, GenOM delivers competitive performance, outperforming many baselines, with semantic enrichment and few-shot prompting contributing to its robustness and making knowledge integration across heterogeneous systems more effective.
In today’s data-rich world, especially within complex fields like biomedicine, integrating information from various sources is crucial. Imagine trying to combine patient data from different hospitals, each using its own unique way of describing diseases or medications. This is where ‘Ontology Matching’ (OM), also known as ontology alignment, comes into play. It’s the process of identifying semantic correspondences between entities in different ontologies – essentially, finding out which concepts in one system mean the same or are related to concepts in another.
Understanding Ontology Matching
Ontologies are formal representations of concepts and relationships within a specific domain. However, they are often created independently, leading to differences in terminology, structure, and detail. These variations make it incredibly challenging to integrate and reuse knowledge effectively. For instance, the same medical condition might be called by different names (terminological difference), or its description might be organized in a deeply nested hierarchy in one system and a flat list in another (structural difference). As ontologies grow, like SNOMED-CT with hundreds of thousands of medical concepts, manual alignment becomes impossible, highlighting the need for automated solutions.
Traditional OM systems often rely on string matching and structural comparisons, which can miss the deeper semantic meaning of concepts. While recent advancements have seen Large Language Models (LLMs) incorporated into OM, some approaches still struggle with complex tasks or demand immense computational power due to very large models.
Introducing GenOM: A New Approach
To address these limitations, researchers Yiping Song, Jiaoyan Chen, and Renate A. Schmidt from The University of Manchester have introduced GenOM, a novel ontology matching framework. GenOM leverages the power of LLMs to enhance the semantic understanding of ontology concepts, making the alignment process more accurate and efficient. The framework is designed to be robust and adaptable, demonstrating competitive performance, particularly in the biomedical domain.
How GenOM Works: The Five Key Steps
GenOM operates through a modular, five-component architecture:
1. Ontology Data Extraction: First, GenOM extracts both lexical (like labels and synonyms) and structural information (like parent concepts and logical definitions) from the source and target ontologies. This provides a rich foundation for understanding each concept.
2. Definition Generation: A key innovation is using an LLM to generate natural language definitions or paraphrased descriptions for each concept. This step is vital for concepts that lack explicit textual definitions, enriching their semantic representation and helping the LLM recall relevant domain knowledge.
3. Candidate Mapping Generation: With these enriched descriptions, concepts are converted into numerical vector representations (embeddings). GenOM then uses an embedding model to calculate similarity scores between concepts from different ontologies, identifying a shortlist of the most semantically similar candidate pairs.
4. LLM-Based Equivalence Judgement: For each candidate pair, a lightweight LLM is prompted to make a binary decision: YES if the concepts are semantically equivalent, and NO otherwise. This classification-based approach is efficient, relying on the probability of the ‘YES’ token to determine confidence.
5. Post-processing and Result Fusion: In the final stage, GenOM refines the results by filtering out low-confidence matches based on both the LLM’s probability score and the embedding similarity. To further enhance precision and coverage, these results are then merged with outputs from traditional exact matching systems, combining deep semantic reasoning with surface-level matching.
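The five steps above can be sketched end-to-end in plain Python. This is a minimal illustration, not GenOM's actual implementation: the function names, prompt wording, and thresholds are assumptions, and the real embedding model and LLM calls are replaced by plain vectors and logits.

```python
import math

# -- Step 2: definition generation (prompt construction only) ----------------
def definition_prompt(label, synonyms, parents):
    """Assemble the text sent to the LLM asking for a short definition,
    grounded in the lexical and structural information from step 1."""
    return (
        f"Define the biomedical concept '{label}' in one sentence.\n"
        f"Synonyms: {', '.join(synonyms)}\n"
        f"Parent concepts: {', '.join(parents)}\n"
        f"Definition:"
    )

# -- Step 3: candidate retrieval by embedding similarity ---------------------
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(source_vec, target_vecs, k=3):
    """Shortlist the k target concepts whose embeddings are most similar
    to the source concept's embedding."""
    ranked = sorted(target_vecs.items(),
                    key=lambda kv: cosine(source_vec, kv[1]),
                    reverse=True)
    return ranked[:k]

# -- Step 4: equivalence confidence from the 'YES' token ---------------------
def yes_confidence(yes_logit, no_logit):
    """Softmax over the two answer tokens: the probability mass the LLM
    places on 'YES' serves as the match confidence."""
    m = max(yes_logit, no_logit)
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)

# -- Step 5: filtering and fusion with exact matching ------------------------
def fuse(llm_scored, exact_matches, min_prob=0.6, min_sim=0.5):
    """Drop LLM matches below either threshold (LLM probability and
    embedding similarity), then union with the exact-match output."""
    kept = {pair for pair, (prob, sim) in llm_scored.items()
            if prob >= min_prob and sim >= min_sim}
    return kept | set(exact_matches)
```

Keeping the stages as separate functions mirrors the modular architecture described above: each component can be swapped (a different embedding model, a different judgement LLM) without touching the rest of the pipeline.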
Putting GenOM to the Test
GenOM was rigorously evaluated on the OAEI 2024 Bio-ML track, a benchmark for biomedical ontology alignment tasks involving widely used ontologies like SNOMED-CT and NCIT. The results were impressive: GenOM consistently achieved strong performance across all tasks, often ranking among the top three systems. It notably outperformed other LLM-based systems like LLM4OM.
Ablation studies further confirmed the effectiveness of GenOM’s components. Generating concept definitions significantly improved both the LLM’s ability to judge equivalence and the accuracy of candidate retrieval. Furthermore, GenOM demonstrated a substantial improvement in performance, particularly in recall, compared to standalone exact matching systems. The research also highlighted that providing the LLM with a few examples (few-shot prompting) consistently improved its classification accuracy.
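The few-shot finding can be illustrated with a simple prompt builder: a handful of labelled concept pairs precede the query pair. The wording and example pairs below are illustrative assumptions, not the prompt used in the paper.

```python
def few_shot_prompt(source, target, examples):
    """Build an equivalence-judgement prompt with labelled examples
    (source label, target label, 'YES'/'NO') before the actual query."""
    lines = [f"Are '{a}' and '{b}' equivalent? {label}"
             for a, b, label in examples]
    lines.append(f"Are '{source}' and '{target}' equivalent?")
    return "\n".join(lines)
```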
Key Findings and Future Directions
GenOM stands out as a general-purpose framework that effectively integrates semantic enrichment, LLM-based reasoning, and traditional matching techniques. It shows strong ability to generalize across different datasets without needing extensive task-specific adjustments. However, challenges remain, such as consistently assessing the precise degree of equivalence, as the definition of ‘equivalent’ can subtly vary across different tasks and ontologies. The sensitivity of LLMs to prompt phrasing also underscores the importance of careful prompt design.
Future work for GenOM includes expanding its capabilities to identify other types of relationships beyond just equivalence, such as subsumption (where one concept is a more general or specific variant of another). Researchers also aim to develop task-adaptive alignment criteria, allowing the system to dynamically adjust its understanding of equivalence based on context or domain-specific nuances.
For more in-depth information, you can read the full research paper available here.


