GEDAN: A New Framework for Understanding Graph Similarity with Learned Costs

TLDR: GEDAN is a novel Graph Neural Network framework that approximates Graph Edit Distance (GED), a measure of graph dissimilarity. It uniquely learns context-aware edit costs for graph transformations, addressing the unrealistic “unit-cost” assumption of previous methods. GEDAN offers both unsupervised (U-GEDAN), which doesn’t require ground-truth distances, and supervised (S-GEDAN) training. It achieves competitive performance with state-of-the-art models while significantly improving adaptability and interpretability, providing valuable insights into complex graph structures, particularly in molecular analysis.

Graphs are powerful tools for representing complex information, from social networks to molecular structures. A fundamental challenge in working with graphs is measuring how similar or dissimilar two graphs are. This is where Graph Edit Distance (GED) comes in. GED is defined as the minimum cost required to transform one graph into another through a series of operations like adding, deleting, or changing nodes and edges. It’s a widely accepted way to quantify graph dissimilarity.

However, calculating the exact Graph Edit Distance is incredibly difficult, especially for larger graphs, a problem known as NP-hard. This computational hurdle has led researchers to develop various approximation methods, including those that use neural networks. A common simplification in many existing neural network-based approaches is the assumption that all these graph transformation operations (edits) have a uniform, “unit cost.” This is often an unrealistic constraint in real-world applications, particularly in fields like chemistry or bioinformatics where the cost of changing an atom or a chemical bond can vary significantly based on its context.

A new research paper titled GEDAN: Learning the Edit Costs for Graph Edit Distance introduces a novel solution to these challenges. Authored by Francesco Leonardi, Markus Orsi, Jean-Louis Reymond, and Kaspar Riesen from the University of Bern, this work presents a new Graph Neural Network (GNN) framework designed to approximate GED.

Introducing GEDAN: A Flexible and Interpretable Approach

The GEDAN framework stands out because it can learn the edit costs directly, making the Graph Edit Distance more consistent with the actual properties of the graphs. This is a significant improvement over methods that use fixed, uniform costs. The learned cost function can even offer new insights into complex graph structures, which is particularly valuable in areas like molecular analysis and discovering structural patterns.

One of GEDAN’s core innovations is its ability to operate in both supervised and unsupervised settings. In the unsupervised mode, called U-GEDAN, the model can optimize its internal representations without needing “ground-truth” distances – meaning it doesn’t require pre-calculated exact GED values for training. This is a major breakthrough, as obtaining these exact distances for large datasets is often computationally prohibitive and a bottleneck for many existing methods. U-GEDAN achieves this by leveraging the inherent property that the Graph Edit Distance is zero only when two graphs are identical, using this as an intrinsic learning signal.

For situations where ground-truth distances are available, the supervised version, S-GEDAN, comes into play. S-GEDAN is designed to learn the specific costs associated with node and edge transformations. It integrates a Generalized Additive Model (GAM), which allows for flexible and interpretable learning of these context-aware edit costs. This means the model can adaptively weigh the importance of different structural scales and dynamically adjust costs based on the learned representations of nodes.

How GEDAN Works Under the Hood

GEDAN reformulates the GED problem into a Quadratic Assignment Problem. It uses a message-passing mechanism from GNNs to introduce penalties for dissimilar node representations, considering not just individual nodes but also their surrounding structures. A key component is the use of a pre-trained Gumbel-Sinkhorn network, which helps in approximating the optimal assignment between graph substructures. Unlike previous methods, GEDAN learns edit costs directly through gradient descent, making the GED calculation more sensitive to actual graph properties.

The S-GEDAN model, which learns these costs, uses a multi-task learning strategy involving two types of loss functions. One is a contrastive loss that helps the model rank distances, ensuring that similar graphs have lower GEDs and dissimilar ones have higher GEDs. The second loss function supervises a downstream task, such as predicting a molecule’s property (regression) or classifying it (classification).

Experimental Results and Real-World Impact

The researchers tested GEDAN on various molecular datasets, including AIDS, MUTAG, and PTC MR, comparing its performance against several state-of-the-art GED approximation models. The results showed that S-GEDAN performs competitively, often achieving similar or better accuracy. While U-GEDAN generally performed slightly less accurately than supervised methods, its ability to learn without ground-truth distances is a significant advantage, making it highly practical for large-scale applications where exact GEDs are hard to compute.

A particularly insightful aspect of the research is the analysis of the learned edit costs. For instance, in molecular analysis, GEDAN can highlight specific substructures that contribute most significantly to a target property, such as hydration free energy. The model correctly identified that substituting an iodine atom with a bromine atom in a molecule leads to a small but expected change in energy. It also demonstrated an understanding of chemical symmetry and consistently identified functional groups known to influence properties like water solubility, providing more interpretable results than some other methods.

Also Read:

Looking Ahead

While GEDAN represents a substantial step forward, the authors acknowledge certain limitations. The model currently incurs higher computational complexity due to its matrix operations and padding requirements, which can limit the size of graphs it can efficiently process (currently up to 64 nodes, though theoretically capable of thousands). Additionally, the current model uses a single edge type, which might restrict its ability to capture the full complexity of molecular structures that often involve multiple bond types. Future work aims to address these scalability and expressiveness limitations.

Overall, GEDAN offers a promising new direction for graph similarity computation, providing a more adaptable, interpretable, and practical approach to Graph Edit Distance, especially in data-scarce or computationally challenging scenarios.

Financial Sector Fortifies Against Surging AI-Powered Scams

Deloitte’s 2025 Outlook: Navigating Escalating AI Challenges in Human Capital

Salesforce Study Reveals Data Quality is Pivotal for Employee Trust in AI Adoption

Top Executives Sidestep Company AI Guidelines, Fueling Shadow AI Risks

Intel’s Evolving IP Strategy: A Calculated Shift Towards Core AI Innovation

Generative AI Prompts Increased Workforce Surveillance in Indian IT Sector

Financial Sector Fortifies Against Surging AI-Powered Scams

Deloitte’s 2025 Outlook: Navigating Escalating AI Challenges in Human Capital

Salesforce Study Reveals Data Quality is Pivotal for Employee Trust in AI Adoption

Top Executives Sidestep Company AI Guidelines, Fueling Shadow AI Risks

Intel’s Evolving IP Strategy: A Calculated Shift Towards Core AI Innovation

Generative AI Prompts Increased Workforce Surveillance in Indian IT Sector

GEDAN: A New Framework for Understanding Graph Similarity with Learned Costs

Introducing GEDAN: A Flexible and Interpretable Approach

How GEDAN Works Under the Hood

Experimental Results and Real-World Impact

Looking Ahead

Gen AI News and Updates

HKU Spearheads AI Integration in Hong Kong’s Digital Education Future

UNESCO’s 43rd General Conference Concludes with New Leadership and Landmark Ethics Frameworks for Technology

BRYGE AI Secures Silver Stevie® Award for Groundbreaking Health Tech Product for Women

Boosting Business Efficiency: A New AI and Big Data Model for Process Optimization

AlphaCast: A New Approach to Time Series Prediction Through Human-AI Collaboration

New Graph Neural Networks Improve Reasoning in Assumption-Based Argumentation

Enhancing AI Reasoning: How Recursive Refinement and Multi-Agent Systems Improve Language Model Performance

ARGUS: A Proactive Framework for Enhancing Autonomous Driving Safety

Generative AI Powers Next-Gen Autonomous Emergency Response

OR-R1: Advancing Automated Optimization with Smart, Data-Efficient AI

Enhancing GUI Agents with Memory: A New Framework for History-Aware Reasoning

ProBench: A Deeper Look into How We Evaluate AI Agents for Mobile Apps

Enhancing Large Language Model Reasoning with Concise Outputs

Ensuring Trust in Autonomous AI: A Two-Layered Monitoring Approach for Agentic Systems

MedFuse: A Multiplicative Approach to Understanding Irregular Clinical Time Series Data

HyperD: A New Framework for More Accurate and Robust Traffic Predictions

Beyond Training: Researchers Propose ‘Model Raising’ for AI with Intrinsic Values

Bridging the Divide: Why AI Needs a Qualitative Revolution

Language Models Enhance Safety Certificate Synthesis for Dynamic Systems

Subscribe to get the latest news and updates