TLDR: This research introduces a novel approach to improve the efficiency of Graph Neural Networks (GNNs) by using Shapley values for graph sparsification. Unlike traditional methods that often only consider positive importance, Shapley values assign both positive and negative contributions to edges, allowing for more effective pruning of less important or even detrimental connections. This method significantly reduces computational complexity while maintaining or even improving GNN prediction accuracy, offering a more interpretable and efficient inference process.
Graph Neural Networks, or GNNs, have become incredibly powerful tools for analyzing data structured as graphs, such as social networks, recommendation systems, and even molecular structures. They excel at uncovering hidden relationships within data, leading to improved predictive models. However, as real-world graphs grow larger and more complex, the computational demands and memory requirements for GNNs can become a significant challenge, especially on devices with limited resources.
To address this, researchers have developed graph sparsification techniques. These methods aim to reduce the size of a graph by removing edges that have minimal impact on the GNN’s predictions. Think of it like decluttering a messy room – you want to remove unnecessary items without affecting the room’s core function. Many existing sparsification methods are inspired by the “Lottery Ticket Hypothesis,” which suggests that a smaller sub-network within a GNN can perform as well as the full model. While effective, these often require retraining the model for different levels of sparsification, which can be time-consuming.
Another class of methods uses GNN explanation techniques to identify and remove uninformative parts of the graph. These explainers assign importance scores to edges, indicating their contribution to a node’s prediction. However, a common limitation is that many of these methods only produce non-negative scores. This means an edge that actually harms a model’s prediction might still be considered “important” because its removal causes a large change in the prediction, even if that change is an improvement. This can lead to suboptimal pruning strategies.
This new research, detailed in the paper Shapley-Value-Based Graph Sparsification for GNN Inference, proposes a more sophisticated approach using Shapley values. Shapley values, rooted in game theory, offer a theoretically sound way to fairly distribute credit (or blame) among “players” in a collaborative game. In the context of GNNs, edges are the players, and their “contribution” to a node’s prediction is evaluated. Crucially, Shapley values can assign both positive and negative contributions. This means they can identify edges that genuinely help the prediction (positive contribution) as well as those that hinder it (negative contribution).
By leveraging this unique ability, the researchers developed a sparsification strategy that not only preserves influential edges but also intelligently removes misleading or even adversarial connections. Their method, primarily using GNNShap, a scalable Shapley-value-based explainer, works by first computing local importance scores for each edge for various node predictions. These local scores are then aggregated into a single global score for each edge. Finally, edges are pruned based on these aggregated scores until a desired level of sparsity is achieved.
The experimental results are compelling. Across multiple datasets and GNN models (GCN and GAT), the Shapley-value-based sparsification consistently maintained higher test accuracy even at very high levels of edge removal. For instance, on the Cora dataset, 80% of edges could be pruned with less than a 2% drop in accuracy. This significant reduction in graph complexity translates directly into substantial computational savings during GNN inference, measured by Multiply-Accumulate operations (MACs). In some cases, MACs were reduced by over 60%.
Compared to existing graph lottery ticket methods, the Shapley-value approach demonstrated superior sparsification ratios. A key advantage is that it eliminates the need to retrain the model for every desired sparsity level, and it’s not limited by the amount of labeled training data, as it learns edge importance for predicted classes. While the method cannot sparsify model weights, the authors suggest that starting with a smaller model and gradually increasing its size is often more practical than repeatedly training a large model to find optimal sparsity.
Also Read:
- Securing Graph Data: A New Framework for Privacy-Preserving Structure Learning
- A New Approach to Updating Knowledge Graphs: GraphDPO for Smarter Unlearning
An ablation study further underscored the importance of considering both positive and negative attribution scores. When only non-negative scores were used for pruning, the effectiveness of the sparsification significantly decreased, confirming that negatively attributed edges, if not removed, introduce noise and reduce pruning capability. In conclusion, this research highlights Shapley values as a powerful tool for making GNN inference more efficient, scalable, and interpretable by enabling more precise and effective graph sparsification.


