TLDR: FracAug is an innovative plug-in augmentation framework designed to improve Graph Neural Networks (GNNs) for graph-level anomaly detection (GAD) in situations with limited labeled data and imbalanced datasets. It features a Fractional Graph Generator (FGG) that creates semantically consistent graph variants using fractional powers of adjacency matrices, guided by a Weighted Distance-Aware Margin Loss (WDML) to handle data imbalance. A Mutual Verification Pseudo-Labeler (MVP) then reliably expands the training set by pseudo-labeling unlabeled data. FracAug is model-agnostic and has shown consistent, significant performance gains across various GNNs and real-world datasets.
Graph-level anomaly detection (GAD) is a crucial task in many fields, from identifying unusual patterns in drug discovery to detecting anomalies among proteins. However, the effectiveness of Graph Neural Networks (GNNs) in these applications is often hampered by two significant challenges: the high cost of labeling data, which leads to limited supervision, and the inherent imbalance of datasets in which anomalies are rare.
To tackle these issues, researchers have introduced FracAug, an innovative plug-in augmentation framework designed to enhance GNNs. FracAug works by generating semantically consistent graph variations and then using a clever pseudo-labeling technique with mutual verification to expand the training data.
Unlike many previous methods that rely on simple, heuristic modifications, FracAug learns the underlying semantics within given graphs. It then synthesizes ‘fractional variants’ of these graphs. This process is guided by a novel weighted distance-aware margin loss, which helps capture multi-scale topological information. This ensures that the generated graphs are diverse and preserve their original meaning, even when dealing with imbalanced datasets.
How FracAug Works: Three Key Components
FracAug is built upon three main components:
1. **Fractional Graph Generator (FGG):** This component is at the heart of creating new graph variants. It leverages the fractional power of adjacency matrices, which are mathematical representations of graph connections. By using fractional powers, FGG can introduce controlled structural variations while ensuring the new graphs maintain semantic consistency with their original counterparts. This is a more sophisticated approach than simple edge removals or additions, which can sometimes distort a graph’s meaning.
2. **Weighted Distance-Aware Margin Loss (WDML):** To address the problem of data imbalance, FracAug employs WDML. This loss function guides the FGG by assigning dynamic margins based on the intrinsic distance between a synthetic graph and its original. This ensures that the generated samples retain their original label with high confidence and helps the model learn to distinguish between normal and anomalous graphs more effectively, without being biased by the scarcity of anomaly examples.
3. **Mutual Verification Pseudo-Labeler (MVP):** Once the GNN produces predictions for both the original and the synthetic graphs, MVP steps in. It uses a mutual verification mechanism to pseudo-label unlabeled data. This means it checks for agreement between the predictions from the original graph and its fractional variant. By doing so, it minimizes pseudo-labeling errors and reliably expands the training set, mitigating the issue of limited supervision.
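The two most concrete ideas above, fractional powers of the adjacency matrix and agreement-based pseudo-labeling, can be sketched in a few lines of Python. This is an illustrative simplification, not the paper's exact construction: `fractional_adjacency` computes a real-valued fractional matrix power by clipping negative eigenvalues (an assumption made here so the result stays real), and `mutual_verification` with its `threshold` parameter is a hypothetical stand-in for MVP's agreement check.

```python
import numpy as np

def fractional_adjacency(A: np.ndarray, alpha: float) -> np.ndarray:
    """Fractional power A^alpha of a symmetric adjacency matrix via
    eigendecomposition: A = V diag(lam) V^T  =>  A^alpha = V diag(lam^alpha) V^T.
    Negative eigenvalues are clipped to zero so the result stays real --
    an illustrative simplification, not necessarily FracAug's exact recipe."""
    lam, V = np.linalg.eigh(A)
    lam = np.clip(lam, 0.0, None)
    return V @ np.diag(lam ** alpha) @ V.T

def mutual_verification(p_orig, p_frac, threshold=0.9):
    """Pseudo-label an unlabeled graph only when the class predictions for
    the original graph and its fractional variant agree and both are
    confident; otherwise abstain (return None)."""
    y_orig, y_frac = int(np.argmax(p_orig)), int(np.argmax(p_frac))
    conf = min(p_orig[y_orig], p_frac[y_frac])
    return y_orig if y_orig == y_frac and conf >= threshold else None

# A 3-node triangle graph: alpha = 0.5 yields a "soft" variant whose
# matrix square recovers the (spectrum-clipped) original adjacency.
A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
A_half = fractional_adjacency(A, 0.5)
```

Intermediate values of `alpha` interpolate between the identity-like and original connectivity structures, which is why the variants introduce controlled structural change rather than the abrupt edits of edge dropping.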
Remarkable Universality and Efficacy
FracAug operates as a model-agnostic module, meaning it can be seamlessly integrated with various GNN architectures without requiring any architectural modifications. Extensive experiments conducted across 14 different GNNs on 12 real-world datasets have demonstrated its remarkable universality and efficacy. FracAug consistently boosts average AUROC, AUPRC, and F1-score metrics by significant margins, outperforming existing graph augmentation approaches.
This framework represents a significant step forward in making GNNs more robust and effective for graph-level anomaly detection, especially in scenarios where labeled data is scarce and class distributions are highly imbalanced. For more in-depth information, you can refer to the full research paper: FracAug: Fractional Augmentation Boost Graph-Level Anomaly Detection under Limited Supervision.


