
Unmasking Hidden Biases in Network Link Predictions

TLDR: The paper ‘Breaking the Dyadic Barrier: Rethinking Fairness in Link Prediction Beyond Demographic Parity’ argues that traditional ‘dyadic’ fairness metrics like demographic parity in link prediction can hide biases within subgroups and are inadequate for ranking-based tasks. It proposes a new framework with ‘non-dyadic distribution-preserving fairness’ and ‘rank awareness,’ introducing the NDKL metric. The authors also present MORAL, a post-processing algorithm that uses decoupled link predictors and a greedy aggregation strategy to achieve fairer rankings, demonstrating its effectiveness across various datasets.

Link prediction, the task of identifying missing or future connections in networks, is a cornerstone of graph machine learning. It has wide-ranging applications, from suggesting friends on social media to completing knowledge graphs. However, ensuring fairness in these predictions is crucial, as biased algorithms can unintentionally worsen existing societal inequalities.

Historically, fairness in link prediction has largely relied on a ‘dyadic’ definition. This approach typically categorizes links into two broad groups: ‘intra-group’ (connections within the same demographic group) and ‘inter-group’ (connections between different demographic groups). Fairness was then assessed by trying to achieve ‘demographic parity’ between these two large categories, meaning the probability of a link being predicted should be similar for both intra-group and inter-group connections.
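To make this concrete, here is a minimal sketch of how a dyadic demographic-parity gap is typically computed. The function name and data layout are illustrative, not taken from the paper. Note how the toy usage already hints at the problem discussed below: male-male pairs score 0.9 and female-female pairs 0.1, yet the dyadic gap is exactly zero.

```python
import numpy as np

def dyadic_parity_gap(scores, groups, pairs):
    """Absolute gap in mean predicted score between intra-group and
    inter-group candidate pairs (the dyadic demographic-parity notion).

    scores: predicted link score for each candidate pair
    groups: sensitive attribute per node (e.g., 0 = male, 1 = female)
    pairs:  (u, v) node-index tuples, aligned with scores
    """
    scores = np.asarray(scores, dtype=float)
    intra = np.array([groups[u] == groups[v] for u, v in pairs])
    return abs(scores[intra].mean() - scores[~intra].mean())

# Toy usage: MM pairs score 0.9, FF pairs 0.1, MF pairs 0.5 --
# the dyadic gap is 0.0, so the subgroup disparity goes unseen.
groups = [0, 0, 1, 1]                      # node -> group
pairs = [(0, 1), (2, 3), (0, 2)]           # MM, FF, MF candidates
print(dyadic_parity_gap([0.9, 0.1, 0.5], groups, pairs))  # -> 0.0
```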

However, new research from João Mattos, Debolina Halder Lina, and Arlei Silva from Rice University challenges this long-standing assumption. Their paper, titled “Breaking the Dyadic Barrier: Rethinking Fairness in Link Prediction Beyond Demographic Parity,” argues that this dyadic framework, while seemingly straightforward, can actually obscure significant underlying disparities. It allows systemic biases to go unnoticed, particularly within the aggregated subgroups.

The Limitations of Traditional Fairness Metrics

The researchers highlight several key limitations of the dyadic fairness approach and demographic parity:

  • Hidden Biases Within Subgroups: The dyadic framework aggregates different types of intra-group connections. For example, if a network has two sensitive groups (say, men and women), intra-group links include both male-male and female-female connections. Demographic parity might show overall fairness between intra-group and inter-group links, but it could completely miss a scenario where male-male connections are systematically predicted more often than female-female connections. This creates a ‘glass ceiling’ effect where underrepresentation of specific subgroups remains undetected.
  • Insensitivity to Ranking: Link prediction in real-world applications often involves ranking potential connections. Users typically pay more attention to top-ranked suggestions. Demographic parity, however, is ‘ranking-insensitive.’ A model could produce a ranking heavily dominated by one type of connection at the top, yet still be deemed ‘fair’ by demographic parity if the overall proportions are met. This leads to ‘exposure bias,’ where certain groups receive disproportionately less visibility in prominent positions (see the toy example after this list).
  • Limited Expressive Power of Models: Many fair link prediction methods rely on Graph Neural Networks (GNNs) to create ‘fair’ representations of nodes. However, the paper points out that standard GNNs often have limited expressive power, struggling to distinguish subtle structural and demographic asymmetries between node pairs. This means that even if node representations are considered fair, it doesn’t automatically translate to fair link predictions.
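The following hypothetical toy ranking makes the ranking-insensitivity point concrete: group proportions over the whole list match a uniform target exactly, yet every top slot goes to a single subgroup.

```python
from collections import Counter

# Hypothetical ranked list of predicted edges (best first), labeled by
# the sensitive-attribute pairing of their endpoints.
ranking = ["MM"] * 4 + ["FF"] * 4 + ["MF"] * 4

# Overall proportions are perfectly balanced ...
print(Counter(ranking))       # Counter({'MM': 4, 'FF': 4, 'MF': 4})

# ... but the top-4 slots, where user attention concentrates,
# are occupied exclusively by MM pairs.
print(Counter(ranking[:4]))   # Counter({'MM': 4})
```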

A New Framework for Fairness

To address these critical limitations, the researchers propose a more expressive framework for fairness assessment in link prediction, built on two key properties:

  • Non-Dyadic Distribution-Preserving Fairness: This property suggests that a fairness metric should treat all sensitive attribute pairings (e.g., male-male, female-female, male-female) as distinct subgroups. The goal is to align the predicted distribution of these specific edge types with their observed distribution in the original graph, rather than aggregating them into broader intra/inter categories.
  • Rank Awareness: A fairness metric must be sensitive to both the proportion and the rank of every type of pair. It should ensure that no group systematically receives higher or lower ranks compared to others, especially in the crucial top positions where user attention is concentrated.

Inspired by the information retrieval literature, the paper adopts the Normalized Discounted Cumulative KL-Divergence (NDKL) as its primary fairness metric. NDKL penalizes deviations between the cumulative exposure of different group categories in the top-k ranked edges and their expected distribution, thereby capturing both exposure fairness and subgroup proportions.
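Below is a minimal sketch of such a metric, following the discounted cumulative formulation common in the fairness-aware ranking literature (Geyik et al., 2019); the paper's exact normalization and smoothing may differ. Applied to the toy rankings from earlier, the skewed list scores much worse than an interleaved one.

```python
import numpy as np

def ndkl(ranking, target):
    """Normalized Discounted cumulative KL-divergence of a ranking.

    ranking: group labels in rank order, e.g. ["MM", "FF", "MF", ...]
    target:  dict mapping each group label to its desired proportion
    """
    labels = list(target)
    desired = np.array([target[g] for g in labels], dtype=float)
    eps = 1e-12                      # smoothing keeps the KL finite
    counts = np.zeros(len(labels))
    total, z = 0.0, 0.0
    for k, g in enumerate(ranking, start=1):
        counts[labels.index(g)] += 1
        topk = counts / k            # group distribution in the top-k
        kl = np.sum(topk * np.log((topk + eps) / (desired + eps)))
        discount = 1.0 / np.log2(k + 1)
        total += discount * kl
        z += discount
    return total / z

target = {"MM": 1/3, "FF": 1/3, "MF": 1/3}
print(ndkl(["MM"] * 4 + ["FF"] * 4 + ["MF"] * 4, target))  # skewed: high
print(ndkl(["MM", "FF", "MF"] * 4, target))                # interleaved: low
```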

Introducing MORAL: A Post-Processing Solution

To mitigate bias under this new, more rigorous fairness framework, the researchers introduce MORAL (Multi-Output Ranking Aggregation for Link fairness). MORAL is a simple yet scalable post-processing algorithm that can be combined with any existing link prediction model.

The core idea behind MORAL is to decouple predictions: it trains three distinct link prediction models, one for each sensitive group interaction type (e.g., male-male, female-female, male-female). This prevents a single model from exhibiting imbalanced performance across groups. At inference time, MORAL intelligently aggregates predictions from these group-specific models into a unified ranking. It does this by greedily selecting the highest-scoring remaining edge from the model whose inclusion best reduces the cumulative KL divergence from a predefined target distribution. This ‘exposure-aware’ ranking ensures that the final output closely approximates the desired group-wise exposure proportions.
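The sketch below illustrates the aggregation idea under simplifying assumptions: pre-ranked per-group candidate lists, a uniform target distribution, and group selection purely by instantaneous KL reduction. All names are illustrative, and the paper's exact scoring, tie-breaking, and target choice may differ.

```python
import numpy as np

def greedy_fair_merge(group_scores, target, k):
    """Greedily merge per-group ranked candidate lists into one ranking.

    At each step, take the top remaining edge of the group whose
    inclusion minimizes the KL divergence between the merged list's
    group proportions and the target distribution.

    group_scores: dict group -> list of (edge, score), best first
    target:       dict group -> desired proportion
    k:            length of the output ranking
    """
    eps = 1e-12
    groups = list(group_scores)
    queues = {g: sorted(group_scores[g], key=lambda x: -x[1]) for g in groups}
    counts = {g: 0 for g in groups}
    desired = np.array([target[g] for g in groups], dtype=float)
    merged = []
    for step in range(1, k + 1):
        best_g, best_kl = None, None
        for g in groups:
            if not queues[g]:
                continue
            # Group proportions if we appended one edge from group g.
            trial = np.array([counts[h] + (h == g) for h in groups],
                             dtype=float) / step
            kl = np.sum(trial * np.log((trial + eps) / (desired + eps)))
            if best_kl is None or kl < best_kl:
                best_g, best_kl = g, kl
        if best_g is None:
            break                    # all candidate lists exhausted
        edge, score = queues[best_g].pop(0)
        counts[best_g] += 1
        merged.append((edge, score, best_g))
    return merged

# Hypothetical per-group candidates (edge id, score), best first.
cands = {
    "MM": [("e1", 0.95), ("e2", 0.90)],
    "FF": [("e3", 0.80), ("e4", 0.70)],
    "MF": [("e5", 0.60), ("e6", 0.50)],
}
for row in greedy_fair_merge(cands, {"MM": 1/3, "FF": 1/3, "MF": 1/3}, 6):
    print(row)   # interleaves MM, FF, MF to track the uniform target
```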

Experimental Validation

The researchers conducted extensive experiments on six real-world datasets, including social networks (Facebook, Pokec), credit networks (Credit, German), and professional networks (NBA). They compared MORAL against a diverse set of existing fair link prediction approaches, evaluating them using NDKL for fairness and precision@k for utility.

The results demonstrate that MORAL consistently achieves the fairest rankings (lowest NDKL scores) across all datasets, while still maintaining high link prediction performance. Crucially, the experiments empirically confirm that traditional demographic parity metrics often fail to detect significant hidden biases within subgroups and are insensitive to exposure bias in ranked lists.

This work represents a significant step forward in understanding and addressing fairness in link prediction. By moving beyond simplistic dyadic assumptions and incorporating rank awareness, it provides a more robust and realistic assessment of algorithmic bias, paving the way for more equitable AI systems in network-based applications. For more details, see the full research paper.

Karthik Mehta
