Quantization and Fairness: A Deep Dive into Disparate Impacts and Solutions

TLDR: Post-Training Quantization (PTQ), a common method for compressing neural networks, can unintentionally worsen fairness issues, especially for minority groups. This paper explains the underlying reasons, tracing the impact from changes in model weights and activations to altered logits, softmax probabilities, and a degraded optimization state. To counter these effects, the authors propose a combined approach using mixed-precision Quantization Aware Training (QAT) with dataset sampling and weighted loss functions, demonstrating improved fairness without significant accuracy loss.

The demand for faster and lighter models, especially for devices at the ‘edge’ of networks, has led to the widespread adoption of compression techniques such as quantization. One such method, Post-Training Quantization (PTQ), is valued for its ability to significantly reduce model size and speed up computation with minimal impact on overall accuracy. However, recent research has revealed a critical and often overlooked side effect: PTQ can exacerbate disparate impacts, particularly for minority groups.

A new paper, titled "Explaining How Quantization Disparately Skews a Model," by Abhimanyu Bellam and Jung-Eun Kim from North Carolina State University, examines the mechanisms behind this fairness degradation. Their work explains how quantization sets off a chain of factors that leads to unequal impacts across different groups during both the forward and backward passes of a neural network.

The Root of the Problem: How Quantization Skews Models

The researchers observed that as the precision of a model is reduced through quantization (e.g., from 32-bit to 2-bit integers), the disparity in accuracy between groups becomes increasingly pronounced. For instance, on datasets like UTKFace, minority groups showed extreme drops in accuracy when models were quantized to lower precisions.

The study identifies several cascaded factors in the forward pass that contribute to this disparity:

  • Changes in Weights: The fundamental alteration occurs in the network’s weights. Quantization not only changes the numerical values of these weights but also induces sparsity, effectively setting many weights to zero. This is akin to pruning and leads to a loss of information, with lower precision causing a greater absolute difference from the original weights and increased sparsity.

  • Impact on Logits and Probabilities: These weight changes ripple through the network, significantly affecting the ‘logits’ – the raw output values from the network before they are converted into probabilities. The numerical values of logits shift, potentially causing incorrect classifications. More critically, the variance among logits decreases, making it harder for the model to distinguish between different classes, especially for minority groups. This reduced variance then carries over to the ‘softmax probabilities,’ which represent the model’s confidence in its predictions. For minority groups, these probabilities tend to shift closer to the decision boundary, indicating lower confidence and increased uncertainty.

  • Increased Loss and Compromised Accuracy: The combined effect of these changes is a higher loss and significantly reduced accuracy for minority groups, directly reflecting the exacerbated disparity.
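
The first two factors can be illustrated with a small sketch. It is not the authors' code: it assumes a simple uniform symmetric quantizer (the helper `quantize_symmetric` is hypothetical), random Gaussian weights, and hand-picked logits. It shows that lower bit-widths move weights further from their original values and zero out more of them, and that a smaller spread among logits pulls softmax probabilities toward uniform.

```python
import math
import random

def quantize_symmetric(weights, bits):
    """Uniform symmetric quantization: snap each weight to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8-bit, 1 for 2-bit
    scale = max(abs(w) for w in weights) / qmax     # grid step size
    return [scale * max(-qmax, min(qmax, round(w / scale))) for w in weights]

def softmax(logits):
    m = max(logits)                                 # subtract max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # assumed toy weights

stats = {}
for bits in (8, 4, 2):
    q = quantize_symmetric(weights, bits)
    mean_abs_err = sum(abs(a - b) for a, b in zip(weights, q)) / len(weights)
    sparsity = sum(1 for v in q if v == 0.0) / len(q)
    stats[bits] = (mean_abs_err, sparsity)
    print(f"{bits}-bit: mean |w - q(w)| = {mean_abs_err:.4f}, zeros = {sparsity:.1%}")

# Hand-picked logits (an assumption): halving their spread lowers confidence.
logits = [2.0, 0.5, -1.0]
shrunk = [0.5 * z for z in logits]
print("top probability, original logits:", round(max(softmax(logits)), 3))
print("top probability, shrunk logits:  ", round(max(softmax(shrunk)), 3))
```

The quantizer here uses a single scale for the whole tensor; per-channel scales or asymmetric grids would change the exact numbers but not the direction of the trend.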

The Optimization Landscape: A Deeper Look at Unfairness

Beyond the forward pass, the paper also examines how quantization degrades the model’s state from an optimization perspective. Using gradient norms and eigenvalues of the Hessian matrix, the researchers provide insights into why quantized models struggle with fairness:

  • Gradient Norms: For minority classes, quantized models exhibit larger gradient norms. In simpler terms, this means the model is further away from an optimal solution for these groups, implying a greater need for updates to improve predictions. There’s an inverse relationship observed between gradient norm and group size, meaning smaller groups have larger gradient norms.

  • Hessian Eigenvalues: The largest eigenvalues of the Hessian matrix are also higher for minority groups. This indicates a steeper, more sharply curved loss surface for these groups: updates could reduce their loss substantially, but the model currently sits at a less stable and less optimal point for them.
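
Both diagnostics can be computed per group, as the following minimal sketch shows. It is an assumption-laden toy, not the paper's setup: a two-feature logistic model with hypothetical fixed weights `w`, synthetic data deliberately constructed so that the minority group is poorly fit, and power iteration as one common way to estimate the largest Hessian eigenvalue.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_and_hessian(w, samples):
    """Mean log-loss gradient and Hessian of a logistic model over `samples`."""
    d, n = len(w), len(samples)
    grad = [0.0] * d
    hess = [[0.0] * d for _ in range(d)]
    for x, y in samples:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for i in range(d):
            grad[i] += (p - y) * x[i] / n
            for j in range(d):
                hess[i][j] += p * (1.0 - p) * x[i] * x[j] / n
    return grad, hess

def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))

def top_eigenvalue(matrix, iters=200):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        mv = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        lam = l2_norm(mv)
        if lam == 0.0:
            return 0.0
        v = [c / lam for c in mv]
    return lam

random.seed(0)
w = [3.0, 0.0]  # hypothetical weights that fit the majority rule (label = x0 > 0)

majority = []
for _ in range(900):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    majority.append((x, 1.0 if x[0] > 0 else 0.0))

minority = []  # different feature spread and labeling rule, so w fits it poorly
for _ in range(100):
    x = [random.gauss(0, 0.3), random.gauss(0, 1.5)]
    minority.append((x, 1.0 if x[1] > 0 else 0.0))

results = {}
for name, group in (("majority", majority), ("minority", minority)):
    grad, hess = grad_and_hessian(w, group)
    results[name] = (l2_norm(grad), top_eigenvalue(hess))
    print(f"{name}: gradient norm = {results[name][0]:.3f}, "
          f"largest Hessian eigenvalue = {results[name][1]:.3f}")
```

For a real network the Hessian is too large to form explicitly; the same power iteration is then usually run on Hessian-vector products obtained through automatic differentiation.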

Towards Fair Quantization: Proposed Mitigation Strategies

To combat these adverse effects, Bellam and Kim propose a multi-pronged mitigation approach:

  • Fairer Base Model: Before quantization, the base model can be made fairer using dataset sampling methods (undersampling majority classes, oversampling minority classes) to address data imbalance. Additionally, a weighted cross-entropy loss function can be employed, assigning higher weights to ‘harder’ classes (often minority groups) to ensure the model doesn’t solely focus on easier samples.

  • Mixed-Precision Quantization Aware Training (QAT): Unlike PTQ, QAT involves retraining the model with quantized weights, allowing the network to adapt. The researchers specifically advocate for mixed-precision QAT, where critical layers (like the first and last) use higher precision (e.g., 8-bit) while others use lower, minimizing information loss where it matters most.

  • FairQAT: The most effective solution combines all these elements: dataset sampling, weighted loss functions, and mixed-precision QAT. This integrated approach significantly reduces the disparate impact of quantization, achieving both higher overall accuracy and lower fairness violation, offering a balanced trade-off for practical deployment.
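
The ingredients of this recipe can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: class weights are set by inverse frequency (the paper weights ‘harder’ classes, which may be chosen differently), the 4-bit setting for inner layers is a placeholder, and all function and layer names are hypothetical.

```python
import math
import random
from collections import Counter

def inverse_frequency_weights(labels):
    """Class weights proportional to 1 / count, averaging 1 over the samples."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class); `probs` holds one distribution per sample."""
    total = sum(class_weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(class_weights[y] for y in labels)

def oversample(samples, labels, rng):
    """Resample smaller classes with replacement until all match the largest."""
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(v) for v in by_class.values())
    out = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out.extend((s, y) for s in items + extra)
    rng.shuffle(out)
    return out

def mixed_precision_plan(layer_names, edge_bits=8, inner_bits=4):
    """Give the first and last layers higher precision, the rest lower."""
    last = len(layer_names) - 1
    return {name: edge_bits if i in (0, last) else inner_bits
            for i, name in enumerate(layer_names)}

labels = [0] * 90 + [1] * 10                       # imbalanced toy labels
weights = inverse_frequency_weights(labels)
balanced = oversample(list(range(100)), labels, random.Random(0))
plan = mixed_precision_plan(["conv1", "conv2", "conv3", "fc"])

# Toy predictions: confident on the majority class, weak on the minority class.
probs = [[0.9, 0.1]] * 90 + [[0.6, 0.4]] * 10
plain = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
weighted = weighted_cross_entropy(probs, labels, weights)

print("class weights:", weights)
print("class counts after oversampling:", Counter(y for _, y in balanced))
print("bit-width plan:", plan)
print(f"plain loss = {plain:.3f}, weighted loss = {weighted:.3f}")
```

With these toy predictions the weighted loss is larger than the plain one, because errors on the minority class now count for more; that extra pressure is what pushes training toward the underserved group. The bit-width plan would then be handed to whatever QAT framework is in use.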

This research sheds crucial light on the hidden fairness challenges posed by model compression techniques. By understanding the ‘how’ and ‘why’ of quantization’s disparate impact, and by implementing the proposed FairQAT strategies, developers can move towards deploying more equitable and high-performing AI models on edge devices and beyond.

Meera Iyer
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist at a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She is particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media.
