
Uncovering Hidden Biases: How AI Models Struggle with Intersectional Identities

TL;DR: A new study introduces WinoIdentity, a benchmark for evaluating intersectional bias in Large Language Models (LLMs) by measuring ‘Coreference Confidence Disparity.’ It finds that LLMs exhibit significant uncertainty and bias (disparities as high as 40%) towards doubly-disadvantaged identities, especially in anti-stereotypical contexts. The research suggests LLMs rely more on memorization than genuine reasoning, raising both fairness and validity concerns with real-world discriminatory implications.

Large Language Models, or LLMs, have become incredibly powerful tools, helping us with everything from writing emails to making important decisions in fields like hiring and university admissions. However, there’s a growing concern that these AI systems can unintentionally carry and even amplify existing societal biases, leading to unfair outcomes for certain groups of people.

Previous research has done a good job of looking at bias along single lines, like just gender or just race. But this new research takes a crucial step forward by examining what’s called ‘intersectional bias.’ This means looking at how different forms of discrimination, such as gender, race, and socio-economic status, can overlap and create unique patterns of disadvantage. For example, the experience of a Black woman is different from that of a White woman or a Black man, because multiple aspects of her identity intersect.

To study this complex issue, researchers developed a new benchmark dataset called WinoIdentity. They built upon an existing dataset, WinoBias, and expanded it significantly. WinoIdentity now includes 25 different demographic markers across 10 attributes, such as age, nationality, and race, all combined with binary gender. This massive dataset, comprising 245,700 prompts, allows for the evaluation of 50 distinct bias patterns in LLMs.
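To get a feel for how such a benchmark can be assembled, here is a minimal sketch of template-based prompt generation in the spirit of WinoIdentity. The template, marker lists, and function names below are illustrative assumptions, not the authors’ actual data or pipeline:

```python
# Minimal sketch: expanding a WinoBias-style coreference template with
# demographic markers. All templates and marker lists here are
# illustrative placeholders, not the paper's actual data.
from itertools import product

# A coreference template with slots for a demographic marker and a
# gendered pronoun, in the style of WinoBias sentences.
TEMPLATE = ("The {marker} developer argued with the designer "
            "because {pronoun} did not like the design.")

# Hypothetical markers grouped by attribute (the real benchmark spans
# 10 attributes and 25 markers, yielding 245,700 prompts).
MARKERS = {
    "age": ["young", "elderly"],
    "nationality": ["American", "Nigerian"],
    "race": ["Black", "White"],
}

PRONOUNS = ["he", "she"]  # binary gender, as in WinoBias

def generate_prompts():
    """Yield (attribute, marker, pronoun, prompt) for every combination."""
    for attribute, markers in MARKERS.items():
        for marker, pronoun in product(markers, PRONOUNS):
            yield attribute, marker, pronoun, TEMPLATE.format(
                marker=marker, pronoun=pronoun
            )

if __name__ == "__main__":
    for attribute, marker, pronoun, prompt in generate_prompts():
        print(f"[{attribute}/{marker}/{pronoun}] {prompt}")
```

Crossing every marker with both pronouns is what produces the 50 distinct marker-gender bias patterns the study evaluates.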

Instead of just checking whether a model makes an error, this study introduces a new way to measure unfairness: ‘Coreference Confidence Disparity.’ This metric assesses how confident an LLM is when resolving pronouns (like ‘he’ or ‘she’) in sentences, especially when those pronouns refer to people with different intersectional identities. The idea is that if a model is consistently less confident about certain identities, it inflicts a form of ‘harm of omission’: the model is not necessarily wrong, but it is measurably less reliable for those groups.
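To make the metric concrete, here is a minimal, model-agnostic sketch of how such a disparity could be computed, assuming access to a scoring function that returns the model’s probability for the correct antecedent of the pronoun. The function names and toy numbers are illustrative, not the paper’s implementation:

```python
# Minimal sketch of a confidence-disparity calculation. The scoring
# function is passed in as a callable so the sketch stays model-agnostic;
# names and numbers are illustrative, not the paper's implementation.
from statistics import mean
from typing import Callable, Iterable

def confidence_disparity(
    score: Callable[[str], float],     # prompt -> P(correct antecedent)
    baseline_prompts: Iterable[str],   # sentences without an identity marker
    marked_prompts: Iterable[str],     # same sentences with a marker added
) -> float:
    """Mean confidence on unmarked prompts minus mean confidence on
    marked prompts. A large positive value means the model becomes
    systematically less sure once the identity marker appears."""
    return mean(map(score, baseline_prompts)) - mean(map(score, marked_prompts))

# Toy usage with made-up confidences (no real model involved):
if __name__ == "__main__":
    score = lambda p: 0.55 if "transgender" in p else 0.92
    base = ["The developer argued with the designer because she disliked the design."]
    marked = ["The transgender developer argued with the designer because she disliked the design."]
    print(f"disparity = {confidence_disparity(score, base, marked):.2f}")  # 0.37
```

In practice the probability would be read off the model’s logits over candidate antecedents; the subtraction is what turns raw confidence into a group-level fairness signal.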

The findings from evaluating five recently published LLMs were striking. The study revealed confidence disparities as high as 40% across demographic attributes including body type, sexual orientation, and socio-economic status. Worryingly, the models were most uncertain when dealing with ‘doubly-disadvantaged’ identities in anti-stereotypical situations, for example, when resolving a pronoun that links a transgender woman to a historically male-dominated occupation.

Interestingly, the research also found that even for ‘privileged’ or ‘hegemonic’ markers (like ‘White’ or ‘cisgender’), coreference confidence decreased when these markers were added. If the models were genuinely reasoning, adding an attribute that is irrelevant to the coreference decision should not change their confidence at all; the fact that any added marker degrades it points toward pattern matching against familiar, unmarked training text. This suggests that the impressive performance of LLMs may rely more on memorizing patterns from their training data than on truly understanding and reasoning about language, which points to two separate but compounding issues: a failure in ‘value alignment’ (the model being unfair) and a failure in ‘validity’ (the model not genuinely reasoning).

The implications of these findings are significant. Such systematic errors could lead to real-world discrimination, for instance, in hiring processes where AI systems might unfairly down-rank applications that mention phrases like ‘Black Feminist Scholars’ or ‘Neurodivergent in AI Affinity Group.’ While current bias mitigation strategies often involve adding more diverse examples to training data, the authors suggest this might be a temporary fix that relies on memorization rather than addressing the fundamental lack of reasoning ability.

This research, detailed further in the paper Investigating Intersectional Bias in Large Language Models using Confidence Disparities in Coreference Resolution, highlights the ongoing challenge of building truly fair and robust AI systems. It emphasizes that fairness is a complex concept that goes beyond simple mathematical metrics and requires careful consideration of how AI interacts with diverse human identities.

Rhea Bhattacharya
https://blogs.edgentiq.com
Rhea Bhattacharya is an AI correspondent with a keen eye for cultural, social, and ethical trends in Generative AI. With a background in sociology and digital ethics, she delivers high-context stories exploring the intersection of AI with everyday life, governance, and global equity. Her coverage is analytical, human-centric, and always ahead of the curve. You can reach her at: [email protected]
