AI's Cultural Blind Spot: How Language Models Misunderstand Ableism in India

TLDR: A study reveals that large language models (LLMs) struggle to accurately detect and interpret ableism across cultures, particularly in India. Western LLMs tend to overestimate ableist harm, while Indic LLMs underestimate it, often misinterpreting cultural nuances and overlooking intersectional biases. The research highlights that LLMs are less sensitive to ableism expressed in Hindi and fail to understand the differing perceptions of people with disabilities in India regarding microaggressions, pity, and the intersection of disability with gender, caste, and class. The findings call for a human-centered, culturally grounded approach to developing AI systems for harm detection.

A new research paper, “Disability Across Cultures: A Human-Centered Audit of Ableism in Western and Indic LLMs,” examines how large language models (LLMs) understand and address ableism across cultures. It highlights a significant gap in how these AI systems perceive harm against people with disabilities (PwD), especially in non-Western contexts like India.

People with disabilities globally, and particularly in India, face high levels of discrimination and hate online. While LLMs are increasingly used to combat online hate, most research has focused on Western audiences and Western AI models. This raises a crucial question: are these models truly equipped to recognize ableist harm in diverse cultural settings, and do localized models perform any better?

How the Study Was Conducted

To investigate these questions, researchers adopted and translated a publicly available dataset of ableist speech into Hindi, including both informal and formal registers. They then prompted eight different LLMs—four developed in the U.S. (GPT-4, Gemini, Claude, Llama) and four developed in India (Krutrim, Nanda, Gajendra, Airavata)—to score and explain the level of ableism and toxicity in these comments on a scale of 0 to 10. In parallel, 175 people with disabilities from both the U.S. and India performed the same task, providing a human-centered benchmark for comparison.

Human Perceptions of Ableism: A Cultural Divide

The study revealed stark differences in how PwD in the U.S. and India interpreted ableism. Indian PwD generally rated toxicity and ableism higher than their U.S. counterparts. While U.S. participants often distinguished clearly between general toxicity and ableism, Indian PwD tended to focus more on the emotional harm inflicted by comments. Interestingly, microaggressive comments such as “IT’S AMAZING HOW POSITIVE YOU ARE!” were often perceived as highly ableist and patronizing by U.S. PwD but were interpreted positively, as encouragement, by Indian PwD. This highlights differing cultural expectations around support and motivation.

AI Models’ Performance: Overestimation and Underestimation

The research found a significant misalignment between LLMs and human perceptions, especially those of Indian PwD. Western LLMs consistently overestimated ableist harm, often flagging comments as highly offensive that Indian PwD considered benign or even positive. For example, a comment about attending a charity for disability, seen as “inspiration porn” by Western LLMs, was viewed as “positive” and “motivating” by Indian PwD.

Conversely, Indic LLMs consistently underestimated ableist harm. They frequently failed to detect harmful stereotypes, misinterpreted ableist comments, or even dismissed invisible disabilities like depression and autism as not being “real” disabilities. This under-sensitivity means harmful content could remain unchecked on platforms.
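One way to make “overestimate” and “underestimate” concrete is the average signed gap between a model’s score and the human score for the same comment, both on the 0 to 10 scale. The sketch below is only an illustration: the function names, the tolerance threshold, and the ratings are invented for this example, and it is not necessarily the metric the paper’s authors used.

```python
from statistics import mean


def mean_signed_gap(model_scores, human_scores):
    """Average of (model - human) over paired comments, on the 0-10 scale.

    A positive value means the model rates harm higher than the human raters.
    """
    if len(model_scores) != len(human_scores):
        raise ValueError("scores must be paired, one per comment")
    return mean(m - h for m, h in zip(model_scores, human_scores))


def describe(gap, tolerance=0.5):
    """Label a gap; the tolerance is an arbitrary illustrative threshold."""
    if gap > tolerance:
        return "overestimates"
    if gap < -tolerance:
        return "underestimates"
    return "roughly aligned"


# Invented toy ratings for four comments (NOT data from the study).
human = [2, 3, 7, 8]
model_a = [6, 7, 8, 9]   # hypothetical model that scores harm high
model_b = [1, 1, 4, 5]   # hypothetical model that scores harm low

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    gap = mean_signed_gap(scores, human)
    print(name, gap, describe(gap))
```

A signed gap keeps the direction of the disagreement, which an absolute error would hide; that direction is exactly what separates the two failure modes described above.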

The study also explored the impact of demographic prompting, where models were explicitly told to consider the Indian context. While most Western LLMs showed little change, some Indic models, like Nanda, actually became less sensitive to ableism when the Indian context was introduced, contradicting how Indian PwD perceived such harm.

Ableism in Hindi: A Linguistic Blind Spot for AI

A crucial finding concerned the LLMs’ handling of Hindi. While Indian PwD rated harm consistently across English and Hindi, Western LLMs rated toxicity and ableism significantly lower in Hindi. This suggests that these models are more tolerant of ableist content when it’s expressed in Hindi, potentially leaving Hindi-speaking PwD more vulnerable to harm.

Furthermore, the nuances of Hindi formality registers (casual vs. formal language) posed a challenge. Indian PwD often interpreted casual Hindi as more intimate and caring, even for potentially intrusive questions. However, LLMs frequently interpreted casual Hindi as more harmful or disrespectful, revealing a deep disconnect from local social norms.

Beyond the West: Unique Cultural Nuances

The explanations provided by Indian PwD highlighted unique cultural attitudes that LLMs failed to capture. Indian PwD expressed a strong aversion to pity, often reframing deeply ableist remarks through a lens of strength and resilience. They also described intense social pressure to appear “normal” and were perplexed by stereotypes accusing disabled people of faking their conditions.

The study also revealed how ableism in India intersects with other systemic inequalities like gender, caste, and class. Comments about reproductive health were particularly harmful to women with PCOS, and remarks about veganism were layered with assumptions about religion and economic privilege. These intersectional biases were entirely missed by the LLMs.

The Path Forward: Culturally Grounded AI

The findings underscore a significant cultural misalignment in AI systems designed for content moderation. Western LLMs, often trained on U.S.-centric data, may over-censor legitimate disability advocacy in other cultures, while Indic LLMs’ under-sensitivity allows harmful content to persist. The paper argues against a universal standard for ableism recognition, asserting that harm must be assessed through the lens of local values and lived experiences. It calls for a shift towards culturally grounded harm detection, emphasizing the need for researchers to collaborate with diverse end-users to build truly inclusive AI systems.
