spot_img
HomeResearch & DevelopmentUncovering Hidden Biases: How LLMs Infer Demographics from Neutral...

Uncovering Hidden Biases: How LLMs Infer Demographics from Neutral Questions

TLDR: A new research paper introduces DAIQ, a framework to audit how Large Language Models (LLMs) infer user demographic attributes like gender and race from questions that contain no explicit demographic information. The study found that many LLMs, both open and closed source, frequently make these inferences based on subtle linguistic cues and often default to ‘Male’ and ‘White’ attributions, reinforcing societal stereotypes. The researchers also developed a prompt-based guardrail that significantly reduces these unwarranted inferences, highlighting the need for better auditing and mitigation strategies in AI.

Large Language Models (LLMs) are powerful tools, but they often reflect societal biases when demographic information like gender or race is explicitly provided in the input. A recent research paper, DAIQ: Auditing Demographic Attribute Inference from Question in LLMs, sheds light on a more subtle and concerning issue: LLMs inferring user identities even when questions lack any explicit demographic cues.

This overlooked behavior poses significant risks. It violates expectations of neutrality, infers unintended demographic information, and can encode stereotypes that undermine fairness in critical sectors such as healthcare, finance, and education. The researchers, Srikant Panda, Hitesh Laxmichand Patel, Shahad Al-Khalifa, Amit Agarwal, Hend Al-Khalifa, and Sharefah Al-Ghamdi, introduce a new task and framework called Demographic Attribute Inference from Questions (DAIQ) to systematically audit this failure mode.

Understanding DAIQ

The DAIQ framework is designed to investigate how LLMs infer user demographic attributes, specifically gender (Male or Female) and race (Black or White), solely from the phrasing and topic of a question, without any explicit demographic signals. Unlike humans, who would typically abstain from judgment in the absence of clear cues, LLMs tend to generate statistically probable outputs based on their training data. For example, a question about solo travel might lead an LLM to infer a female author due to perceived gendered safety concerns, or a white author due to assumptions about societal freedom to travel alone.

How the Study Was Conducted

The researchers used 212 ‘Neutral Queries’ from the Accesseval benchmark, covering six real-world domains: Education, Finance, Healthcare, Hospitality, Media, and Technology. They evaluated 19 instruction-tuned LLMs, including both open-source and closed-source models. Each model was prompted in a two-stage process: first, to use chain-of-thought reasoning to identify any demographic clues, and then to infer a likely gender and race with justifications.

A key metric introduced was the ‘Response Rate,’ which quantifies how frequently a model assigns a specific gender or race rather than abstaining or responding with ‘Not known’ or ‘Neutral.’ A lower response rate indicates greater caution and less risk of unintended bias.

Key Findings on LLM Behavior

  • Prevalence of Inference: Both open and closed-source LLMs were found to assign demographic labels based purely on question phrasing.
  • Model Differences: Closed-source models like Claude Haiku and Cohere Command generally showed lower response rates (more cautious), while OpenAI models (GPT-4.1, GPT-4o) had high response rates. Open-source models varied widely.
  • Gender vs. Race: Most models had a higher response rate for gender inference than for race, suggesting slightly more caution regarding racial attribution.
  • Default Attributions: A significant bias was observed, with most models predominantly attributing determined gender responses to ‘Male’ and race attributions overwhelmingly skewed towards ‘White.’ This suggests ‘Male’ and ‘White’ often become the default in the absence of explicit cues.
  • Stereotype Reinforcement: Qualitative analysis revealed that models often relied on stereotypes. For instance, ‘Male’ attributions clustered in high-status domains like finance and technology, while ‘Female’ attributions appeared in hospitality, education, and healthcare, often justified by traits like empathy or caregiving.
  • Caution vs. Confidence: Some models, like Phi-4-mini-instruct, exhibited very low response rates (3%), demonstrating strong guardrails. In contrast, GPT-4.1 predicted gender and race in 100% of prompts, even when admitting ‘no explicit clues,’ highlighting a tension between ethical safeguards and training data influence.
  • Response Length Variation: The study also found that some models generated responses with statistically significant differences in output length depending on the inferred demographic attribute, indicating internal representational biases.

Mitigation Strategy

To address these biases, the researchers developed a prompt-based guardrail. This intervention is designed to prevent the model from making demographic assumptions when such information is not explicitly provided. The guardrail significantly reduced unwarranted identity attribution across many models, promoting fairness and privacy by encouraging models to abstain from inference in ambiguous cases.

Also Read:

Conclusion

The DAIQ framework reveals a systemic and underacknowledged risk: LLMs can fabricate demographic identities, reinforce societal stereotypes, and propagate harms that erode privacy, fairness, and trust. The findings underscore the critical need for rigorous auditing and the development of robust mitigation strategies to ensure the equitable and responsible deployment of LLMs in real-world applications. Future work will expand to other protected attributes and investigate intersectional biases.

Rhea Bhattacharya
Rhea Bhattacharyahttps://blogs.edgentiq.com
Rhea Bhattacharya is an AI correspondent with a keen eye for cultural, social, and ethical trends in Generative AI. With a background in sociology and digital ethics, she delivers high-context stories that explore the intersection of AI with everyday lives, governance, and global equity. Her news coverage is analytical, human-centric, and always ahead of the curve. You can reach her out at: [email protected]

- Advertisement -

spot_img

Gen AI News and Updates

spot_img

- Advertisement -