
Unpacking Bias in AI Healthcare: Lessons from Data Collection Practices

TLDR: A research paper by Anna Arias-Duart, Maria Eugenia Cardello, and Atia Cortés explores how biased data collection practices hinder the integration of AI in healthcare. Drawing from the AI4HealthyAging project, the study identifies historical, representation, and measurement biases related to sex, gender, age, habitat, socioeconomic status, equipment, and labeling. It provides practical recommendations, such as involving diverse teams, defining clear inclusion criteria, and evaluating data labeling, to improve fairness and robustness in clinical AI system design and data collection.

Artificial intelligence (AI) has the potential to change healthcare, from aiding diagnoses to informing clinical decisions. Yet despite rapid advances, adoption of AI solutions in real-world clinical settings remains limited. A significant hurdle is the quality and fairness of the data used to train these systems, which biased data collection practices often compromise.

A recent research paper, “Bias by Design? How Data Practices Shape Fairness in AI Healthcare Systems,” delves into these critical issues. Authored by Anna Arias-Duart, Maria Eugenia Cardello, and Atia Cortés from the Barcelona Supercomputing Center (BSC), the paper draws insights from the AI4HealthyAging project, a national R&D initiative in Spain focused on developing AI solutions for age-related diseases. The project’s core task was to identify biases during clinical data collection across various use cases, including cardiovascular conditions, Parkinson’s disease, and hearing loss.

Understanding Bias in AI Healthcare

The term ‘bias’ in AI lacks a single, universally agreed-upon definition, but generally refers to systematic and unfair favoritism or prejudice that can lead to discriminatory outcomes. In healthcare, this means AI systems might perpetuate or even amplify existing health inequities, leading to detrimental impacts on certain individuals or groups. The authors emphasize that understanding bias as a normative issue—where outcomes are undesirable or unjust—is crucial for achieving health equity.

Biases can emerge at various stages of an AI system’s lifecycle, from the initial problem formulation and data collection to model development, system implementation, and post-deployment monitoring. This paper specifically focuses on biases arising during the crucial initial stages of data design and data collection.

Biases Identified in Practice

The researchers categorized the biases they found into three main types: historical, representation, and measurement biases, illustrating each with concrete examples from the AI4HealthyAging project:

Historical Biases: These biases stem from societal norms and systemic inequalities reflected in the data.

  • Sex Bias: In the Parkinson’s study, there was a lower representation of females in older age groups, reflecting biological differences in disease prevalence and mortality. Neglecting these sex-based differences can lead to models that don’t accurately capture disease nuances.
  • Gender Bias: Even without direct gender data, inferred gender scores can reveal biases. For instance, studies show women often receive less effective pain relief and more mental health referrals compared to men, highlighting how gender norms influence treatment and can be reinforced by biased data.

Representation Biases: These occur when certain groups are underrepresented or overrepresented in the dataset, leading to models that perform poorly for marginalized populations.

  • Age Bias: In studies for age-related conditions, control groups tended to be younger, while disease groups were older. This imbalance can cause models to mistakenly associate age-related features with disease presence rather than true disease markers.
  • Habitat Bias: Most participants came from urban areas because hospitals are typically located there. This excludes individuals from rural areas, creating a geographic bias that limits the generalizability of findings.
  • Socioeconomic Bias: Data collected from private hospitals, for example, primarily includes individuals from wealthier backgrounds. Similarly, differences in education levels (e.g., higher education in control groups for hearing loss studies) can introduce bias if not accounted for, potentially leading to misleading conclusions.
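
The age imbalance described above can be checked before any model is trained by comparing the age distributions of the control and disease groups. The sketch below is illustrative only: the ages are invented, and the `standardized_mean_difference` helper and the 0.1 rule-of-thumb threshold are assumptions made for this example, not something taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical ages, invented for illustration only.
control_ages = [52, 55, 58, 60, 61, 63]
disease_ages = [68, 70, 72, 74, 75, 79]

smd = standardized_mean_difference(disease_ages, control_ages)

# Assumed rule of thumb: |SMD| above 0.1 suggests the groups are not
# comparable on age, so a model could learn age instead of disease markers.
age_imbalanced = abs(smd) > 0.1
print(f"SMD = {smd:.2f}, imbalanced = {age_imbalanced}")
```

If the check flags an imbalance, options include recruiting age-matched controls or adjusting for age during analysis; which is appropriate depends on the study.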

Measurement Biases: These biases arise from inconsistencies in how data is collected or labeled.

  • Equipment Bias: If data is collected using specific equipment (e.g., cochlear implants from one manufacturer), the model might be biased towards the characteristics of that device, limiting its applicability to users of other equipment.
  • Labeling Bias: Human judgment or institutional practices can influence data labels. An example from the hearing loss study was the initial omission of ‘homemakers’ as an occupational category, which significantly misrepresented women in the dataset until corrected.
  • Intersectional Bias: This occurs when multiple demographic variables interact. In an Alzheimer’s study, age and sex interactions (females being younger across diagnostic groups) could lead models to misattribute normative age-related sex differences to disease-specific changes if not properly controlled.
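
Intersectional gaps like the ones above are easy to miss when each variable is checked separately. A minimal sketch, assuming hypothetical participant records and an arbitrary minimum cell size, counts every combination of sex, age band, and diagnosis and reports the sparse or empty ones. The `sparse_subgroups` helper, the field names, and the data are all invented for illustration.

```python
from collections import Counter
from itertools import product


def sparse_subgroups(records, keys, min_size):
    """Return every combination of key values with fewer than min_size records."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    levels = [sorted({r[k] for r in records}) for k in keys]
    # product() also visits combinations that never occur, so empty cells show up.
    return {combo: counts[combo] for combo in product(*levels)
            if counts[combo] < min_size}


KEYS = ("sex", "age_band", "diagnosis")

# Hypothetical cohort, invented for illustration only.
rows = [
    ("F", "65-74", "control"), ("F", "65-74", "control"),
    ("F", "65-74", "disease"), ("F", "65-74", "disease"),
    ("M", "65-74", "control"), ("M", "65-74", "control"),
    ("M", "65-74", "disease"), ("M", "65-74", "disease"),
    ("M", "75+", "control"), ("M", "75+", "control"),
    ("M", "75+", "disease"), ("M", "75+", "disease"),
    ("F", "75+", "disease"),
]
participants = [dict(zip(KEYS, row)) for row in rows]

# min_size=2 is an assumed threshold; a real study would justify its own.
sparse = sparse_subgroups(participants, KEYS, min_size=2)
for combo, n in sparse.items():
    print(combo, n)
```

In this invented cohort, only the cells for older women come back as sparse, which is the kind of pattern a per-variable summary of sex or age alone would hide.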

Recommendations for Fairer AI in Healthcare

To mitigate these biases, the paper offers practical recommendations:

  • Historical Bias: Involve diverse, interdisciplinary teams in planning data collection to minimize implicit biases. Collect data in aggregated or disaggregated ways as appropriate, carefully considering what metadata to include to avoid unintended harm.
  • Representation Bias: Define clear and balanced inclusion/exclusion criteria to ensure sample diversity. Analyze the need for intersectional benchmarks to better represent the target population. Ensure the sample size is feasible and sustainable, considering recruitment and retention challenges.
  • Measurement Bias: Thoroughly evaluate the data labeling process to ensure categories are clear and consistent, potentially involving interdisciplinary teams for socioeconomic variables or multiple professionals for clinical data. Crucially, consider the equipment used during data collection and the context of model deployment to prevent equipment-related biases.
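
When multiple professionals label the same clinical data, their agreement can be quantified as part of evaluating the labeling process. The sketch below uses Cohen's kappa, a standard chance-corrected agreement statistic for two annotators. It is one possible check, not a method the paper prescribes, and the annotator labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set.")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if each annotator labeled at random with their own rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical severity labels from two clinicians, invented for illustration.
annotator_1 = ["mild", "mild", "severe", "severe", "mild", "severe", "mild", "severe"]
annotator_2 = ["mild", "mild", "severe", "mild", "mild", "severe", "severe", "severe"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

A low kappa does not say which annotator is right; it signals that the category definitions may need to be revisited before the labels are used for training.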

Moving Towards Equitable AI

The authors conclude that successfully integrating AI systems into healthcare requires addressing bias not just as a technical challenge, but as a fundamental governance issue. By showing how various forms of bias can emerge during data collection and offering concrete recommendations, this research aims to guide future healthcare AI projects in building more equitable, effective, and socially responsible systems. For more details, see the full paper.

Karthik Mehta
https://blogs.edgentiq.com
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
