Unpacking AI Explanations: A New Approach to Understanding Uncertainty in Healthcare Models

TLDR: The research paper “UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles” introduces a novel framework to quantify and decompose uncertainty in SHAP values, a popular AI explainability technique. Traditional SHAP values are point estimates, ignoring inherent aleatoric (data noise) and epistemic (lack of knowledge) uncertainties. UbiQTree integrates Dempster-Shafer evidence theory, Liu’s uncertainty theory, and Dirichlet process hypothesis sampling to break down SHAP variance into aleatoric, epistemic, and entanglement components. This allows for a deeper understanding of explanation reliability, especially crucial in high-stakes domains like healthcare, by identifying features with unstable impacts and guiding data acquisition for more robust AI models.

Artificial intelligence (AI) is rapidly transforming healthcare, aiding in everything from disease diagnosis to risk assessment. However, the complexity of these AI models, especially those based on tree ensembles like Random Forests, often makes their decisions difficult to understand. This ‘black box’ nature can be a significant barrier, particularly in high-stakes fields like medicine where trust and interpretability are paramount.

To address this, techniques like SHapley Additive exPlanations (SHAP) have emerged. SHAP values help explain how each feature in a dataset contributes to an individual prediction made by an AI model. For instance, in a model predicting heart disease, SHAP could show how much a patient’s age or cholesterol level influenced the prediction. However, a major limitation of traditional SHAP values is that they are often treated as single, fixed numbers, ignoring the inherent uncertainty present in both the data and the model itself.

This uncertainty comes from two main sources: aleatoric and epistemic. Aleatoric uncertainty is like the unavoidable noise in the data – it’s there no matter how much information you collect. Epistemic uncertainty, on the other hand, arises from a lack of knowledge or insufficient data. Imagine a medical AI model giving the same SHAP values for a diagnosis across different hospitals, even if the patient demographics or disease prevalence vary significantly. This highlights a critical gap: the model might appear confident, but its explanation could be unstable due to unseen data variations.

A new research paper, UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles, proposes a novel approach to tackle this challenge. Authored by Akshat Dubey, Aleksandar Anžel, Bahar İlgen, and Georges Hattab, the paper introduces a framework that not only quantifies uncertainty in SHAP values but also breaks it down into aleatoric, epistemic, and ‘entanglement’ components. The entanglement term captures the interplay between data and model uncertainties, which is often overlooked by simpler methods.

How UbiQTree Works

UbiQTree integrates three powerful theoretical frameworks: Dempster-Shafer evidence theory, Liu’s uncertainty theory, and Dirichlet process hypothesis sampling. Think of it as a multi-faceted lens to view uncertainty:

Dempster-Shafer Evidence Theory: This allows UbiQTree to represent ambiguity in SHAP values. Instead of a single number, it provides a range of possibilities, quantifying both the minimum certainty (belief) and maximum possibility (plausibility) for a feature’s impact. A big gap between belief and plausibility signals high ambiguity and flags the explanation for review by human experts.
Liu’s Uncertainty Theory: This part helps model epistemic uncertainty by looking at the spread of SHAP values. A narrow spread means high confidence in a feature’s impact, while a wide, flat spread indicates high uncertainty. This can even guide where to collect more data to reduce uncertainty.
Dirichlet Process Hypothesis Sampling: This is a sophisticated statistical method that allows UbiQTree to explore many plausible versions of the AI model. By analyzing how SHAP values change across these different model versions, it captures both the inherent randomness (aleatoric) and the uncertainty due to limited knowledge (epistemic) within the model itself.

By combining these, UbiQTree provides a comprehensive picture of uncertainty, allowing users to understand not just ‘what’ a feature contributes, but ‘how reliably’ it contributes.
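To make the belief and plausibility idea concrete, here is a minimal sketch of the standard Dempster-Shafer definitions applied to the direction of one feature’s effect. The frame of discernment, the mass values, and the function names are all hypothetical choices for illustration; the article does not describe how UbiQTree actually assigns masses.

```python
# Hypothetical frame of discernment: possible directions of one feature's effect.
FRAME = frozenset({"negative", "negligible", "positive"})


def belief(mass, hypothesis):
    """Sum the mass of every focal set fully contained in the hypothesis."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass, hypothesis):
    """Sum the mass of every focal set that overlaps the hypothesis."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


# Made-up masses for illustration only; they must sum to 1.
mass = {
    frozenset({"positive"}): 0.5,                # evidence for a positive effect
    frozenset({"positive", "negligible"}): 0.2,  # evidence that cannot separate the two
    frozenset({"negative"}): 0.1,                # evidence for a negative effect
    FRAME: 0.2,                                  # mass left on "unknown"
}

positive = frozenset({"positive"})
bel = belief(mass, positive)       # lower bound on support for "positive"
pl = plausibility(mass, positive)  # upper bound on support for "positive"
gap = pl - bel                     # a wide gap indicates ambiguous evidence

print(f"belief={bel:.2f} plausibility={pl:.2f} gap={gap:.2f}")
```

Belief counts only evidence that points exclusively at the hypothesis, while plausibility also counts evidence that merely fails to rule it out, so the gap between them grows with the amount of mass assigned to imprecise sets.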

Practical Implications for Healthcare

The researchers validated UbiQTree across three real-world healthcare datasets: MIMIC-III (for predicting hospital length of stay), an Ovarian Cancer dataset (for risk categorization), and a SEER Breast Cancer dataset (for survival outcomes).

In the MIMIC-III dataset, for instance, features like ‘NumTransfers’ (number of patient transfers within the hospital) and ‘NumNotes’ (number of clinical notes) consistently showed high importance for predicting length of stay. However, UbiQTree revealed that the precise magnitude of their impact often carried significant epistemic uncertainty. In other words, the model relied on these features, but the estimated strength of their influence varied across different model versions.
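One rough way to picture variation across model versions is a two-way split of variance, using the law of total variance. The sketch below is not the UbiQTree decomposition: it uses synthetic numbers rather than real SHAP values, treats the within-model spread as a stand-in for aleatoric uncertainty and the between-model spread as a stand-in for epistemic uncertainty, and does not reproduce the paper’s entanglement term. The function name and array layout are assumptions.

```python
import numpy as np


def decompose_variance(phi):
    """Split the variance of SHAP values for one feature into two parts.

    phi has shape (n_models, n_draws): one row per sampled model version,
    one column per repeated evaluation under that model (assumed layout).
    """
    phi = np.asarray(phi, dtype=float)
    within = phi.var(axis=1).mean()   # average spread inside each model version
    between = phi.mean(axis=1).var()  # spread of the per-model average effect
    total = phi.var()                 # overall variance (population, ddof=0)
    return {"within_model": within, "between_model": between, "total": total}


# Synthetic stand-in data: 50 hypothetical model versions, 200 draws each.
rng = np.random.default_rng(0)
model_means = rng.normal(loc=0.30, scale=0.10, size=50)  # models disagree on magnitude
phi = model_means[:, None] + rng.normal(scale=0.05, size=(50, 200))

parts = decompose_variance(phi)
for name, value in parts.items():
    print(f"{name}: {value:.5f}")
```

With equal-sized groups and population variance, the two parts add up to the total exactly. A feature whose between-model share dominates, as in this synthetic example, is one where the model versions disagree about how strong the effect is, which is the pattern the article describes for ‘NumTransfers’ and ‘NumNotes’.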

Conversely, features like ‘gender’ or ‘ExpiredHospital’ (whether a patient died in the hospital) often had low average SHAP values but also very low uncertainty. This suggests they played a stable, albeit minor, role in predictions. The framework also provides ‘sign stability’ metrics, indicating whether a feature consistently pushes the prediction in the same direction (positive or negative) across different model variations.
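The article does not give the paper’s exact definition of sign stability, so the following is only one plausible reading: the fraction of model variations in which a feature’s SHAP value has the majority sign. The function name and the example values are hypothetical.

```python
import numpy as np


def sign_stability(phi):
    """Fraction of model variations whose SHAP value carries the majority sign.

    Returns 1.0 when every variation agrees on the direction; values near 0.5
    mean the direction flips about as often as not. Exact zeros count toward
    neither sign.
    """
    phi = np.asarray(phi, dtype=float)
    positive_share = np.mean(phi > 0)
    negative_share = np.mean(phi < 0)
    return float(max(positive_share, negative_share))


# Made-up SHAP values for one feature across five model variations.
stable = [0.21, 0.18, 0.25, 0.19, 0.22]       # always pushes the prediction up
unstable = [0.30, -0.28, 0.35, -0.31, 0.02]   # direction flips between variations

stable_score = sign_stability(stable)
unstable_score = sign_stability(unstable)
print(stable_score, unstable_score)
```

A score like this complements the magnitude-based view: a feature can have a small average SHAP value and still be perfectly sign-stable, or a large average value whose direction is unreliable.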

This level of detail is crucial in healthcare. If a feature’s impact is highly uncertain, even if it has a high average SHAP value, it might warrant further investigation by a domain expert before making critical decisions. For example, if ‘AdmitProcedure’ shows high uncertainty for a ‘No Admit’ class, it could mean the model isn’t consistently sure how that procedure influences the decision not to admit, despite its overall importance. This could prompt clinicians to look for more data or refine the model.

Conclusion

UbiQTree represents a significant step forward in making Explainable AI more reliable and trustworthy. By decomposing SHAP uncertainty into its fundamental components, it moves beyond simple point estimates to provide a nuanced understanding of why an AI model makes a particular prediction and how confident that explanation is. This is especially vital in high-stakes domains like healthcare, where understanding the reliability of AI explanations can directly impact patient outcomes and guide the development of more robust and transparent AI systems, aligning with growing demands for auditable AI frameworks.
