Medico 2025 Challenge: Advancing Explainable AI for Gastrointestinal Imaging

TLDR: The Medico 2025 challenge focuses on developing Explainable AI (XAI) models for Visual Question Answering (VQA) in gastrointestinal imaging. It aims to create AI systems that not only accurately answer clinical questions based on endoscopy images but also provide clear, interpretable justifications aligned with medical reasoning. The challenge uses the Kvasir-VQA-x1 dataset and includes two subtasks: one for AI performance in VQA and another for generating clinician-oriented multimodal explanations, with human expert evaluation for the latter.

The field of Artificial Intelligence (AI) continues to make significant strides, particularly in healthcare. A new initiative, the Medico 2025 challenge, is set to push the boundaries of AI in gastrointestinal (GI) imaging, focusing on a crucial aspect often overlooked: explainability. Organized as part of the MediaEval tasks series, this challenge aims to develop AI models that can answer clinically relevant questions based on GI endoscopy images, while also providing clear, interpretable justifications that align with medical reasoning.

Gastrointestinal diseases are a major global health concern, and conditions such as colorectal cancer depend on early diagnosis. While AI-driven systems show great promise in assisting clinicians, their ‘black-box’ nature often limits their adoption in clinical practice. This is where Explainable Artificial Intelligence (XAI) comes in. XAI methods aim to make AI decisions transparent, building trust and enabling healthcare professionals to understand why a system makes a particular diagnosis or recommendation.

The Medico 2025 challenge, building on previous successful Medico editions, specifically addresses Visual Question Answering (VQA) in GI imaging with a strong emphasis on multimodal explanations. Medical VQA combines computer vision and natural language processing to answer questions directly from medical images. The challenge encourages participants to develop models that not only provide accurate answers but also offer clear justifications, ensuring the reliability of AI-generated insights.

The challenge is structured into two main subtasks. Subtask 1, titled ‘AI Performance on Medical Image Question Answering,’ challenges participants to create AI models that accurately interpret and respond to clinical questions based on GI images. This subtask uses the Kvasir-VQA-x1 dataset, a benchmark comprising 6,500 GI endoscopy images and 159,549 complex question-answer pairs. Questions in this dataset span six categories (Yes/No, Single-Choice, Multiple-Choice, Color-Related, Location-Related, and Numerical Count), requiring models to process both visual and textual information. Performance is evaluated using standard language quality metrics such as BLEU, ROUGE, and METEOR, with results reported overall, per question category, and per complexity level.
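To make the idea of stratified evaluation concrete, here is a minimal sketch in Python. It is not the challenge's official scoring code: the unigram-overlap F1 below is only a simplified stand-in for ROUGE-1, and the record fields (`category`, `complexity`, `prediction`, `reference`) are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict


def unigram_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1: a simplified stand-in for ROUGE-1, not the official metric."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def stratified_scores(records, key):
    """Average the score within each group defined by records[i][key]."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[key]].append(unigram_f1(rec["prediction"], rec["reference"]))
    return {group: sum(vals) / len(vals) for group, vals in buckets.items()}


# Hypothetical model outputs; the field names and values are illustrative only.
records = [
    {"category": "Yes/No", "complexity": 1,
     "prediction": "yes", "reference": "yes"},
    {"category": "Color-Related", "complexity": 1,
     "prediction": "the polyp is red", "reference": "the polyp is pink"},
    {"category": "Yes/No", "complexity": 2,
     "prediction": "no", "reference": "yes"},
]

by_category = stratified_scores(records, "category")
by_complexity = stratified_scores(records, "complexity")
print(by_category)
print(by_complexity)
```

Reporting the same metric per category and per complexity level, rather than only as one overall number, shows where a model is weak, for example on multi-part questions, even when its average score looks acceptable.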

Subtask 2, ‘Clinician-Oriented Multimodal Explanations in GI,’ builds directly on the first. Here, participants are required to justify their model’s predictions using multiple complementary forms of reasoning. The goal is to generate rich, multimodal explanations that are transparent, understandable, and trustworthy for clinicians. At a minimum, explanations must include a detailed textual narrative in clinical language that directly supports the predicted answer. Participants are also strongly encouraged to provide an accompanying visual explanation, such as a heatmap, segmentation mask, or bounding box, that clearly links to the textual reasoning and highlights the relevant findings. Optional confidence scores can also be included. Crucially, all outputs in Subtask 2 are human-evaluated by domain experts and medical professionals based on criteria like clarity, coherence between modalities, and medical relevance, ensuring the explanations truly support clinical decision-making.
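The components of a Subtask 2 explanation can be pictured as a small record with one required part and two optional ones. The sketch below is an assumption-laden illustration, not the challenge's submission format: the class name, field names, and the choice of a [0, 1] range for confidence are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Explanation:
    """Hypothetical container for one explained answer (not an official schema)."""
    answer: str
    textual_rationale: str              # required: narrative in clinical language
    visual_path: Optional[str] = None   # encouraged: heatmap, mask, or bounding-box file
    confidence: Optional[float] = None  # optional; the [0, 1] range is an assumption

    def problems(self) -> List[str]:
        """Return a list of issues; an empty list means the record looks complete."""
        issues = []
        if not self.textual_rationale.strip():
            issues.append("textual rationale is required")
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            issues.append("confidence must lie in [0, 1]")
        return issues


complete = Explanation(
    answer="one polyp",
    textual_rationale="A single sessile lesion is visible in the lower-left quadrant.",
    visual_path="heatmaps/example.png",  # illustrative path only
    confidence=0.82,
)
incomplete = Explanation(answer="one polyp", textual_rationale="  ", confidence=1.5)

print(complete.problems())
print(incomplete.problems())
```

A check like this only confirms that the pieces are present. Whether the text and the visual evidence actually agree is exactly what the human expert evaluation is meant to judge.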

The Kvasir-VQA-x1 dataset, central to this challenge, is an extension of the original Kvasir-VQA. It features GI endoscopic images from HyperKvasir and Kvasir-Instrument, with QA pairs stratified by reasoning complexity (Level 1 for single atomic QAs, Level 2 for two merged QAs, and Level 3 for synthesis across three atomic QAs). Each QA pair is also assigned one or more ‘question_class’ labels, such as polyp type or instrument presence, allowing for fine-grained analysis. The dataset is publicly available for researchers to access and use for reproducible experimentation.
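As a rough illustration of the fine-grained analysis these labels allow, the following Python sketch tallies QA pairs by complexity level and by `question_class`. The records are invented for the example, the `complexity` field name is an assumption, and the label strings are hypothetical spellings of the classes mentioned above.

```python
from collections import Counter

# Invented records; a QA pair may carry one or more question_class labels.
qa_pairs = [
    {"complexity": 1, "question_class": ["polyp_type"]},
    {"complexity": 2, "question_class": ["polyp_type", "instrument_presence"]},
    {"complexity": 3, "question_class": ["instrument_presence"]},
]


def summarize(pairs):
    """Count QA pairs per complexity level and per question_class label."""
    level_counts = Counter(p["complexity"] for p in pairs)
    # A pair with several labels contributes once to each of them.
    class_counts = Counter(label for p in pairs for label in p["question_class"])
    return level_counts, class_counts


level_counts, class_counts = summarize(qa_pairs)
print(dict(level_counts))
print(dict(class_counts))
```

Because a pair can carry several labels, the per-class counts can add up to more than the number of QA pairs.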

The Medico 2025 challenge represents a significant step towards integrating powerful deep learning models into clinical settings. By emphasizing explainable VQA for GI imaging, it promotes the development of AI models that are not only accurate but also provide transparent justifications aligned with medical reasoning. This initiative fosters interdisciplinary collaboration between AI and medical communities, paving the way for clinically viable AI tools that are both trusted and actionable in real-world healthcare scenarios. For more detailed information, you can refer to the research paper: Medico 2025: Visual Question Answering for Gastrointestinal Imaging.
