TLDR: A new AI system combines fine-tuned vision-language models, a structured reasoning layer, and agentic retrieval-augmented generation (RAG) to significantly improve dermatological diagnosis in telemedicine. The system mimics a dermatologist’s cognitive process by integrating patient images, symptom descriptions, and external medical knowledge. It provides accurate and explainable diagnoses, achieving competitive results in the ImageCLEF MEDIQA-MAGIC 2025 challenge, demonstrating a viable path for trustworthy AI in healthcare.
Remote dermatological diagnosis faces significant hurdles. When patients consult doctors remotely, they often provide images of varying quality and symptom descriptions that may be unclear or missing important medical details. Unlike in-person visits, where doctors can ask follow-up questions, AI systems must make accurate diagnoses from a single, static interaction. This makes reliable models harder to build and raises the risk of misdiagnosis.
To tackle these challenges, researchers from Georgia Institute of Technology developed a new approach that prioritizes not just accuracy, but also the ability to explain its reasoning. Their system combines three main components:
1. Fine-tuning Multimodal Models
The team started by training open-source AI models, like those from the Qwen, Gemma, and LLaMA families, specifically on a large dataset of dermatology cases. This helps the models understand the unique visual and textual information related to skin conditions.
2. Structured Reasoning Layer
This component acts like a senior dermatologist reviewing multiple opinions. Instead of simply averaging predictions from different models, it evaluates the quality of evidence, considers the clinical context, and applies specialized medical knowledge to reach a final conclusion. It can even override a majority prediction if the evidence suggests otherwise, mimicking how experienced doctors make complex decisions.
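The evidence-weighted override described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual method: the model names, confidence scores, and evidence weights are all hypothetical, and the real reasoning layer also incorporates clinical context and specialized medical knowledge.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Prediction:
    model: str
    diagnosis: str
    confidence: float  # model's self-reported score in [0, 1]

def reason_over_predictions(preds, evidence_weights):
    """Aggregate votes, weighting each model by an evidence-quality score,
    so a well-supported minority opinion can override a weak majority."""
    scores = defaultdict(float)
    for p in preds:
        scores[p.diagnosis] += p.confidence * evidence_weights.get(p.model, 1.0)
    return max(scores, key=scores.get)

# Hypothetical scenario: two models weakly favor eczema, one strongly
# favors psoriasis and is judged to have better evidence for this case.
preds = [
    Prediction("model_a", "eczema", 0.55),
    Prediction("model_b", "eczema", 0.50),
    Prediction("model_c", "psoriasis", 0.95),
]
weights = {"model_a": 0.6, "model_b": 0.6, "model_c": 1.5}
print(reason_over_predictions(preds, weights))  # psoriasis, despite the 2-1 majority
```

A plain majority vote would return "eczema" here; weighting by evidence quality is what lets the layer overturn it, as the article describes.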
3. Agentic Retrieval-Augmented Generation (RAG)
This advanced system emulates how dermatologists consult medical references. It uses specialized “agents” that work together to integrate patient images, symptom descriptions, and external medical knowledge from databases like the American Academy of Dermatology. For example, if a diagnosis is suspected, the system can dynamically search for relevant information about that condition’s common locations or symptoms, providing a richer, more grounded explanation for its diagnosis. This helps fill in gaps in patient information and makes the AI’s answers more like a well-informed clinical explanation.
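The agent workflow above can be sketched as a simple retrieve-then-explain pipeline. This is an illustrative toy, assuming an in-memory knowledge base: the real system queries external sources such as the American Academy of Dermatology, and the entries, function names, and wording below are all hypothetical.

```python
# Toy in-memory knowledge base standing in for an external medical source.
# Entries are illustrative placeholders, not clinical reference text.
KB = {
    "psoriasis": "Commonly appears as scaly plaques on elbows, knees, and scalp.",
    "eczema": "Often presents as itchy, inflamed patches in skin folds.",
}

def retrieve(condition: str) -> str:
    """Retrieval agent: look up reference text for a suspected condition."""
    return KB.get(condition.lower(), "No reference entry found.")

def explain(diagnosis: str, symptoms: str) -> str:
    """Explanation agent: ground the final answer in retrieved knowledge."""
    reference = retrieve(diagnosis)
    return (f"Suspected {diagnosis}. Patient reports: {symptoms}. "
            f"Reference note: {reference}")

print(explain("psoriasis", "scaly patches on both elbows"))
```

The point of the design is the dynamic lookup step: once a diagnosis is suspected, the agent fetches condition-specific knowledge and weaves it into the explanation, rather than relying on the model's parameters alone.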
The team participated in the 2025 ImageCLEF MEDIQA-MAGIC challenge, which focuses on multimodal dermatology question answering, and their submission secured second place. The reasoning layer, which combines predictions from multiple models, delivered consistently high accuracy, especially on unseen test cases, significantly outperforming the individual models. The agentic RAG system matched the reasoning layer's accuracy, but its key contribution was richer, context-aware explanations grounded in external medical knowledge, which makes the AI's decisions more transparent and trustworthy for clinicians.
This research addresses a crucial need in telemedicine: making accurate, interpretable diagnostic decisions from limited input. By mimicking the systematic reasoning patterns of dermatologists, the architecture paves the way for more reliable automated diagnostic support systems. The paper shows that interpretability doesn’t have to come at the cost of accuracy, offering a promising path for trustworthy AI in healthcare. You can read the full research paper here.