RETFound Foundation Model Adapted for Precise Optic Disc Segmentation

TLDR: This paper presents the first adaptation of RETFound, a pre-trained foundation model for retinal images, to the task of optic disc segmentation. By combining RETFound’s encoder with Segmenter’s decoder, the new system matches or outperforms state-of-the-art methods across multiple datasets, even with significantly smaller task-specific training sets. The research highlights the benefits of foundation models for medical image analysis, particularly in generalization and efficient use of data.

A new study explores the potential of the RETFound foundation model, originally designed for classifying retinal and systemic diseases from eye images, for a different crucial task: optic disc segmentation. This marks the first known adaptation of RETFound for segmenting anatomical structures in retinal images, a foundational step in analyzing eye health.

The optic disc, a visible structure in the retina, is a vital source of biomarkers for various systemic diseases. Accurately segmenting this structure is essential for research into these biomarkers. Traditionally, supervised Convolutional Neural Network (CNN) models have been used for retinal image segmentation. However, these models face several challenges: the high cost and labor involved in expert annotation of medical images, a rapid decrease in performance when tested on datasets different from their training data, and a reliance on data augmentation techniques that sometimes involve synthetic images, which clinicians may trust less.

Foundation models (FMs) like RETFound offer a promising solution to these limitations. FMs are trained on vast amounts of unlabelled data using self-supervised learning to create a rich, latent representation of the image domain. This representation can then be adapted for specific downstream tasks with only a limited amount of labelled data. RETFound, for instance, was trained on over 900,000 fundus camera images and 700,000 optical coherence tomography (OCT) images, all real retinal images with only basic geometric augmentations, making it more acceptable to clinicians.

In this research, the team adapted RETFound’s encoder, which is based on a large vision Transformer, and combined it with the decoder from Segmenter, another advanced segmentation model. The weights of RETFound’s encoder were frozen, allowing it to extract high-level features from any retinal dataset. The Segmenter decoder was then trained to generate segmentation maps for the optic disc.
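
The split between a frozen encoder and a trainable decoder can be illustrated with a toy example. The sketch below is not the paper's implementation: the tiny fixed-weight `encode` function merely stands in for RETFound's Transformer encoder, the logistic head stands in for the Segmenter decoder, and all names and values (`ENCODER_WEIGHTS`, `train_decoder`, `predict`, the learning rate) are hypothetical. It only shows that training updates the decoder while the encoder stays untouched.

```python
import math

# Stand-in for the frozen encoder: fixed weights that training never updates.
# (Hypothetical toy values; the real encoder is a large vision Transformer.)
ENCODER_WEIGHTS = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]

def encode(sample):
    """Map a 2-value input to 3 features using the frozen weights."""
    return [math.tanh(w[0] * sample[0] + w[1] * sample[1]) for w in ENCODER_WEIGHTS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_decoder(samples, labels, epochs=200, lr=0.5):
    """Train only the decoder (a logistic head) on frozen-encoder features."""
    weights, bias = [0.0, 0.0, 0.0], 0.0
    features = [encode(s) for s in samples]  # encoder is used for inference only
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(w * x for w, x in zip(weights, f)) + bias)
            grad = p - y  # gradient of binary cross-entropy w.r.t. the logit
            weights = [w - lr * grad * x for w, x in zip(weights, f)]
            bias -= lr * grad
    return weights, bias

def predict(sample, weights, bias):
    """Foreground probability for one input, using encoder then decoder."""
    f = encode(sample)
    return sigmoid(sum(w * x for w, x in zip(weights, f)) + bias)
```

Because only `weights` and `bias` are learned, the number of trainable parameters is small compared with the encoder, which is one reason a frozen foundation model can be adapted with few labelled examples.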

The researchers conducted three main experiments to evaluate their new system: internal verification, domain generalization, and domain adaptation. They used a combination of public datasets (IDRID, Drishti-GS, RIM-ONE-r3, REFUGE) and a private dataset (GoDARTS). For loss calculation, they found that a combination of Dice loss and Binary Cross Entropy loss (BCELoss) led to better performance and faster convergence compared to using cross-entropy loss alone, especially given the imbalance between the small optic disc area and the large background.
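
To see why a Dice term helps with class imbalance, consider a minimal sketch of a combined Dice and binary cross-entropy loss over flattened pixel probabilities. The equal weighting (`dice_weight=0.5`) and the smoothing constant are assumptions for illustration; the article does not state how the two terms were weighted.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels; probabilities are clamped for stability."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the (smoothed) Dice overlap."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

def combined_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of Dice and BCE losses (the weighting is an assumed choice)."""
    return (dice_weight * dice_loss(probs, targets)
            + (1.0 - dice_weight) * bce_loss(probs, targets))
```

For a 100-pixel image with only 4 optic disc pixels, a model that predicts background almost everywhere still gets a low BCE, because most pixels are correct, but a high Dice loss, because the overlap with the disc is nearly zero. The Dice term therefore keeps the small foreground region from being ignored.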

The results were highly encouraging. In internal verification, the RETFound-based system achieved Dice scores of around 96% consistently across all datasets, outperforming or matching state-of-the-art segmentation-specific baseline networks. This was achieved even with a very modest number of task-specific training examples. For domain generalization, where the model is trained on multiple datasets and tested on an unseen one, the new method consistently outperformed existing state-of-the-art task-specific baselines.
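
The domain generalization setup described above, training on several datasets and testing on one that was never seen, can be sketched as a leave-one-domain-out loop. This is an illustrative assumption about the protocol: the article does not say exactly which datasets took part in each split, and the helper name `leave_one_domain_out` is hypothetical.

```python
# Public datasets named in the article; which ones were used per split is an assumption.
PUBLIC_DATASETS = ["IDRID", "Drishti-GS", "RIM-ONE-r3", "REFUGE"]

def leave_one_domain_out(datasets):
    """Yield (training datasets, held-out test dataset) pairs, one per dataset."""
    for held_out in datasets:
        train = [d for d in datasets if d != held_out]
        yield train, held_out

for train, test in leave_one_domain_out(PUBLIC_DATASETS):
    print("train on", train, "-> test on", test)
```

Each dataset serves as the unseen target exactly once, so the reported score for a dataset reflects images from a source the model never trained on.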

In the domain adaptation experiments, where the model is trained on one dataset and tested on others, the system also generalized well, even when the source and target domains differed significantly and the training set was much smaller than the test set. The study also noted that basic spatial data augmentations (such as random rotation and flipping) were more effective for adapting RETFound to optic disc segmentation than more complex augmentation strategies.
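
For segmentation, a spatial augmentation has to be applied identically to the image and its mask so the labels stay aligned with the pixels. The sketch below shows this with plain nested lists. It is a simplification under stated assumptions: rotations are limited to multiples of 90 degrees and the flip probability is 0.5, neither of which is specified in the article, and the function names are hypothetical.

```python
import random

def hflip(grid):
    """Mirror a 2-D grid left to right."""
    return [row[::-1] for row in grid]

def rotate90(grid):
    """Rotate a 2-D grid 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]

def augment_pair(image, mask, rng):
    """Apply the same random rotation and flip to an image and its mask."""
    quarter_turns = rng.randrange(4)   # 0, 90, 180 or 270 degrees (assumed angles)
    flip = rng.random() < 0.5          # assumed flip probability
    for _ in range(quarter_turns):
        image, mask = rotate90(image), rotate90(mask)
    if flip:
        image, mask = hflip(image), hflip(mask)
    return image, mask
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible, and drawing the random parameters once per pair guarantees the mask follows the image.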

The paper highlights that foundation models like RETFound can deliver strong performance in medical image analysis, with improved generalization and efficient use of data: large unlabelled collections for pre-training and only small labelled sets for adaptation. This work paves the way for future research into other critical tasks in retinal image analysis, such as vessel segmentation and the discovery of new biomarkers for systemic conditions. For more details, refer to the full research paper.
