RETFound Foundation Model Adapted for Precise Optic Disc Segmentation

TLDR: This paper presents the first adaptation of RETFound, a pre-trained foundation model for retinal images, to the task of optic disc segmentation. By combining RETFound’s encoder with Segmenter’s decoder, the new system achieves performance superior or comparable to state-of-the-art methods across multiple datasets, even with significantly smaller task-specific training sets. The research highlights the benefits of foundation models for medical image analysis, particularly in generalization and efficient use of data.

A new study explores the potential of the RETFound foundation model, originally designed for classifying retinal and systemic diseases from eye images, for a different crucial task: optic disc segmentation. This marks the first known adaptation of RETFound for segmenting anatomical structures in retinal images, a foundational step in analyzing eye health.

The optic disc, a visible structure in the retina, is a vital source of biomarkers for various systemic diseases. Accurately segmenting this structure is essential for research into these biomarkers. Traditionally, supervised Convolutional Neural Network (CNN) models have been used for retinal image segmentation. However, these models face several challenges: the high cost and labor involved in expert annotation of medical images, a rapid decrease in performance when tested on datasets different from their training data, and a reliance on data augmentation techniques that sometimes involve synthetic images, which clinicians may trust less.

Foundation models (FMs) like RETFound offer a promising solution to these limitations. FMs are trained on vast amounts of unlabelled data using self-supervised learning to create a rich, latent representation of the image domain. This representation can then be adapted for specific downstream tasks with only a limited amount of labelled data. RETFound, for instance, was trained on over 900,000 fundus camera images and 700,000 optical coherence tomography (OCT) images. All of these are real retinal images with only basic geometric augmentations, which makes the model more acceptable to clinicians than one trained on synthetic images.

In this research, the team adapted RETFound’s encoder, which is based on a large vision Transformer, and combined it with the decoder from Segmenter, another advanced segmentation model. The weights of RETFound’s encoder were frozen, allowing it to extract high-level features from any retinal dataset. The Segmenter decoder was then trained to generate segmentation maps for the optic disc.
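The freeze-the-encoder, train-the-decoder pattern can be illustrated with a deliberately tiny sketch. Everything here is a hypothetical stand-in: `FROZEN_ENCODER_WEIGHTS`, `encode`, and `train_decoder` are illustrative names, the "encoder" is a fixed 2x2 linear map rather than a vision Transformer, and the "decoder" is a per-patch logistic classifier rather than Segmenter's decoder. It shows only the training pattern, not the paper's implementation.

```python
import math

# Hypothetical stand-in for a pre-trained encoder: fixed weights that map a
# 2-value "patch" to a 2-value feature vector. Stored as a tuple so that
# nothing in the training loop can modify them.
FROZEN_ENCODER_WEIGHTS = ((0.8, -0.3), (0.1, 0.9))

def encode(patch):
    """Frozen feature extractor: its weights are never updated."""
    return [math.tanh(sum(w * x for w, x in zip(row, patch)))
            for row in FROZEN_ENCODER_WEIGHTS]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_decoder(patches, labels, epochs=500, lr=0.5):
    """Fit a per-patch linear decoder (weights + bias) on frozen features."""
    features = [encode(p) for p in patches]  # encoder is used for inference only
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            prob = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            grad = prob - y  # gradient of binary cross-entropy w.r.t. the logit
            weights = [w - lr * grad * v for w, v in zip(weights, f)]
            bias -= lr * grad
    return weights, bias

def predict(patch, weights, bias):
    """Probability that a patch belongs to the optic disc."""
    f = encode(patch)
    return sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)

# Toy data: patches with a large first value are labelled "optic disc" (1).
patches = [(1.0, 0.0), (0.9, 0.2), (0.0, 1.0), (0.1, 0.8)]
labels = [1, 1, 0, 0]
weights, bias = train_decoder(patches, labels)
```

Only `weights` and `bias` change during training, which is why this pattern needs comparatively few labelled examples: the frozen encoder already supplies the features.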

The researchers conducted three main experiments to evaluate their new system: internal verification, domain generalization, and domain adaptation. They used a combination of public datasets (IDRID, Drishti-GS, RIM-ONE-r3, REFUGE) and a private dataset (GoDARTS). For loss calculation, they found that a combination of Dice loss and Binary Cross Entropy loss (BCELoss) led to better performance and faster convergence compared to using cross-entropy loss alone, especially given the imbalance between the small optic disc area and the large background.
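To see why a Dice term helps with class imbalance, here is a minimal sketch of a combined Dice and binary cross-entropy loss on flat lists of per-pixel probabilities. The function names, the `smooth` constant, and the equal 0.5 weighting are assumptions for illustration; the article does not state how the two terms were weighted.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)

def soft_dice_loss(probs, targets, smooth=1.0):
    """1 - soft Dice coefficient; `smooth` avoids division by zero."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

def combined_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of Dice and BCE; the 0.5 weighting is an assumption."""
    return (dice_weight * soft_dice_loss(probs, targets)
            + (1.0 - dice_weight) * bce_loss(probs, targets))

# 100 pixels, only 4 of which belong to the disc: a heavily imbalanced mask.
targets = [1] * 4 + [0] * 96
all_background = [0.01] * 100       # predicts "no disc" everywhere
good = [0.99] * 4 + [0.01] * 96     # predicts the disc correctly

print("BCE,  all background:", bce_loss(all_background, targets))
print("Dice, all background:", soft_dice_loss(all_background, targets))
print("Combined, good prediction:", combined_loss(good, targets))
```

On this toy mask, the all-background prediction gets a fairly low cross-entropy because 96 of 100 pixels are correct, while the Dice term stays high because the disc is missed entirely. Adding the Dice term therefore penalizes the degenerate "predict background" solution more strongly.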

The results were highly encouraging. In internal verification, the RETFound-based system achieved Dice scores of around 96% consistently across all datasets, outperforming or matching state-of-the-art segmentation-specific baseline networks. This was achieved even with a very modest number of task-specific training examples. For domain generalization, where the model is trained on multiple datasets and tested on an unseen one, the new method consistently outperformed existing state-of-the-art task-specific baselines.
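For readers unfamiliar with the metric, the Dice score measures the overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two small binary masks; the function name, the toy masks, and the convention for two empty masks are illustrative assumptions, not taken from the paper.

```python
def dice_score(pred_mask, true_mask):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks (lists of rows)."""
    intersection = pred_total = true_total = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += p & t
            pred_total += p
            true_total += t
    if pred_total + true_total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * intersection / (pred_total + true_total)

# Reference mask: a 3x3 "disc" of 9 pixels inside a 5x5 image.
true_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
# Prediction misses one disc pixel and adds no false positives.
pred_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
score = dice_score(pred_mask, true_mask)  # 2 * 8 / (8 + 9) = 16/17
print(round(score, 3))
```

A score of 1.0 means the two masks coincide exactly, so a Dice score of around 96% corresponds to only a small fraction of disagreeing pixels relative to the size of the disc.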

In domain adaptation experiments, where the model is trained on one dataset and tested on others, the system also showed strong generalization capabilities, even when the source and target domains were significantly different, and the training image count was much smaller than the testing image count. The study also noted that basic spatial data augmentations (like random rotation and flipping) were more effective for fine-tuning RETFound for optic disc segmentation than more complex augmentation strategies.
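Basic spatial augmentations of this kind must be applied identically to the image and its segmentation mask, or the labels no longer line up with the pixels. The sketch below shows that idea on nested lists. As a simplification, rotation is restricted to multiples of 90 degrees (arbitrary angles would need interpolation), and all function names are hypothetical.

```python
import random

def hflip(grid):
    """Mirror a 2-D list left to right."""
    return [row[::-1] for row in grid]

def rot90(grid):
    """Rotate a 2-D list 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def augment_pair(image, mask, rng):
    """Apply the same random flip and rotation to an image and its mask."""
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    for _ in range(rng.randrange(4)):  # 0, 90, 180 or 270 degrees
        image, mask = rot90(image), rot90(mask)
    return image, mask

# Toy 3x3 "image"; the mask marks the pixel whose value is 30.
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
mask = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

aug_image, aug_mask = augment_pair(image, mask, random.Random(0))
for image_row, mask_row in zip(aug_image, aug_mask):
    print(image_row, mask_row)
```

Because the same transform is drawn once and applied to both arrays, the marked pixel in the augmented mask still points at the value 30 in the augmented image.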

The paper highlights that foundation models like RETFound can provide excellent performance in medical image analysis, offering improved generalization and efficient use of unlabelled data. This work paves the way for future research into other critical tasks in retinal image analysis, such as vessel segmentation and the discovery of new biomarkers for systemic conditions. For more details, you can refer to the full research paper here.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society.
