TLDR: A new study introduces FetalCLIPCLS and FetalCLIPSEG, two AI models built on the FetalCLIP foundation model that automatically assess the quality of fetal ultrasound images. These models, particularly FetalCLIPSEG, improve the identification of high-quality frames, which is crucial for accurate fetal measurements in low-resource settings where skilled sonographers are scarce. The research highlights the effectiveness of domain-specific foundation models and parameter-efficient fine-tuning for advancing prenatal care.
Accurate fetal biometric measurements, such as abdominal circumference, are crucial for monitoring fetal growth and managing high-risk pregnancies during prenatal care. However, obtaining high-quality ultrasound images for these measurements typically requires highly skilled sonographers, which presents a significant challenge in low-income countries where trained personnel are scarce. In such settings, novice operators often perform ‘blind-sweep’ ultrasounds using low-cost portable probes, which tend to yield lower-quality data lacking the precise anatomical planes needed for accurate assessment.
To address this critical issue, researchers have developed innovative artificial intelligence (AI) models for automated fetal ultrasound image quality assessment (IQA). The goal is to help less-experienced sonographers identify high-quality frames suitable for measurement, thereby improving prenatal care in resource-limited environments.
Introducing FetalCLIPCLS and FetalCLIPSEG
The new research introduces two key models: FetalCLIPCLS and FetalCLIPSEG. These models are built upon FetalCLIP, a powerful vision-language foundation model specifically trained on a vast dataset of over 210,000 fetal ultrasound image-caption pairs. Think of FetalCLIP as a highly specialized AI that understands fetal ultrasound images and their descriptions.
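To make the CLIP-style idea concrete, here is a minimal Python sketch: two encoders project images and captions into a shared embedding space, and matching pairs are scored by cosine similarity. The encoders below are stand-in linear layers over precomputed features, not FetalCLIP's actual architecture or API.

```python
import torch
import torch.nn.functional as F

# Stand-in encoders (placeholders, not FetalCLIP's real towers):
# each projects precomputed features into a shared 512-d embedding space.
image_encoder = torch.nn.Linear(2048, 512)
text_encoder = torch.nn.Linear(768, 512)

def clip_similarity(image_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity matrix between image and caption embeddings.
    Contrastive pretraining pushes matching pairs toward high similarity
    and mismatched pairs toward low similarity."""
    img = F.normalize(image_encoder(image_feats), dim=-1)
    txt = F.normalize(text_encoder(text_feats), dim=-1)
    return img @ txt.T  # shape: [n_images, n_captions]

# Toy usage: 4 images scored against 4 captions.
scores = clip_similarity(torch.randn(4, 2048), torch.randn(4, 768))
print(scores.shape)  # torch.Size([4, 4])
```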
FetalCLIPCLS is an IQA model adapted from FetalCLIP using a technique called Low-Rank Adaptation (LoRA). LoRA freezes the large pretrained model and trains only a pair of small low-rank matrices per adapted layer, so fine-tuning updates a tiny fraction of the parameters, which makes it ideal for deployment in settings with limited computational resources. The resulting model classifies whether an ultrasound frame is optimal for fetal biometric measurement; the sketch below illustrates the LoRA idea.
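This is a generic PyTorch sketch of LoRA, not the paper's implementation: the pretrained weight is frozen, and only two small matrices A and B are trained, with their product added as a low-rank correction.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update,
    following the LoRA idea: y = W x + (alpha / r) * B(A x).
    Only A and B are trained, so very few parameters change."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pretrained weights stay frozen
        self.A = nn.Linear(base.in_features, r, bias=False)
        self.B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)        # update starts at zero: training begins from the pretrained model
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.B(self.A(x))

# Example: wrap one projection layer of a (hypothetical) ViT backbone.
layer = LoRALinear(nn.Linear(512, 512), r=8)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 8192 trainable vs 262,656 frozen
```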
Going a step further, the researchers also proposed FetalCLIPSEG, which repurposes a segmentation model for the classification task. A segmentation model identifies and outlines specific structures within an image; the intuition is that if a model can accurately delineate the fetal abdominal region, its output also reveals whether the frame is of high enough quality for measurement. FetalCLIPSEG therefore uses a thresholding strategy to convert its segmentation outputs into a binary quality label (optimal or suboptimal).
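The article does not spell out the exact thresholding rule, but a plausible version looks like the sketch below: binarize the predicted abdomen mask and call the frame optimal if the mask covers enough of the image. Both cutoff values here are hypothetical placeholders, not the published ones.

```python
import torch

def frame_is_optimal(seg_logits: torch.Tensor,
                     prob_thresh: float = 0.5,
                     area_thresh: float = 0.01) -> bool:
    """Hypothetical thresholding rule: a frame counts as optimal if the
    predicted abdominal mask covers at least area_thresh of the image.
    The cutoffs are illustrative, not the paper's values."""
    probs = torch.sigmoid(seg_logits)        # per-pixel abdomen probability
    mask = probs > prob_thresh               # binarize the segmentation
    coverage = mask.float().mean().item()    # fraction of pixels labeled abdomen
    return coverage >= area_thresh

# Toy usage on a random logit map (stands in for FetalCLIPSEG's output).
print(frame_is_optimal(torch.randn(1, 256, 256)))
```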
Experimental Validation and Key Findings
The models were rigorously evaluated on the ACOUSLIC-AI dataset, which comprises fetal abdominal ultrasound scans collected from pregnant women in Sierra Leone and Tanzania. These scans were acquired by novice users with minimal training, reflecting real-world low-resource conditions. The dataset is particularly challenging due to severe class imbalance: only a small percentage of frames contain clear abdominal structures.
The results are promising. FetalCLIPCLS consistently outperformed six baseline models, including well-known CNN-based (DenseNet, EfficientNet, VGG) and Transformer-based (Swin Transformer, DeiT, Vision Transformer) architectures, achieving the highest F1 score of 0.757. The F1 score is the harmonic mean of precision and recall, summarizing how reliably a model identifies optimal frames without flagging too many suboptimal ones.
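For intuition, here is how F1 is computed from a confusion matrix. The counts are made-up numbers chosen only to illustrate the calculation (they happen to land near 0.757, but they are not the paper's actual results).

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)   # of frames flagged optimal, how many truly are
    recall = tp / (tp + fn)      # of truly optimal frames, how many were flagged
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, not from the paper:
# 70 true positives, 20 false positives, 25 false negatives.
print(round(f1_score(70, 20, 25), 3))  # 0.757
```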
Even more impressively, FetalCLIPSEG, the adapted segmentation model, further improved performance, achieving an F1 score of 0.771. While FetalCLIPSEG showed a slight reduction in precision (it occasionally flags suboptimal frames as optimal), its higher recall means it catches most of the truly optimal frames. This demonstrates the feasibility and effectiveness of using a segmentation model for image quality assessment.
A significant finding was that FetalCLIPCLS, whose backbone was pretrained on a much smaller, domain-specific dataset (210,000 image-caption pairs), outperformed a Vision Transformer pretrained on a generic dataset roughly 1,900 times larger (400 million image-text pairs). This highlights the crucial importance of domain-specific pretraining for foundation models in medical applications.
Impact on Prenatal Care
This research demonstrates how parameter-efficient fine-tuning of fetal ultrasound foundation models can enable task-specific adaptations, significantly advancing prenatal care in resource-limited settings. By providing automated image quality assessment, these models can assist less-experienced sonographers in collecting more accurate data, leading to better assessments of fetal growth and improved monitoring of high-risk pregnancies.
The work underscores that ultrasound-specific foundation models can enhance diagnostic accuracy while remaining computationally efficient, making them suitable for deployment in environments with limited resources. For more details, you can refer to the full research paper: Advancing Fetal Ultrasound Image Quality Assessment in Low-Resource Settings.


