Investigating Synthetic Image Augmentation for Atypical Cell Detection in Histopathology

TLDR: A study on the MIDOG 2025 competition found that both ImageNet-pretrained ConvNeXt-Small and histopathology-pretrained Lunit ViT models achieved high performance (AUROC ≈95%) in classifying atypical mitotic figures. Surprisingly, adding synthetic atypical images to balance the dataset did not consistently improve results, suggesting that while both backbones are viable, naive synthetic balancing has limited benefit and domain-pretrained models offer more stability.

A recent study examines whether synthetic image augmentation helps with imbalanced classification, using the MIDOG 2025 Atypical Cell Detection Competition as its testbed. The competition addresses a critical challenge in histopathology: distinguishing normal from atypical mitotic figures in medical images, a task complicated by significant class imbalance and by variation across medical domains.

The researchers, including Leire Benito-Del-Valle and Adrian Galdran, compared two neural network backbones: ConvNeXt-Small, pretrained on the ImageNet dataset, and a histopathology-specific Vision Transformer (ViT) developed by Lunit and trained with self-supervision. They also tested whether generating additional synthetic atypical examples could help balance the dataset, which originally contained far more normal examples (9,408) than atypical ones (1,741).
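To make the scale of the imbalance concrete, the short sketch below works through the arithmetic on the two class counts quoted above. The function name `imbalance_summary` is illustrative, and the "synthetic needed" figure assumes that balancing means a strict 1:1 class ratio, which the article does not confirm.

```python
# Class counts as reported in the article.
N_NORMAL = 9408
N_ATYPICAL = 1741


def imbalance_summary(n_majority: int, n_minority: int) -> dict:
    """Summarize class imbalance for a two-class dataset."""
    total = n_majority + n_minority
    return {
        "minority_fraction": n_minority / total,
        "imbalance_ratio": n_majority / n_minority,
        # Assumption: "balanced" means 1:1; the study may have used another target.
        "synthetic_needed_for_1_to_1": max(n_majority - n_minority, 0),
    }


summary = imbalance_summary(N_NORMAL, N_ATYPICAL)
print(f"atypical share of dataset: {summary['minority_fraction']:.1%}")
print(f"normal-to-atypical ratio:  {summary['imbalance_ratio']:.2f} : 1")
print(f"synthetic images for 1:1:  {summary['synthetic_needed_for_1_to_1']}")
```

Under that 1:1 assumption, roughly four synthetic atypical images would be needed for every real one, which helps explain why artifacts in the generated images could matter.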

The study employed a five-fold cross-validation strategy to evaluate the models. Both backbones demonstrated strong performance, achieving an average AUROC (Area Under the Receiver Operating Characteristic curve) of approximately 95%. ConvNeXt-Small showed slightly higher peak performance, while the Lunit ViT exhibited greater stability across different folds of the data. However, a surprising finding was that the synthetic balancing of the dataset did not consistently lead to improved results. In fact, models trained only on real data often performed marginally better or on par with those trained on a combination of real and synthetic data.
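For readers unfamiliar with the evaluation setup, here is a minimal sketch of stratified five-fold splitting and a rank-based AUROC, written in plain Python. It is not the authors' code: the scores are random toy values standing in for model outputs, and in a real cross-validation run a model would be retrained on the other four folds before scoring each held-out fold.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Assign indices to k folds so each fold keeps a similar class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin across folds
    return folds


# Toy data (hypothetical): 40 atypical (1) and 200 normal (0) samples.
toy_rng = random.Random(42)
labels = [1] * 40 + [0] * 200
scores = [toy_rng.gauss(1.5, 1.0) if y == 1 else toy_rng.gauss(0.0, 1.0) for y in labels]

folds = stratified_folds(labels, k=5)
fold_aurocs = [auroc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]
print(f"per-fold AUROC: {[round(a, 3) for a in fold_aurocs]}")
print(f"mean AUROC:     {mean(fold_aurocs):.3f}")
```

Reporting the spread across folds alongside the mean is what allows a statement such as "more stable across folds" to be made about one backbone versus another.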

Further evaluation on a preliminary hidden test set, designed to be out-of-distribution, reinforced these observations. ConvNeXt-Small achieved the highest AUROC of 95.4% on this set, with Lunit remaining competitive in terms of balanced accuracy. The authors suggest that both ImageNet-pretrained and domain-pretrained backbones are viable for atypical mitosis classification. Domain-specific pretraining appears to offer robustness, while ImageNet pretraining can lead to higher peak performance. The limited benefit of naive synthetic balancing indicates that while it can address severe imbalance, careful consideration is needed to avoid introducing artifacts that might hinder generalization.
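Balanced accuracy is reported alongside AUROC because plain accuracy can look good on an imbalanced set even when the rare class is ignored. The sketch below illustrates this with a hypothetical 90/10 toy split; the numbers are invented for illustration and are not results from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (sensitivity and specificity for two classes)."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy example: 90 normal (0), 10 atypical (1), and a classifier that
# always predicts "normal".
y_true = [0] * 90 + [1] * 10
always_normal = [0] * 100

print(accuracy(y_true, always_normal))           # 0.9
print(balanced_accuracy(y_true, always_normal))  # 0.5
```

A degenerate classifier scores 90% accuracy here but only 50% balanced accuracy, which is why the latter is the more informative number for atypical mitosis classification.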

To generate the synthetic images, the authors used a two-stage training strategy for a latent diffusion model built on a Diffusion Transformer (DiT) architecture. The model was first pretrained on large-scale canine histopathology datasets to capture general mitotic figure features, then fine-tuned on the MIDOG 2025 dataset with class labels to enable class-aware synthesis. For the classification models, the paper also documents implementation details such as input size, preprocessing, data augmentations, and optimization protocols.

The findings offer useful guidance for future work in medical image analysis, particularly for tasks involving rare disease detection or imbalanced datasets. They highlight the potential of different pretraining strategies and underscore how difficult it is to use synthetic data effectively for augmentation. For a deeper dive into the methodology and results, you can read the full paper here.
