New AI Tool Detects Unrealistic Shapes in Synthetic Medical Images

TLDR: This research introduces a novel knowledge-based anomaly detection method for identifying unrealistic shape artifacts in synthetic medical images, particularly mammograms. The two-stage framework uses a feature extractor to analyze angle gradients along anatomical boundaries and an isolation forest to detect anomalies. Tested on synthetic mammography datasets, the method effectively concentrated artifacts in highly anomalous partitions and showed strong agreement with human experts. This tool is crucial for ensuring the quality and reliability of AI-generated medical data, offering a model-agnostic and interpretable way to evaluate synthetic images for anatomical accuracy.

The rise of artificial intelligence in medical imaging has brought about a promising solution to data scarcity: synthetic data. This artificial data, designed to mimic real patient information, can help train powerful machine learning models. However, a critical challenge remains: ensuring the quality and realism of these synthetic images. Without proper assessment, they can introduce artifacts, distortions, and unrealistic features that compromise the performance and clinical utility of AI models.

A recent research paper, titled ‘Knowledge-based anomaly detection for identifying network-induced shape artifacts’, introduces a novel method to tackle this problem. Authored by Rucha Deshpande, Tahsin Rahman, Miguel Lago, Adarsh Subbaswamy, Jana G. Delfino, Ghada Zamzmi, Elim Thompson, Aldo Badano, and Seyed Kahaki, the work focuses on detecting ‘network-induced shape artifacts’ in synthetic medical images, specifically mammograms.

Understanding the Problem with Synthetic Data

Generative AI models, while powerful, can sometimes prioritize overall data distributions over fine-grained, image-level details or adherence to clinical and anatomical constraints. This can lead to unrealistic features, such as unnatural geometric patterns, synthetic noise within breast tissue, or artificial discontinuities along tissue boundaries. These artifacts can introduce systematic biases, potentially affecting the diagnostic accuracy of AI models trained on such data.

Current evaluation methods often rely on dataset-wide metrics, which are useful for general trends but can miss localized artifacts present in only a fraction of the dataset. Given the vast number of images in synthetic datasets, visual inspection of every image for subtle flaws is impractical and inefficient.

A Novel Two-Stage Approach

The researchers propose a knowledge-based anomaly detection method that operates in two stages:

Feature Extractor: This stage constructs a specialized feature space by analyzing the per-image distribution of angle gradients along anatomical boundaries. Essentially, it looks at how the shape of an anatomical structure (like a breast) changes along its edges. This representation is designed to capture local shape variations while maintaining overall anatomical correspondence, regardless of size differences.
Anomaly Detector: An isolation forest is trained on real patient data to learn what normal anatomical shape characteristics look like. It then assigns an ‘anomaly score’ to each synthetic image. Highly negative scores indicate a higher likelihood of artifacts, while scores near zero or positive suggest a normal image.
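
The two stages above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the boundaries are synthetic curves rather than segmented mammograms, the per-image distribution of angle gradients is summarized with a handful of percentiles (an assumption made here for simplicity), and the helper names angle_gradient_features and toy_boundary are hypothetical. Only numpy and scikit-learn's IsolationForest are real library components.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Assumption: summarize the angle-gradient distribution with these percentiles.
PERCENTILES = (1, 5, 25, 50, 75, 95, 99)


def angle_gradient_features(boundary):
    """Return percentile features for an ordered (N, 2) array of boundary points."""
    segments = np.diff(boundary, axis=0)  # vectors between neighbouring points
    angles = np.unwrap(np.arctan2(segments[:, 1], segments[:, 0]))
    gradients = np.diff(angles)  # change in direction along the boundary
    return np.percentile(gradients, PERCENTILES)


def toy_boundary(rng, n_points=200, sawtooth=False):
    """Hypothetical stand-in for a real boundary: a smooth, randomly sized arc."""
    t = np.linspace(0.0, np.pi, n_points)
    scale = rng.uniform(0.5, 2.0)  # size differences between images
    eps = rng.uniform(0.0, 0.03)
    k = rng.integers(1, 4)
    phase = rng.uniform(0.0, 2.0 * np.pi)
    radius = 1.0 + eps * np.sin(k * t + phase)
    if sawtooth:
        # Simulated shape artifact: a jagged run along part of the boundary.
        radius[40:160] += 0.02 * (-1.0) ** np.arange(120)
    return scale * np.column_stack([radius * np.cos(t), radius * np.sin(t)])


rng = np.random.default_rng(0)
real_train = np.array([angle_gradient_features(toy_boundary(rng)) for _ in range(300)])
real_test = np.array([angle_gradient_features(toy_boundary(rng)) for _ in range(50)])
artifact = angle_gradient_features(toy_boundary(rng, sawtooth=True))

# Stage 2: fit on "real" shapes only, then score unseen shapes.
forest = IsolationForest(n_estimators=100, random_state=0).fit(real_train)
real_scores = forest.decision_function(real_test)  # lower means more anomalous
artifact_score = forest.decision_function(artifact.reshape(1, -1))[0]

print("median score of held-out smooth boundaries:", np.median(real_scores))
print("score of jagged boundary:", artifact_score)
```

Because the features are built from angles rather than distances, rescaling a boundary leaves them unchanged, which mirrors the size independence described above. The score convention also matches the article: scikit-learn's decision_function returns negative values for outliers and positive values for inliers.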

A key advantage of this method is its interpretability, stemming from its knowledge-based design. It can identify anatomically unrealistic images irrespective of the generative model used to create them.

Demonstrating Effectiveness

The method was tested on two synthetic mammography datasets generated by different AI architectures: a latent diffusion model and StyleGAN2. The results were compelling:

Quantitative Evaluation: The method successfully concentrated artifacts in the most anomalous partition (the 1st percentile of images), achieving high AUC values of 0.97 and 0.91 for the two datasets.
Human Reader Study: A study involving three imaging scientists confirmed that images flagged by the method as containing network-induced shape artifacts were also identified by human readers. Mean agreement rates were approximately 66% and 68% for the most anomalous partition, which is 1.5 to 2 times higher than for the least anomalous partition. Kendall-Tau correlations between algorithmic and human rankings were moderate, at 0.45 and 0.43.
Artifact Identification: The method was able to identify artifacts that were not present in the real patient training data, confirming its ability to detect issues originating from the generative process itself.
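
To make the reader-study numbers more concrete, here is a minimal sketch of how an agreement rate per partition and a Kendall-Tau correlation could be computed. The six scores and reader votes below are made up for illustration and are not data from the study. The definition of agreement used here (the mean fraction of readers who flagged an image) is an assumption, and the paper's exact protocol may differ. kendalltau is the real SciPy function.

```python
import numpy as np
from scipy.stats import kendalltau

# Made-up values for six synthetic images; NOT results from the paper.
anomaly_scores = np.array([-0.30, -0.22, -0.10, -0.05, 0.02, 0.08])
reader_votes = np.array([3, 2, 2, 1, 0, 1])  # readers (out of 3) who flagged an artifact
n_readers = 3

# Sort so that the most anomalous (most negative) images come first.
order = np.argsort(anomaly_scores)
partition_size = 2
most_anomalous = order[:partition_size]
least_anomalous = order[-partition_size:]

# Assumed definition of agreement: mean fraction of readers flagging an image.
agreement_most = reader_votes[most_anomalous].mean() / n_readers
agreement_least = reader_votes[least_anomalous].mean() / n_readers

# Negate the scores so that a positive tau means the algorithm and the
# readers rank the same images as more suspicious.
tau, p_value = kendalltau(-anomaly_scores, reader_votes)

print("agreement, most anomalous partition:", agreement_most)
print("agreement, least anomalous partition:", agreement_least)
print("Kendall tau:", tau)
```

A tau of 1 would mean the algorithm and the readers order the images identically, and 0 would mean no rank relationship, so the reported values of 0.45 and 0.43 sit between those extremes.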

Broader Impact and Future Directions

This method represents a significant step forward in the responsible use of synthetic data. It allows developers to evaluate synthetic images against known anatomical constraints, pinpoint specific issues, and improve the overall quality of synthetic datasets. Because artifacts are concentrated in the most anomalous partition, it also makes visual searches for artifacts more efficient and reduces the burden on expert readers.

While demonstrated with breast imaging, the method is designed to be shape and modality agnostic, making it highly versatile. It could be generalized to detect artifacts in other anatomies and imaging modalities, such as lungs in chest radiographs, or organs like the liver and kidneys in abdominal CT scans. The researchers plan to extend their work to other medical imaging modalities and further explore the localization and origin of detected artifacts.

This research provides a crucial tool for quality assurance in the deployment of large-scale synthetic datasets, especially where manual visual inspection of individual images is impractical.
