Geometry-Guided AI Enhances Multi-View Mammography Analysis

TLDR: The research introduces GLAM, a visual language model for mammography that uses geometry-guided local alignment to better understand multi-view breast images. Unlike previous models, which often ignore the relationship between different mammogram views, GLAM learns fine-grained cross-view correspondences by aligning patches from one view to slices in the other, mimicking how radiologists interpret images. Pre-trained on EMBED, one of the largest open mammography datasets, GLAM outperforms existing methods in breast cancer detection, density prediction, and BI-RADS classification across several datasets, with gains in accuracy and generalization attributed to its use of the inherent geometry of mammography.

Mammography screening is a vital tool for the early detection of breast cancer. Deep learning methods hold significant promise for improving the speed and accuracy of mammography interpretation. However, developing powerful visual language models (VLMs) for this domain faces challenges due to limited medical data and inherent differences between natural and medical images.

Existing mammography VLMs, often adapted from models designed for natural images, frequently overlook crucial domain-specific characteristics. A prime example is the multi-view nature of mammography. Standard protocols produce two 2D images of the same 3D breast from different angles: craniocaudal (CC) and mediolateral oblique (MLO). Radiologists meticulously analyze both views together to understand ipsilateral correspondence, which is essential for accurately locating regions of interest like tumors and mitigating ambiguities caused by projection angles. Current deep learning methods often treat these views as independent images or fail to properly model their multi-view correspondence, leading to a loss of critical geometric context and suboptimal predictions.

Introducing GLAM: Geometry-Guided Local Alignment for Multi-View Mammography

Researchers from Yale University have proposed GLAM, short for Geometry-Guided Local Alignment for Multi-View Mammography. The model is designed for visual language pre-training and uses geometry guidance to address the shortcomings of previous methods. By incorporating prior knowledge about the multi-view imaging process of mammograms, GLAM learns local cross-view alignments and fine-grained local features. It does so by jointly applying global and local contrastive learning, both between the two views (visual-visual) and between images and reports (visual-language).

The core idea behind GLAM is to mimic how radiologists interpret mammograms by considering the geometric relationship between the CC and MLO views. The model is pre-trained on EMBED, one of the largest open mammography datasets, and has demonstrated superior performance compared to existing baselines across multiple datasets and settings.

How GLAM Works

The GLAM model involves several key steps to achieve its robust performance:

Pre-processing: Before images are fed into the model, mammograms undergo several pre-processing steps. These include removing the pectoral region from MLO views and rotating images so that the CC and MLO views are better aligned along the anterior-posterior (AP) axis. Random affine transformations are also applied to make the model more resilient to minor misalignments. Additionally, radiology reports are synthesized from tabular data and augmented to create diverse textual supervision signals.
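The report-synthesis step can be illustrated with a small template-based sketch. Everything specific here is an assumption made for illustration: the field names (view, side, density, birads), the template wording, and the helper name synthesize_report are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical templates; the wording is an assumption, not the paper's.
TEMPLATES = [
    "{view} view of the {side} breast. Breast density: {density}. "
    "BI-RADS category {birads}.",
    "Screening mammogram, {side} {view}. Density is {density}; "
    "assessment is BI-RADS {birads}.",
]


def synthesize_report(row, rng):
    """Render one tabular record as a short report.

    Picking a template at random is a simple stand-in for the text
    augmentation described in the article.
    """
    template = rng.choice(TEMPLATES)
    return template.format(**row)


# Hypothetical tabular record for a single image.
row = {
    "view": "MLO",
    "side": "left",
    "density": "heterogeneously dense",
    "birads": 2,
}
report = synthesize_report(row, random.Random(0))
print(report)
```

Passing an explicit random.Random instance keeps the augmentation reproducible, which is convenient when debugging a data pipeline.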

Global Multi-view Visual Language Pre-training: At a global level, GLAM extracts visual features from both CC and MLO views and textual features from the radiology report. It then optimizes a multi-view contrastive loss, ensuring that features from both views of the same breast are aligned. Symmetrically, it also aligns image features from each view with the corresponding text features, allowing the model to learn high-level semantic information from the reports.
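As a rough illustration of the global objective, the sketch below uses a symmetric InfoNCE loss, the contrastive loss popularized by CLIP, and sums it over the three pairings described above (CC with MLO, CC with text, MLO with text). The temperature value, the equal weighting of the three terms, and the function names are assumptions; the paper's exact formulation may differ.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE: row i of `a` should match row i of `b`."""
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature  # (B, B) similarity matrix

    def cross_entropy_on_diagonal(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


def global_loss(cc, mlo, text):
    """Sum of view-view and view-text terms (equal weights are an assumption)."""
    return info_nce(cc, mlo) + info_nce(cc, text) + info_nce(mlo, text)


rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))  # stand-in embeddings for 4 breasts
aligned = global_loss(feats, feats, feats)
misaligned = global_loss(feats, feats[::-1], feats[::-1])
print(aligned, misaligned)
```

In the toy example, reversing the batch order breaks every CC-to-MLO and CC-to-text pairing, so the misaligned loss is larger than the aligned one.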

Geometry-Guided Local Alignment: This is the model's main contribution. In addition to global alignment, the model performs local alignment using patch features. It aggregates the raw patch features into “super-patches” with larger receptive fields, which capture higher-level semantic information. The key step is the “patch-to-slice” alignment along the AP axis. Given the known geometry of mammography, image slices from both views at the same AP position represent the same 3D breast tissue. A patch in one view is therefore aligned with an entire slice in the other view using a multi-head cross-attention mechanism. This lets the model learn fine-grained positional relationships and semantic correspondence across views while respecting the actual 3D breast structure.
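The patch-to-slice idea can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the paper's implementation: super-patches are formed by plain average pooling, the AP axis is assumed to run along the column index of the patch grid, and attention is single-head with no learned query, key, or value projections (the article describes a multi-head mechanism). The function names are hypothetical.

```python
import numpy as np


def super_patches(grid, k=2):
    """Average-pool a (H, W, D) patch grid into (H//k, W//k, D) super-patches."""
    H, W, D = grid.shape
    grid = grid[: H - H % k, : W - W % k]  # drop rows/cols that do not fit
    return grid.reshape(H // k, k, W // k, k, D).mean(axis=(1, 3))


def patch_to_slice(query_grid, key_grid):
    """Each query patch at AP index c attends over the other view's slice c.

    Assumes the AP axis is the column axis, so a "slice" is one column
    of the patch grid.
    """
    D = query_grid.shape[-1]
    # scores[c, i, j]: query patch (i, c) against key patch (j, c).
    scores = np.einsum("icd,jcd->cij", query_grid, key_grid) / np.sqrt(D)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Weighted sum over the slice, returned in (H, W, D) layout.
    return np.einsum("cij,jcd->icd", weights, key_grid)


rng = np.random.default_rng(0)
cc_patches = rng.normal(size=(4, 6, 8))   # stand-in CC patch features
mlo_patches = rng.normal(size=(4, 6, 8))  # stand-in MLO patch features

cc_sp = super_patches(cc_patches)
mlo_sp = super_patches(mlo_patches)
cc_attended = patch_to_slice(cc_sp, mlo_sp)
print(cc_sp.shape, cc_attended.shape)
```

Because attention is restricted to a single column, a CC patch can only draw on MLO features at the same AP position, which is how the geometric prior enters this sketch.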

To further enhance local positional awareness, GLAM uses negative samples not only from different positions within the same patient but also from the same position across different patients in the batch. This forces the model to focus on the actual patch features rather than just positional encoding.
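One way to picture this negative-sampling scheme is a local InfoNCE loss whose candidate set is masked. In the sketch below, each anchor patch keeps its positive (same patient, same position) plus two kinds of negatives: other positions in the same patient and the same position in other patients. The tensor layout, temperature, and function name are assumptions for illustration only.

```python
import numpy as np


def local_contrastive_loss(q, k, temperature=0.1):
    """q, k: (B, P, D) unit-normalized local features from the two views.

    B is the number of patients in the batch, P the number of positions.
    The positive for anchor (b, p) is k[b, p].
    """
    B, P, _ = q.shape
    sim = np.einsum("bpd,cqd->bpcq", q, k) / temperature   # (B, P, B, P)
    same_patient = np.eye(B, dtype=bool)[:, None, :, None]
    same_position = np.eye(P, dtype=bool)[None, :, None, :]
    # Keep candidates that share the patient OR the position with the anchor.
    keep = same_patient | same_position
    logits = np.where(keep, sim, -np.inf).reshape(B, P, B * P)
    # Log-sum-exp over the kept candidates, computed stably.
    m = logits.max(axis=-1, keepdims=True)
    log_z = m[..., 0] + np.log(np.exp(logits - m).sum(axis=-1))
    positive = np.einsum("bpd,bpd->bp", q, k) / temperature
    return float(np.mean(log_z - positive))


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 8))
x = x / np.linalg.norm(x, axis=-1, keepdims=True)

matched = local_contrastive_loss(x, x)
mismatched = local_contrastive_loss(x, x[:, ::-1])  # positions reversed
print(matched, mismatched)
```

The cross-patient negatives share the anchor's position, so positional encoding alone cannot separate them from the positive; the patch content has to carry the signal.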

Performance and Impact

GLAM was evaluated on three datasets: EMBED (in-domain), and VinDr and RSNA-Mammo (both out-of-domain). It consistently outperformed all baselines on BI-RADS prediction, density prediction, and cancer prediction across zero-shot, linear probing, and full fine-tuning settings. Notably, GLAM showed significant improvements in multi-view prediction tasks, demonstrating its ability to model multi-view geometry and extract complementary features from each view.

The research highlights that ignoring either view in mammography can lead to diagnostic errors, especially in deep-learning models that lack prior knowledge of the imaging process. GLAM’s geometry-guided local alignment module provides this fine-grained cross-view awareness, and the resulting model is one of the largest and most robust screening mammography foundation CLIP models to date. For more technical details, refer to the full research paper.

This work represents a significant step forward in developing more accurate and reliable AI tools for mammography interpretation, potentially leading to earlier and more precise breast cancer detection.

Nikhil Patel
https://blogs.edgentiq.com
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
