TL;DR: A study used Representational Similarity Analysis to compare six computational pathology foundation models, finding that models with the same training paradigm don’t always have similar internal representations. All models showed high slide-dependence but low disease-dependence, with stain normalization reducing slide-dependence. Vision-language models had more compact representations, while vision-only models were more distributed. These insights can improve model robustness and inform ensembling strategies.
The field of computational pathology (CPath) is rapidly advancing with the development of “foundation models.” These powerful AI models are designed to learn from vast datasets and then apply that knowledge to various tasks, such as identifying tumor types or predicting disease progression. While many studies have focused on how well these models perform on specific tasks, less is known about the underlying structure of the information they learn and how similar or different these structures are across various models.
A recent study delves into this very question, systematically analyzing the “representational spaces” of six prominent CPath foundation models. Think of representational space as the internal map a model creates to understand and categorize the complex visual information from tissue slides. The researchers used a technique called Representational Similarity Analysis (RSA), which is commonly used in computational neuroscience to compare how different parts of the brain process information.
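To make the RSA idea concrete, here is a minimal sketch of how two models' representations can be compared. The data is synthetic and the specific distance and correlation choices (correlation distance for the dissimilarity matrices, Spearman rank correlation between them) are common RSA defaults, not necessarily the exact settings used in the paper:

```python
import numpy as np

def rdm(embeddings):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between every pair of patch embeddings (upper triangle, flattened)."""
    corr = np.corrcoef(embeddings)
    iu = np.triu_indices_from(corr, k=1)
    return 1.0 - corr[iu]

def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation
    of the two vectors' ranks (assumes no ties, fine for toy data)."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

def rsa_similarity(emb_a, emb_b):
    """Second-order similarity: rank-correlate the two models' RDMs
    computed over the same set of image patches."""
    return spearman(rdm(emb_a), rdm(emb_b))

rng = np.random.default_rng(0)
emb_a = rng.normal(size=(20, 64))                     # model A: 20 patches, 64-dim
emb_b = emb_a + rng.normal(scale=0.1, size=(20, 64))  # model B: a noisy copy
print(rsa_similarity(emb_a, emb_a))  # 1.0 by construction
print(rsa_similarity(emb_a, emb_b))  # high, but below 1.0
```

The key point is that RSA compares models at the level of *pairwise relationships* between stimuli, so it works even when the two models have different embedding dimensions.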
The study examined models that use two main learning strategies: vision-language contrastive learning (like CONCH, PLIP, and KEEP), which learns by associating images with text descriptions, and self-distillation (like UNI v2, Virchow v2, and Prov-GigaPath), which learns by refining its own understanding of visual data. They used H&E stained image patches from The Cancer Genome Atlas (TCGA) to conduct their analysis.
One of the key findings was that UNI v2 and Virchow v2, both vision-only models, had the most distinct internal representations. Surprisingly, sharing a training approach (e.g., both being vision-only or both vision-language) didn’t guarantee that two models would have similar internal structures. For instance, Prov-GigaPath, a vision-only model, showed the highest average similarity across all models, including the vision-language ones.
The research also highlighted a significant “slide-dependence” in all models’ representations. This means that the models’ internal maps were heavily influenced by individual tissue slides, rather than just the disease type. While this might be useful for some tasks, it also suggests a potential lack of robustness to variations between slides, such as those caused by different hospitals or staining protocols. Interestingly, applying a technique called “stain normalization” (which standardizes the appearance of tissue stains) significantly reduced this slide-dependence, improving robustness.
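As a rough illustration of what stain normalization does, the toy function below matches each color channel's mean and standard deviation to a reference patch. This is a deliberate simplification: production methods such as Reinhard or Macenko normalization work in LAB or stain-vector space rather than raw RGB, and the images here are random arrays, not real tissue:

```python
import numpy as np

def match_channel_stats(image, target, eps=1e-8):
    """Toy stain-normalization stand-in: shift/scale each RGB channel of
    `image` so its mean and std match those of the `target` reference."""
    img = image.astype(np.float64)
    tgt = target.astype(np.float64)
    out = np.empty_like(img)
    for c in range(img.shape[-1]):
        mu_i, sd_i = img[..., c].mean(), img[..., c].std()
        mu_t, sd_t = tgt[..., c].mean(), tgt[..., c].std()
        out[..., c] = (img[..., c] - mu_i) / (sd_i + eps) * sd_t + mu_t
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(1)
target = rng.integers(100, 200, size=(32, 32, 3))  # "reference slide" patch
source = rng.integers(0, 120, size=(32, 32, 3))    # differently "stained" patch
normalized = match_channel_stats(source, target)
# per-channel statistics of `normalized` now approximate those of `target`
```

By pulling every slide toward a common color distribution before feature extraction, this kind of preprocessing removes exactly the slide-level appearance variation that the study found dominating the models' representations.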
Conversely, the models showed relatively low “disease-dependence.” This might seem counterintuitive given their strong performance in classifying tumor types. However, the researchers suggest that while the overall representations might vary, specific combinations of features crucial for disease classification could remain stable.
When looking at the “intrinsic dimensionality” of the representations, vision-language models tended to have more compact, lower-dimensional representations. This could be because the language component acts as a “bottleneck,” encouraging the model to compress visual information into a more concise form. Vision-only models, on the other hand, had more distributed, higher-dimensional representations, potentially preserving richer visual details. This difference in dimensionality might also contribute to the generally higher performance observed in vision-only models, though their larger training datasets could also play a role.
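One simple proxy for intrinsic dimensionality, sketched below on synthetic embeddings, is the number of principal components needed to explain most of the variance. This PCA-style estimate is only one of several estimators and is not necessarily the one used in the paper:

```python
import numpy as np

def effective_dimensionality(embeddings, variance_threshold=0.95):
    """Crude intrinsic-dimensionality proxy: the number of principal
    components needed to explain `variance_threshold` of the variance."""
    centered = embeddings - embeddings.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)  # singular values
    explained = s**2 / np.sum(s**2)
    return int(np.searchsorted(np.cumsum(explained), variance_threshold) + 1)

rng = np.random.default_rng(2)
# "compact" embedding: 512-dim vectors that really span only ~10 directions,
# loosely analogous to a vision-language model's bottlenecked features
basis = rng.normal(size=(10, 512))
compact = rng.normal(size=(200, 10)) @ basis
# "distributed" embedding: full-rank noise, analogous to a vision-only model
distributed = rng.normal(size=(200, 512))
print(effective_dimensionality(compact))      # close to 10
print(effective_dimensionality(distributed))  # much larger
```

The contrast between the two toy embeddings mirrors the study's finding: the same ambient dimension can hide very different effective dimensionalities.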
The implications of these findings are significant for the future of computational pathology. The high slide-specificity points to a need for models that are more robust to variations in data. Techniques like data augmentation or adversarial learning during training, and stain normalization during inference, could help address this. Understanding the similarities and differences between models can also guide “ensembling strategies,” where combining different models can improve performance. Instead of combining many similar models, focusing on more dissimilar, complementary ones could be more effective.
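One way such RSA scores could feed into ensemble design is a greedy diversity selection: repeatedly add the model least similar, on average, to those already chosen. The similarity matrix below is made up for illustration and does not reproduce the paper's actual numbers:

```python
import numpy as np

def pick_diverse_models(similarity, k):
    """Greedy ensemble selection: start from model 0, then repeatedly add
    the model with the lowest average similarity to the models chosen so
    far. `similarity` is a symmetric model-by-model RSA matrix."""
    chosen = [0]
    while len(chosen) < k:
        remaining = [m for m in range(len(similarity)) if m not in chosen]
        avg_sim = {m: np.mean([similarity[m][c] for c in chosen])
                   for m in remaining}
        chosen.append(min(avg_sim, key=avg_sim.get))
    return chosen

# Hypothetical RSA similarities between 4 models: models 0 and 1 are
# near-duplicates, model 3 is the outlier
sim = np.array([
    [1.0, 0.9, 0.6, 0.3],
    [0.9, 1.0, 0.6, 0.3],
    [0.6, 0.6, 1.0, 0.4],
    [0.3, 0.3, 0.4, 1.0],
])
print(pick_diverse_models(sim, 3))  # picks the dissimilar models 3 and 2,
                                    # skipping model 0's near-duplicate
```

The sketch captures the article's point: an ensemble of complementary models should beat an ensemble of redundant ones of the same size.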
This study provides a valuable framework for understanding the internal workings of CPath foundation models, moving beyond just performance metrics. By probing these internal representations, researchers can develop more effective and reliable AI tools for clinical settings. You can read the full paper here: Comparing Computational Pathology Foundation Models using Representational Similarity Analysis.


