TLDR: A systematic review of 48 studies reveals the rapid growth and promising early results of machine learning models integrating pathology images and omic data for cancer survival prediction. While these multimodal models generally outperform unimodal approaches, the review highlights significant methodological biases, inconsistent reporting, heavy reliance on a single dataset (TCGA), and a lack of clinical utility evaluation, indicating the field’s immaturity and the need for more robust research practices before clinical translation.
A recent systematic review delves into the rapidly expanding field of machine learning models that combine pathology images and high-throughput omic data to predict overall survival in cancer patients. This area of research holds significant promise for improving cancer prognostication, which is crucial for guiding treatment decisions, designing clinical trials, and planning healthcare resources.
The review, conducted by a team including Charlotte Jennings and Darren Treanor, aimed to clarify the methodological quality, reporting standards, and clinical relevance of these multimodal models. They performed a systematic search across major databases like EMBASE, PubMed, and Cochrane CENTRAL, identifying 48 eligible studies published since 2017. All these studies utilized The Cancer Genome Atlas (TCGA) dataset, a large public repository of cancer data.
The studies covered survival prediction for cancers across 19 different organs, with brain, breast, lung, and kidney cancers being the most frequently studied. The types of data integrated alongside whole slide images (WSI) included gene expression (mRNA), somatic mutation data, micro-RNA, copy number variation (CNV), single nucleotide variation (SNV), DNA methylation, and protein expression. Clinical data, such as age, gender, and cancer stage, were also incorporated into some models.
The modeling approaches varied, ranging from regularized Cox regression methods to classical machine learning and deep learning techniques. Deep learning models predominated, especially since 2019, with many using Cox-based loss functions for survival prediction. Most models employed feature-level fusion, combining data from different modalities into a single representation, often using attention-based mechanisms to identify important information. A few studies explored decision-level fusion, where predictions from separate unimodal models are combined later.
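To make the idea of a "Cox-based loss function" concrete, here is a minimal sketch of the negative log partial likelihood that such losses are built on. It is an illustration only, not the implementation used by any reviewed study; the function name `cox_neg_log_partial_likelihood` and its arguments are hypothetical, and in a real multimodal model `risk_scores` would be the network output computed from the fused image and omic features.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log partial likelihood of a Cox model.

    risk_scores: predicted log-risk per patient (higher = worse prognosis)
    times:       observed follow-up time per patient
    events:      1/True if death was observed, 0/False if censored
    """
    total = 0.0
    n_events = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        risk_set = [risk_scores[j] for j in range(len(times)) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set.
        m = max(risk_set)
        log_denominator = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += risk_scores[i] - log_denominator
        n_events += 1
    if n_events == 0:
        raise ValueError("at least one observed event is required")
    return -total / n_events
```

The loss is lower when patients who die earlier are assigned higher risk scores, which is the behaviour a survival model is trained to produce.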
A notable finding was that, in all but one of the studies that reported a comparison, multimodal models outperformed simpler unimodal models (those using only one type of data, such as images or omics). Performance, measured by the concordance index (c-index), ranged from 0.550 to 0.857. While promising, the size of the improvement varied considerably. The review also highlighted that models developed with 400 or more participants tended to achieve the best results on internal test sets.
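For readers unfamiliar with the c-index, the sketch below shows one common way to compute it (Harrell's formulation): the fraction of comparable patient pairs in which the patient who died earlier was given the higher predicted risk. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The function name and the choice to skip pairs with tied survival times are simplifying assumptions for illustration, not details taken from the review.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an observed
        # event; pairs with tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death correctly ranked as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

On this scale, the reported range of 0.550 to 0.857 spans models that rank patients only slightly better than chance up to models that order most comparable pairs correctly.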
Despite the rapid growth and promising early results, the review identified significant limitations. All included studies were judged to be at unclear or high overall risk of bias due to inconsistent reporting and limited external validation. Common issues included a lack of detailed information about the TCGA datasets, poor presentation of participant characteristics, and insufficient discussion of data acquisition processes. Only a handful of studies evaluated model calibration (how well a model’s predicted probabilities match observed outcomes) or clinical utility (for example, through decision curve analysis).
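Decision curve analysis, mentioned above as one way to assess clinical utility, is built on the "net benefit" of acting on a model's predictions at a chosen risk threshold. The sketch below is a simplified illustration that assumes a binary outcome (for example, death within a fixed time horizon) and ignores censoring; the function name and inputs are hypothetical rather than drawn from the review.

```python
def net_benefit(predicted_probs, outcomes, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold.

    predicted_probs: predicted probability of the event per patient
    outcomes:        1/True if the event occurred, 0/False otherwise
    threshold:       risk threshold in (0, 1) at which a clinician would act
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    flagged = [(p >= threshold, bool(y)) for p, y in zip(predicted_probs, outcomes)]
    true_pos = sum(1 for hit, y in flagged if hit and y)
    false_pos = sum(1 for hit, y in flagged if hit and not y)
    # False positives are weighted by the odds of the threshold, which encodes
    # how many unnecessary interventions are acceptable per true case found.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)
```

Plotting this quantity across a range of thresholds, alongside the "treat everyone" and "treat no one" strategies, gives the decision curve; a model is useful at a given threshold only if its net benefit exceeds both.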
The heavy reliance on the TCGA dataset across all studies raises concerns about potential overfitting to this single data source. The authors emphasize the need for more diverse datasets from varied populations and technical sources to ensure models are robust and generalizable for real-world clinical application. They also recommend greater focus on robust reporting, using guidelines like TRIPOD+AI and PROBAST+AI, and evaluating the real-world clinical utility and cost-benefit of these complex models, especially given the expense of generating high-throughput omic data not yet routine in clinical workflows.
In conclusion, while machine learning-based multimodal models for cancer survival prediction show significant potential, the field is still in its early stages. Future progress hinges on addressing methodological biases, diversifying data sources, improving reporting standards, and demonstrating clear clinical value. For more details, refer to the full research paper.


