Generative AI Models Show Variable Performance in Pulmonary CT Lung Cancer Diagnosis

TLDR: A recent study evaluated the diagnostic capabilities of advanced generative AI models, including GPT-4-turbo, Gemini-pro-vision, and Claude-3-opus, in interpreting pulmonary CT scans for lung cancer. While these models demonstrated potential, particularly with single-image inputs, their accuracy declined when presented with more complex multimodal information, such as multiple CT slices or patient clinical histories. The research highlights both the promise and the current limitations of AI in complex radiological interpretations, emphasizing the need for further refinement for successful clinical integration.

A new study has examined how well generative artificial intelligence (Gen-AI) models perform in pulmonary computed tomography (CT) imaging for lung cancer diagnosis. The research, published on June 29, 2025, evaluated three prominent Gen-AI models: GPT-4-turbo, Gemini-pro-vision, and Claude-3-opus. The objective was to assess their diagnostic accuracy and identify their strengths and weaknesses when interpreting complex radiological data.

The study, a retrospective analysis, utilized chest CT scans from 404 patients, including those with lung neoplasms (184 cases) and non-malignant lung conditions (210 cases). External validation was performed using datasets from The Cancer Genome Atlas and the Medical Imaging and Data Resource Center.

The models were tested across various clinical scenarios, including single-image CT diagnostics, consecutive CT slices, and single images combined with patient clinical histories.

In single-image CT diagnostics, Gemini and Claude were more accurate than GPT. However, diagnostic accuracy declined for all models when additional CT slices or clinical histories were incorporated, which suggests that the models struggle to integrate complex multimodal information. For instance, Gemini’s accuracy dropped sharply with consecutive slices, indicating potential difficulties in interpreting lesion continuity and spatial relationships. Similarly, GPT struggled with tasks combining CT images and clinical history, often treating auxiliary text as interference.

Further analysis indicated that Gen-AI models primarily relied on morphology and margins for malignancy predictions. Features like “spiculated” and “irregular” margins, as well as “mixed,” “solid,” and “hyperdense” densities, were heavily weighted. Even so, the models occasionally struggled to recognize critical imaging features and, concerningly, sometimes fabricated data. This “hallucination” of information poses a significant risk in clinical applications because fabricated findings could mislead a diagnosis.

The research also explored the impact of prompt design on model performance. Simplified prompts, which asked only for lesion identification and a preliminary diagnosis, led to significant improvements in diagnostic accuracy, sensitivity, specificity, and F1 scores across all models. This suggests that the way information is presented to these AI systems can profoundly influence their diagnostic capabilities.
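For readers unfamiliar with the four metrics named above, the following is a minimal sketch of how they are conventionally computed from a binary confusion matrix (malignant vs. non-malignant). The class name, function name, and the example counts are hypothetical and chosen only for illustration; they are not figures from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary malignant / non-malignant classification."""
    tp: int  # malignant cases the model called malignant
    fp: int  # non-malignant cases the model called malignant
    tn: int  # non-malignant cases the model called non-malignant
    fn: int  # malignant cases the model called non-malignant


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions of accuracy, sensitivity, specificity, and F1."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),   # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),   # true negative rate
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Hypothetical counts, NOT taken from the study.
example = ConfusionCounts(tp=80, fp=30, tn=70, fn=20)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because they capture different clinical costs: a missed malignancy lowers sensitivity, while a false alarm on a benign lesion lowers specificity.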

Despite these promising aspects, the study underscores the current limitations of Gen-AI in medical imaging. These include inconsistencies in diagnostic justifications, discrepancies between AI-generated parameters and actual image features, and a tendency for performance to degrade as information complexity increases. The authors emphasize that while Gen-AI holds potential for early tumor screening and streamlining diagnostic workflows, the models need to become more robust, more reliable, and better at integrating diverse clinical information before they can be adopted successfully in healthcare. The findings highlight the need for developers to remain objective when describing their models’ performance in practical applications and specific tasks, and for continued research to bridge the gap in domain expertise and address issues like data fabrication.
