AI's Unexpected Leap in Visualizing Concepts

TLDR: A study found that advanced Large Language Models (LLMs) like GPT-5 and the o3 family can perform complex mental imagery tasks, traditionally thought to require visual imagination, significantly better than humans. This suggests LLMs use propositional reasoning, challenging long-held beliefs about mental imagery and offering new benchmarks for AI cognitive abilities. Surprisingly, image-aided reasoning did not improve LLM performance.

A groundbreaking study has revealed that advanced Large Language Models (LLMs) are capable of performing complex mental imagery tasks, traditionally believed to require visual imagination, at a level significantly surpassing human performance. This finding challenges long-standing theories in cognitive psychology and opens new avenues for understanding artificial intelligence’s emergent cognitive capacities.

For decades, cognitive psychologists have debated the nature of mental imagery, with the dominant view suggesting it’s a pictorial process—meaning we ‘see’ images in our minds. A classic task used to support this view involves following a series of instructions to transform imagined letters and shapes into a final object, which subjects then identify. Success in this task was thought to be impossible without visual mental imagery.
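To make the task format concrete, here is a minimal sketch of how one instruction set and its scoring might be represented. The item is a hypothetical illustration in the style the article describes, not one of the study's actual instruction sets, and the `InstructionSet` and `is_correct` names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionSet:
    """One object-reconstruction item: ordered steps plus accepted answers."""
    steps: tuple[str, ...]
    accepted_answers: frozenset[str]


def is_correct(item: InstructionSet, response: str) -> bool:
    """Score a free-text response by a case-insensitive exact match."""
    return response.strip().lower() in item.accepted_answers


# Hypothetical item for illustration only; not taken from the study.
example = InstructionSet(
    steps=(
        "Imagine a capital letter D.",
        "Rotate it 90 degrees counterclockwise.",
        "Place a capital letter J directly beneath it.",
    ),
    accepted_answers=frozenset({"umbrella"}),
)

print(is_correct(example, "Umbrella"))
```

Exact matching is a simplification. A real evaluation would likely need human or rubric-based judgment to handle synonyms and partially correct descriptions.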

However, the new research, titled “Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models,” put this hypothesis to the test using state-of-the-art LLMs. Given that LLMs are primarily text-based and lack a ‘visual’ system in the human sense, they are ideal candidates to explore whether language alone could be sufficient for such tasks.

The researchers, Morgan McCarty and Jorge Morales from Northeastern University, designed 60 novel instruction sets for an object reconstruction task and added 12 from the original Finke et al. study of the classic task, for 72 in total. They then gave these text-only instructions to several leading LLMs, including OpenAI’s o3 family and GPT-5, Anthropic’s Claude, and Google’s Gemini. To establish a baseline, 100 human subjects also completed the same task.

The results were striking: the best LLMs, specifically GPT-5 and the o3 family of models, scored significantly above the human average of 54.7%, with a reported increase of 9.4% to 12.2%. This suggests that these AI models can effectively ‘imagine’ and manipulate objects based purely on linguistic descriptions.
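The article does not say whether the 9.4% to 12.2% increase is measured in percentage points or relative to the human score. The short sketch below works out both readings from the figures quoted above. The function names are assumptions made for this example, and neither reading is confirmed by the source.

```python
HUMAN_AVERAGE = 54.7              # human accuracy (%), as quoted in the article
REPORTED_INCREASES = (9.4, 12.2)  # reported increase range, as quoted


def as_percentage_points(baseline: float, increase: float) -> float:
    """Reading 1: the increase is added directly to the baseline accuracy."""
    return round(baseline + increase, 1)


def as_relative_gain(baseline: float, increase: float) -> float:
    """Reading 2: the increase is a percentage of the baseline accuracy."""
    return round(baseline * (1 + increase / 100), 1)


for inc in REPORTED_INCREASES:
    points = as_percentage_points(HUMAN_AVERAGE, inc)
    relative = as_relative_gain(HUMAN_AVERAGE, inc)
    print(f"+{inc}: {points}% if percentage points, {relative}% if relative")
```

Under the percentage-point reading the best models would score roughly 64% to 67%. Under the relative reading they would score roughly 60% to 61%.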

Interestingly, the study also explored an ‘image-aided’ approach where LLMs were prompted to generate and modify images at each step of the task. Contrary to expectations, this did not improve performance; it either lowered accuracy or left it unchanged. This suggests that the LLMs’ success was not due to a simulated visual process but rather to an underlying propositional reasoning capability.

The findings also resonate with observations in humans with aphantasia—a condition where individuals lack conscious visual mental imagery. Despite this, aphantasics can often perform mental imagery tasks surprisingly well, frequently reporting the use of verbal strategies. The LLMs’ success provides further evidence that non-imagistic, propositional reasoning might be sufficient for tasks long thought to be imagery-dependent, reigniting a significant debate in cognitive science about the fundamental nature of mental representations.

This research not only demonstrates an emergent cognitive capacity in LLMs but also provides the field with a new, robust benchmark for evaluating sophisticated cognitive behaviors in artificial systems. It suggests that the most advanced LLMs may be capable of extracting and manipulating spatial relations from textual information, offering a fresh perspective on how both human and artificial minds process complex information.
