How AI Learns to Think Like Humans in Novel Contexts

TLDR: This research introduces the Model Synthesis Architecture (MSA), a computational framework that explains how people reason in novel, ‘open-world’ situations. MSA uses large language models to identify relevant information and probabilistic programs to build tailored mental models on the fly. Evaluated on a sports reasoning dataset, MSA captures human judgments better than language model-only baselines, especially when new variables and unexpected scenarios are involved, suggesting a path to more human-like, flexible AI reasoning.

When faced with new and unexpected situations, humans possess a remarkable ability to pull together relevant information from their vast knowledge base and use it to make sense of the world, draw inferences, and predict outcomes. This flexible and coherent way of thinking, known as ‘open-world cognition,’ is a cornerstone of human intelligence. A recent research paper, titled ‘Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models,’ explores a computational approach to understanding and replicating this unique human capability.

The paper, authored by Lionel Wong, Katherine M. Collins, Lance Ying, Cedegao E. Zhang, Adrian Weller, Tobias Gerstenberg, Timothy O’Donnell, Alexander K. Lew, Jacob D. Andreas, Joshua B. Tenenbaum, and Tyler Brooke-Wilson, delves into the long-standing idea in cognitive science that people reason using ‘mental models’ – structured internal representations that mirror aspects of the world. These models help us maintain consistent beliefs, integrate new information, and evaluate different possibilities.

While traditional Bayesian models in cognitive science have successfully explained human judgments in many specific tasks, they often fall short when confronted with truly novel situations. These models are typically designed for a limited scope and cannot easily incorporate new, unforeseen variables or dependencies. This is where the concept of ‘open-world’ reasoning becomes crucial: how do we maintain coherent reasoning in a specific context while drawing on a vast, globally relevant pool of background knowledge?

Introducing the Model Synthesis Architecture (MSA)

The researchers propose a novel computational framework called the Model Synthesis Architecture (MSA) to address this challenge. MSA hypothesizes that human minds construct small, ad-hoc mental models on the fly, tailored to the specific demands of a task. By reasoning within these smaller, custom-built models, MSA can achieve local coherence over the relevant variables, while its ability to synthesize arbitrary models allows it to operate in open-ended environments where relevant considerations are not fixed in advance.

The MSA approach breaks down open-world reasoning into two main subproblems: first, ‘synthesizing’ ad-hoc models that include all relevant variables for a given situation; and second, ‘reasoning within’ that constructed model using general algorithms for belief updating and decision-making.

In their implementation, the team uses large language models (LMs) to handle the ‘global relevance’ aspect – retrieving and organizing relevant background knowledge. For the ‘local coherence’ aspect, they employ probabilistic programming languages (PPLs) to construct bespoke, coherent world models. This combination allows MSA to process arbitrary natural language inputs (thanks to the LM front end) and express arbitrary probabilistic models (due to the general-purpose PPL modeling language).
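
The two-stage split can be illustrated with a minimal Python sketch. This is not the authors' implementation: the LM stage is replaced by a hypothetical stub, synthesize_model, that returns a hand-written model, and plain Python enumeration stands in for a probabilistic programming language. The athlete names, the strong/weak latent variable, and the 0.8/0.2 win probabilities are all assumptions made for illustration.

```python
from itertools import product


def synthesize_model(vignette):
    """Stage 1 (stubbed): return a small model tailored to the vignette.

    In MSA this step is handled by a language model; here the text is ignored
    and a hand-written toy model is returned (an assumption for illustration).
    """
    latents = {"alex_strong": (True, False), "blake_strong": (True, False)}

    def prior(world):
        # Each athlete is independently strong with probability 0.5.
        return 0.5 ** len(world)

    def p_alex_wins(world):
        # Equal strength gives a coin flip; otherwise the stronger athlete is favored.
        if world["alex_strong"] == world["blake_strong"]:
            return 0.5
        return 0.8 if world["alex_strong"] else 0.2

    return latents, prior, p_alex_wins


def posterior(latents, prior, likelihood, query):
    """Stage 2: exact Bayesian inference by enumerating every latent world."""
    names = list(latents)
    matching = total = 0.0
    for values in product(*(latents[name] for name in names)):
        world = dict(zip(names, values))
        weight = prior(world) * likelihood(world)  # P(world) * P(evidence | world)
        total += weight
        if query(world):
            matching += weight
    return matching / total


latents, prior, p_alex_wins = synthesize_model("Alex beat Blake in a canoe race.")
p_strong = posterior(latents, prior, p_alex_wins, lambda w: w["alex_strong"])
print(f"P(Alex is strong | Alex won) = {p_strong:.2f}")
```

Under these toy numbers, observing the win raises the belief that Alex is strong from the 0.5 prior to about 0.65. Because the model is small, every belief it produces is consistent with every other, which is the ‘local coherence’ the architecture relies on.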

Evaluating MSA: The ‘Model Olympics’

To evaluate MSA, the researchers created a new reasoning dataset called ‘Model Olympics,’ consisting of natural language vignettes about sporting events. This domain was chosen because it naturally combines intuitive causal reasoning, uncertainty, and diverse latent variables, providing a structured yet open-ended setting for testing flexible cognitive architectures. For instance, in a new sports scenario, one cannot know in advance whether injuries, weather, team dynamics, or new equipment will be relevant.

Three experiments were conducted with human participants and compared against the MSA and other baseline models:

Experiment 1 (Detailed Backgrounds): Participants reasoned about situations where all relevant causal relationships were explicitly described in language.
Experiment 2 (Underspecified Backgrounds): Key relationships between variables were only briefly mentioned or implied, requiring participants to retrieve details from their background knowledge.
Experiment 3 (Participant-Generated Novel Details): This experiment introduced new, uncontrolled details, written as ‘sports commentary’ by other naive human participants (for example, that an athlete pulled a muscle). Participants had to integrate these details into their reasoning, which directly tests open-world capabilities.
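
To make the Experiment 3 setting concrete, the sketch below shows how a novel detail such as a pulled muscle might enter a small probabilistic model as an extra variable. It is a toy example under stated assumptions, not the paper's model: the athletes, the function names, the rule that an injury cancels an athlete's strength, and the 0.8/0.2 win probabilities are all hypothetical.

```python
from itertools import product


def p_alex_wins(alex_strong, blake_strong, alex_injured):
    """Toy outcome model (assumed): an injury cancels Alex's strength advantage."""
    alex_effective = alex_strong and not alex_injured
    if alex_effective == blake_strong:
        return 0.5
    return 0.8 if alex_effective else 0.2


def p_alex_strong_given_loss(alex_injured):
    """P(Alex is strong | Alex lost, injury status), by exact enumeration."""
    matching = total = 0.0
    for alex_strong, blake_strong in product((True, False), repeat=2):
        prior = 0.5 * 0.5  # independent 50/50 priors on each athlete's strength
        p_loss = 1.0 - p_alex_wins(alex_strong, blake_strong, alex_injured)
        weight = prior * p_loss
        total += weight
        if alex_strong:
            matching += weight
    return matching / total


without_injury = p_alex_strong_given_loss(alex_injured=False)
with_injury = p_alex_strong_given_loss(alex_injured=True)
print(f"No injury reported: {without_injury:.2f}")
print(f"Pulled muscle reported: {with_injury:.2f}")
```

With these numbers, a loss alone lowers the belief that Alex is strong to about 0.35, while a loss accompanied by a reported injury leaves it at the 0.5 prior, because the injury explains the loss. Adding one variable changes the inference in a structured way, without rebuilding the rest of the model.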

Key Findings and Implications

Across all experiments, MSA captured human judgments better than language model-only baselines, including those using chain-of-thought prompting. The results point to three main findings:

Human Reasoning Aligns with Probabilistic Models: People’s judgments are generally consistent with Bayesian inference in ad-hoc probabilistic models, even when those models are synthesized on the fly from natural language.
MSA Outperforms LMs in Generalization: The structured nature of MSA’s probabilistic models allows it to generalize better than pure LMs to arbitrary new details and less familiar settings (like the canoe racing domain, which was less likely to be in LM training data). The difference was most pronounced in Experiment 3, highlighting MSA’s strength in open-world scenarios where new variables and dependencies emerge.
Retrieving and Representing Relevant Information: The MSA demonstrated its ability to retrieve reasonable descriptions of variables and causal dependencies from natural language and formalize them into functional probabilistic programs.

The research suggests that while large language models are excellent at retrieving world knowledge, they may be less adept at integrating that evidence into a locally coherent world model in the way humans do. The explicit causal and probabilistic representations within MSA appear to force the model to focus on deeper structural properties rather than just superficial linguistic features, leading to a better fit with human reasoning.

This work offers a promising path toward understanding and replicating human reasoning in open-ended domains, bridging the gap between the flexibility of large language models and the coherence of structured probabilistic models. The full research paper is available on arXiv.
