Learning Abstract Models for Strategic AI in Hidden-Information Games

TLDR: LAMIR (Learned Abstract Model for Imperfect-information Reasoning) is a new AI algorithm that enables agents to perform sophisticated “look-ahead reasoning” in complex games where players have incomplete information (like Poker or Stratego). Unlike previous methods that require explicit game rules, LAMIR learns an abstract model of the game directly from experience. This learned abstraction helps manage the vast number of possible game states, making strategic planning feasible even in very large games where traditional methods fail. Experiments show LAMIR consistently outperforms existing model-free AI agents in various imperfect information games.

Artificial intelligence has made incredible strides in games, achieving superhuman performance in titles like Chess and Go. These are known as ‘perfect information games’ because all players know the complete state of the game at all times. However, the real world, and many popular games like Poker or Stratego, are ‘imperfect information games’ where players operate with incomplete knowledge. Developing AI that can reason and plan effectively in these complex scenarios has been a significant challenge.

A new algorithm called LAMIR (Learned Abstract Model for Imperfect-information Reasoning) is tackling this challenge head-on. Published as a conference paper at ICLR 2026, LAMIR introduces a novel approach to enable sophisticated ‘look-ahead reasoning’ in imperfect information games by learning an abstract model of the game directly from experience. This means the AI doesn’t need to be explicitly programmed with the game’s rules, greatly expanding its applicability.

The Challenge of Imperfect Information

Traditional look-ahead search algorithms, like those used in Chess AI, rely on a complete understanding of the game’s rules to explore future possibilities. While MuZero showed that AI could learn a model of perfect information games implicitly, extending this to imperfect information games is far more difficult. In these games, players must reason about distributions of possible hidden states, not just a single known state. This leads to an explosion in the number of states an AI needs to consider, making planning intractable for even moderately sized games.
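To make the blow-up concrete, here is a small counting sketch for a hypothetical card game (not one of the games from the paper). The function name and the deck sizes are illustrative assumptions; the point is only that a single observation can be consistent with a large number of hidden states.

```python
from math import comb


def possible_opponent_hands(deck_size: int, my_cards: int, opponent_cards: int) -> int:
    """Count the opponent hands consistent with what I can see.

    Hypothetical setting: I see only my own cards, so every subset of the
    unseen cards of the right size is a possible opponent hand.
    """
    unseen = deck_size - my_cards
    return comb(unseen, opponent_cards)


if __name__ == "__main__":
    # Two hidden cards each, from a standard 52-card deck.
    print(possible_opponent_hands(52, 2, 2))
    # Five hidden cards each: the count grows very quickly.
    print(possible_opponent_hands(52, 5, 5))
```

A planner has to keep a probability over all of these candidates at every decision, and the count multiplies as more hidden events accumulate during a game.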

How LAMIR Works: Learning and Abstraction

LAMIR addresses these difficulties by focusing on two key innovations: learning a game model and creating an abstraction of that model. The algorithm learns three core functions from observing agent-environment interactions:

Representation function: This maps a player’s complex observations (information sets) into a simpler, fixed-size ‘latent representation’.
Dynamics function: This predicts how the game state evolves, including the next latent representations for players, immediate rewards, and whether the game ends.
Legal actions function: This identifies which actions are permissible from a given latent state.
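The three functions above can be pictured as a small interface. The sketch below is only an illustration: the names `LearnedModel`, `Transition`, and `ToyModel` are hypothetical, and the toy implementation uses hand-written arithmetic where the actual algorithm would use trained neural networks.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

Latent = Tuple[float, ...]  # fixed-size latent vector (assumed encoding)


@dataclass
class Transition:
    next_latents: List[Latent]  # one next latent representation per player
    rewards: List[float]        # immediate reward per player
    terminal: bool              # whether the game has ended


class LearnedModel(Protocol):
    """Hypothetical interface mirroring the three learned functions."""

    def represent(self, observation: Sequence[float]) -> Latent: ...
    def dynamics(self, latents: List[Latent], actions: List[int]) -> Transition: ...
    def legal_actions(self, latent: Latent) -> List[int]: ...


class ToyModel:
    """Hand-written stand-in for a trained model in a two-player toy game."""

    MAX_STEPS = 3.0

    def represent(self, observation: Sequence[float]) -> Latent:
        # Compress a variable-length observation into (sum, length).
        return (float(sum(observation)), float(len(observation)))

    def dynamics(self, latents: List[Latent], actions: List[int]) -> Transition:
        next_latents = [
            (total + a, steps + 1.0) for (total, steps), a in zip(latents, actions)
        ]
        terminal = all(steps >= self.MAX_STEPS for _, steps in next_latents)
        diff = float(actions[0] - actions[1])
        # Zero-sum reward paid only when the toy game ends.
        rewards = [diff, -diff] if terminal else [0.0, 0.0]
        return Transition(next_latents, rewards, terminal)

    def legal_actions(self, latent: Latent) -> List[int]:
        # No actions remain once the step counter reaches the limit.
        return [] if latent[1] >= self.MAX_STEPS else [0, 1]
```

A planner written against this interface never calls the real game rules: it only queries `represent`, `dynamics`, and `legal_actions`, which is what lets the search run on a model learned from experience.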

Crucially, LAMIR also learns an ‘abstract model’. In large imperfect information games, the number of possible information sets can be astronomically high. LAMIR tackles this by clustering similar information sets into a manageable number of ‘abstract information sets’, which limits the size of each ‘subgame’ the AI needs to consider and makes look-ahead reasoning feasible. The grouping is performed by a function (κ) that identifies information sets that behave similarly, for instance because they have similar optimal strategies or lead to similar future states.
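One way to picture this grouping step is plain k-means clustering over feature vectors that describe each information set. This is a swapped-in, generic technique used purely for illustration: the article does not specify how κ is computed, and the feature vectors, function names, and cluster count below are assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Vec = Tuple[float, ...]


def sq_dist(a: Vec, b: Vec) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Vec, centroids: Sequence[Vec]) -> int:
    """Index of the closest centroid, i.e. the abstract information set id."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points: List[Vec], k: int, iters: int = 20) -> List[Vec]:
    """Plain k-means with a deterministic start (the first k points)."""
    centroids = list(points[:k])
    for _ in range(iters):
        groups: Dict[int, List[Vec]] = {i: [] for i in range(k)}
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, members in groups.items():
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids


if __name__ == "__main__":
    # Hypothetical features: each information set is described by the
    # action probabilities a good strategy would use there.
    features = [(0.9, 0.1), (0.1, 0.9), (0.85, 0.15), (0.2, 0.8)]
    centroids = kmeans(features, k=2)
    abstract_ids = [nearest(f, centroids) for f in features]
    print(abstract_ids)
```

Information sets that receive the same id are treated as one abstract information set, so the subgame built for planning has at most `k` decision points per level instead of one per original information set.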

Planning with a Learned Abstract Model

During gameplay, LAMIR uses its trained model and abstraction to perform ‘depth-limited solving’. It constructs a simplified game tree from the learned dynamics and legal actions functions, then runs Counterfactual Regret Minimization+ (CFR+), an iterative regret-minimization algorithm, to compute approximately optimal strategies within this abstracted view. A learned value function estimates outcomes beyond the planning horizon, where the tree is cut off.
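The core update inside CFR+ is regret matching+, which can be shown on a game with a single decision per player. The sketch below is a simplified illustration under stated assumptions: it uses a made-up 2x2 zero-sum matrix game, updates both players simultaneously, and omits the game tree, the learned model, and the value function that a full depth-limited solver would need. It is not the authors' implementation.

```python
from typing import List, Tuple

# Row player's payoffs in a toy zero-sum game (the column player gets the
# negative). Chosen for illustration: at equilibrium both players choose
# action 0 with probability 0.4.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]


def strategy_from(regrets: List[float]) -> List[float]:
    """Play in proportion to positive regret, or uniformly if there is none."""
    total = sum(regrets)
    n = len(regrets)
    return [r / total for r in regrets] if total > 0 else [1.0 / n] * n


def action_values(row: List[float], col: List[float]) -> Tuple[List[float], List[float]]:
    """Expected payoff of each pure action against the opponent's mix."""
    row_vals = [sum(PAYOFF[i][j] * col[j] for j in range(2)) for i in range(2)]
    col_vals = [-sum(PAYOFF[i][j] * row[i] for i in range(2)) for j in range(2)]
    return row_vals, col_vals


def solve(iterations: int) -> List[List[float]]:
    regrets = [[0.0, 0.0], [0.0, 0.0]]
    avg = [[0.0, 0.0], [0.0, 0.0]]
    for t in range(1, iterations + 1):
        sigma = [strategy_from(regrets[0]), strategy_from(regrets[1])]
        values = action_values(sigma[0], sigma[1])
        for p in range(2):
            expected = sum(s * v for s, v in zip(sigma[p], values[p]))
            for a in range(2):
                # The "+" in regret matching+: clip accumulated regret at zero.
                regrets[p][a] = max(0.0, regrets[p][a] + values[p][a] - expected)
                # Linear averaging: later iterations get more weight.
                avg[p][a] += t * sigma[p][a]
    norm = iterations * (iterations + 1) / 2
    return [[x / norm for x in avg[p]] for p in range(2)]


def exploitability(row: List[float], col: List[float]) -> float:
    """Sum of both players' best-response values; zero at an equilibrium."""
    row_vals, col_vals = action_values(row, col)
    return max(row_vals) + max(col_vals)


if __name__ == "__main__":
    row, col = solve(100_000)
    print(row, col, exploitability(row, col))
```

The `exploitability` helper measures how much a best-responding opponent could gain against the averaged strategies, which is the same kind of metric used to compare strategies in small games (some definitions divide the sum by two).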

Empirical Success in Large Games

The researchers put LAMIR to the test in various imperfect information games. In smaller games, LAMIR’s strategies were shown to be less exploitable than those of concurrently trained Regularized Nash Dynamics (RNaD), a strong baseline. More impressively, in very large games like Imperfect Information Goofspiel 15, where traditional methods struggle due to the sheer number of states (over 10^18 for some decisions), LAMIR consistently outperformed RNaD, achieving win rates of up to 80% in head-to-head play. This demonstrates LAMIR’s ability to scale look-ahead reasoning to previously intractable domains.

While LAMIR represents a significant step forward, the authors acknowledge several areas for future work, including explicitly modeling chance events (like dice rolls in a game), refining the abstraction process, and addressing action space abstraction. Nevertheless, LAMIR stands as a pioneering algorithm, enabling sophisticated AI planning in large-scale imperfect information games without requiring any domain-specific knowledge. The full research paper is titled ‘Look-Ahead Reasoning with a Learned Model in Imperfect Information Games’.
