Learning Abstract Models for Strategic AI in Hidden-Information Games

TLDR: LAMIR (Learned Abstract Model for Imperfect-information Reasoning) is a new AI algorithm that enables agents to perform sophisticated “look-ahead reasoning” in complex games where players have incomplete information (like Poker or Stratego). Unlike previous methods that require explicit game rules, LAMIR learns an abstract model of the game directly from experience. This learned abstraction helps manage the vast number of possible game states, making strategic planning feasible even in very large games where traditional methods fail. Experiments show LAMIR consistently outperforms existing model-free AI agents in various imperfect information games.

Artificial intelligence has made incredible strides in games, achieving superhuman performance in titles like Chess and Go. These are known as ‘perfect information games’ because all players know the complete state of the game at all times. However, the real world, and many popular games like Poker or Stratego, are ‘imperfect information games’ where players operate with incomplete knowledge. Developing AI that can reason and plan effectively in these complex scenarios has been a significant challenge.

A new algorithm called LAMIR (Learned Abstract Model for Imperfect-information Reasoning) is tackling this challenge head-on. Published as a conference paper at ICLR 2026, LAMIR introduces a novel approach to enable sophisticated ‘look-ahead reasoning’ in imperfect information games by learning an abstract model of the game directly from experience. This means the AI doesn’t need to be explicitly programmed with the game’s rules, greatly expanding its applicability.

The Challenge of Imperfect Information

Traditional look-ahead search algorithms, like those used in Chess AI, rely on a complete understanding of the game’s rules to explore future possibilities. While MuZero showed that AI could learn a model of perfect information games implicitly, extending this to imperfect information games is far more difficult. In these games, players must reason about distributions of possible hidden states, not just a single known state. This leads to an explosion in the number of states an AI needs to consider, making planning intractable for even moderately sized games.
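
To give a rough sense of scale, the minimal sketch below counts how many opponent hands are consistent with what one player can see in a hold'em-style poker deal. It is an illustration only, not taken from the paper: the 52-card deck, the two private cards per player, and the function name consistent_opponent_hands are all assumptions made for this example.

```python
from math import comb

def consistent_opponent_hands(deck_size: int, seen_cards: int, hand_size: int) -> int:
    """Count opponent hands that are consistent with the cards one player can see."""
    unseen = deck_size - seen_cards
    return comb(unseen, hand_size)

# Before any public cards: I see only my own 2 cards, so 50 cards are unseen.
preflop = consistent_opponent_hands(52, 2, 2)
# After 3 public cards: I see 5 cards in total, so 47 cards are unseen.
flop = consistent_opponent_hands(52, 5, 2)
print(preflop, flop)
```

A planner in this setting has to keep a belief over every one of these hands rather than track a single known state, and the count compounds as the hidden information of several players and several decision points is combined.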

How LAMIR Works: Learning and Abstraction

LAMIR addresses these difficulties by focusing on two key innovations: learning a game model and creating an abstraction of that model. The algorithm learns three core functions from observing agent-environment interactions:

  • Representation function: This maps a player’s complex observations (information sets) into a simpler, fixed-size ‘latent representation’.
  • Dynamics function: This predicts how the game state evolves, including the next latent representations for players, immediate rewards, and whether the game ends.
  • Legal actions function: This identifies which actions are permissible from a given latent state.
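
The three functions above could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the class and field names, the two-number latent, the single-action dynamics signature, and the toy stand-in functions are all hypothetical. A real system would use trained neural networks in their place.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Latent = Tuple[float, ...]  # fixed-size latent vector (size is an assumption)

@dataclass
class DynamicsOutput:
    next_latents: List[Latent]  # one next latent representation per player
    reward: float               # immediate reward
    terminal: bool              # whether the game has ended

@dataclass
class LearnedModel:
    represent: Callable[[Sequence[int]], Latent]
    dynamics: Callable[[List[Latent], int], DynamicsOutput]
    legal_actions: Callable[[Latent], List[int]]

# Toy stand-ins so the sketch runs; they are not learned from data.
def toy_represent(observation: Sequence[int]) -> Latent:
    # Compress a variable-length observation history into two numbers.
    return (float(len(observation)), float(sum(observation)))

def toy_dynamics(latents: List[Latent], action: int) -> DynamicsOutput:
    nxt = [(z[0] + 1.0, z[1] + action) for z in latents]
    done = nxt[0][0] >= 3.0  # toy rule: the game ends after three steps
    return DynamicsOutput(nxt, reward=1.0 if done else 0.0, terminal=done)

def toy_legal_actions(latent: Latent) -> List[int]:
    return [0, 1]  # toy rule: both actions are always legal

model = LearnedModel(toy_represent, toy_dynamics, toy_legal_actions)

# Each player encodes its own observations, then the model steps forward.
latents = [model.represent([1, 0]), model.represent([0, 0])]
chosen = model.legal_actions(latents[0])[1]
out = model.dynamics(latents, chosen)
print(out)
```

Keeping the three pieces behind one small interface means a search procedure only ever calls represent, dynamics, and legal_actions, and never touches the real game rules.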

Crucially, LAMIR also learns an ‘abstract model’. In large imperfect information games, the number of possible information sets can be astronomically high. LAMIR tackles this by clustering similar information sets into a manageable number of ‘abstract information sets’. A function, κ, assigns each information set to an abstract information set, grouping those that behave similarly, for instance because they have similar optimal strategies or lead to similar future states. This learned abstraction caps the size of each ‘subgame’ that the AI needs to consider, making complex look-ahead reasoning feasible.
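
One simple way to picture the grouping step is nearest-centroid assignment over latent vectors. The sketch below is an assumption for illustration: the article does not say which clustering method LAMIR uses, and the latent values, centroid positions, and information-set names here are made up.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]

def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kappa(latent: Vector, centroids: Sequence[Vector]) -> int:
    """Map a latent information set to the index of its nearest centroid."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(latent, centroids[k]))

def build_abstraction(latents: Dict[str, Vector],
                      centroids: Sequence[Vector]) -> Dict[int, List[str]]:
    """Group information sets by the abstract information set kappa assigns."""
    groups: Dict[int, List[str]] = {}
    for name, z in latents.items():
        groups.setdefault(kappa(z, centroids), []).append(name)
    return groups

# Hypothetical latent codes for four information sets.
latents = {
    "weak_hand_a": (0.1, 0.2),
    "weak_hand_b": (0.2, 0.1),
    "strong_hand_a": (0.9, 0.8),
    "strong_hand_b": (0.8, 0.9),
}
centroids = [(0.0, 0.0), (1.0, 1.0)]  # two abstract information sets
abstraction = build_abstraction(latents, centroids)
print(abstraction)
```

With the number of centroids fixed, the number of abstract information sets is bounded no matter how many concrete information sets the game produces, which is what keeps each subgame small.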

Planning with a Learned Abstract Model

During gameplay, LAMIR uses its trained model and abstraction to perform ‘depth-limited solving’. It constructs a depth-limited game tree from the learned dynamics and legal actions functions, then runs Counterfactual Regret Minimization+ (CFR+), an iterative equilibrium-approximation algorithm, to compute near-optimal strategies within this abstracted view. A learned value function estimates outcomes beyond the planning horizon.
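
The core update inside CFR+ is regret matching+, applied at every information set. The sketch below shows that update in the simplest possible setting, a one-shot zero-sum matrix game, and is not LAMIR's solver: it has no game tree, no learned model, and no value function. It also simplifies CFR+ itself by updating both players simultaneously and averaging strategies uniformly, whereas CFR+ as published uses alternating updates and weighted averaging. The payoff matrix and function names are made up for the example.

```python
from typing import List, Tuple

def regret_matching_plus(cum_regret: List[float]) -> List[float]:
    """Turn clipped cumulative regrets into a strategy (uniform if all are zero)."""
    total = sum(cum_regret)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n
    return [r / total for r in cum_regret]

def solve_matrix_game(payoff: List[List[float]],
                      iterations: int) -> Tuple[List[float], List[float]]:
    """Self-play with regret matching+ on a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives its negative.
    Returns the average strategies of the row and column players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_plus(regret_row)
        y = regret_matching_plus(regret_col)
        # Expected payoff of each pure action against the opponent's current strategy.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
        v_row = sum(x[i] * u_row[i] for i in range(n_rows))
        v_col = sum(y[j] * u_col[j] for j in range(n_cols))
        # The "+" step: clip cumulative regrets at zero after every update.
        regret_row = [max(0.0, regret_row[i] + u_row[i] - v_row) for i in range(n_rows)]
        regret_col = [max(0.0, regret_col[j] + u_col[j] - v_col) for j in range(n_cols)]
        sum_row = [sum_row[i] + x[i] for i in range(n_rows)]
        sum_col = [sum_col[j] + y[j] for j in range(n_cols)]
    avg_row = [s / iterations for s in sum_row]
    avg_col = [s / iterations for s in sum_col]
    return avg_row, avg_col

game = [[2.0, -1.0], [-1.0, 1.0]]
avg_row, avg_col = solve_matrix_game(game, 20000)
print(avg_row, avg_col)
```

For this particular matrix the unique equilibrium has both players choosing their first action with probability 0.4, so the averaged strategies should drift toward (0.4, 0.6) as the iteration count grows. It is the averaged strategy, not the last iterate, that carries the convergence guarantee in this simplified version.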

Empirical Success in Large Games

The researchers put LAMIR to the test in various imperfect information games. In smaller games, LAMIR’s strategies were shown to be less exploitable than those of concurrently trained Regularized Nash Dynamics (RNaD), a strong baseline. More impressively, in very large games like Imperfect Information Goofspiel 15, where traditional methods struggle due to the sheer number of states (over 10^18 for some decisions), LAMIR consistently outperformed RNaD, achieving win rates of up to 80% in head-to-head play. This demonstrates LAMIR’s ability to scale look-ahead reasoning to previously intractable domains.
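
Exploitability, the metric used for the smaller games, measures how much a best-responding opponent could gain against a fixed strategy. The sketch below computes it for a zero-sum matrix game as a simplified illustration. The paper evaluates extensive-form games, conventions differ on whether the two players' gains are summed or averaged (averaging is assumed here), and the rock-paper-scissors matrix is just a convenient example.

```python
from typing import List

def exploitability(payoff: List[List[float]], x: List[float], y: List[float]) -> float:
    """Average gain available to best responders in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; x and y are the row and column strategies.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Best payoff the row player can reach against the column strategy y.
    row_best = max(sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    # Lowest payoff the column player can hold the row player to against x.
    col_best = min(sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols))
    return (row_best - col_best) / 2.0

# Rock-paper-scissors, rows and columns ordered rock, paper, scissors.
rps = [[0.0, -1.0, 1.0],
       [1.0, 0.0, -1.0],
       [-1.0, 1.0, 0.0]]
uniform = [1.0 / 3.0] * 3
always_rock = [1.0, 0.0, 0.0]

balanced = exploitability(rps, uniform, uniform)
lopsided = exploitability(rps, always_rock, uniform)
print(balanced, lopsided)
```

A value of zero means neither player can gain by deviating, which is the equilibrium case. The always-rock strategy scores higher because an opponent can respond with paper every time.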

While LAMIR represents a significant step forward, the authors acknowledge several areas for future work, including explicitly modeling chance events (like dice rolls in a game), refining the abstraction process, and addressing action space abstraction. Nevertheless, LAMIR stands as a pioneering algorithm, enabling sophisticated AI planning in large-scale imperfect information games without requiring any domain-specific knowledge. The full research paper is titled “Look-Ahead Reasoning with a Learned Model in Imperfect Information Games”.

Meera Iyer
https://blogs.edgentiq.com
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist at a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She is particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
