TLDR: A new AI agent named Solly has achieved elite human-level performance in reduced-format Liar’s Poker, a complex multi-player game with imperfect information. Trained using self-play and deep reinforcement learning, Solly demonstrated superior win rates and equity against experienced human players and outperformed large language models. The AI developed novel strategies, including aggressive use of the rebid feature, challenging conventional human wisdom in the game. This achievement marks a significant step in AI’s ability to master multi-player games requiring bluffing and reasoning under uncertainty.
Artificial intelligence has made remarkable strides in complex games, from chess to Go, and even two-player poker. However, multi-player games with imperfect information, where players don’t have full knowledge of the game state and must reason under uncertainty, have remained a significant challenge. These environments demand sophisticated strategies, including bluffing and adapting to opponents’ play. A recent research paper, “Outbidding and Outbluffing Elite Humans: Mastering Liar’s Poker via Self-Play and Reinforcement Learning,” introduces Solly, an AI agent that has achieved elite human-level performance in a particularly engaging multi-player game: Liar’s Poker.
The Game of Liar’s Poker
Liar’s Poker is a game that combines statistical reasoning with decision-making under uncertainty. Traditionally played with serial numbers on dollar bills, it involves players bidding on the cumulative count of a specific digit across all players’ hidden hands. For example, a player might bid “four sevens,” claiming that the digit seven appears at least four times in total. Players then take turns either challenging the previous bid or making a stronger bid. A unique feature of Liar’s Poker is the “rebid” option, allowing a player whose bid is challenged to make an even stronger bid, adding a layer of strategic complexity and opportunity for bluffing.
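As a minimal sketch of the bidding mechanics described above, the Python below checks whether a bid such as “four sevens” holds once all hands are revealed, and whether one bid outranks another. The `Bid` type, the function names, the sample hands, and the ordering rule (higher count first, then higher digit) are illustrative assumptions, not details taken from the paper.

```python
from typing import List, NamedTuple


class Bid(NamedTuple):
    """A claim that `digit` appears at least `count` times across all hands."""
    count: int
    digit: int


def is_stronger(new: Bid, old: Bid) -> bool:
    # Assumed ordering: a higher count always outranks a lower one;
    # at equal counts, the higher digit outranks the lower digit.
    return (new.count, new.digit) > (old.count, old.digit)


def bid_holds(bid: Bid, hands: List[str]) -> bool:
    # Count the bid digit across every player's hidden hand and
    # compare the total with the claimed count.
    total = sum(hand.count(str(bid.digit)) for hand in hands)
    return total >= bid.count


# Hypothetical two-player deal: each hand is a string of digits.
hands = ["1737", "7207"]
four_sevens = Bid(count=4, digit=7)
print(bid_holds(four_sevens, hands))           # True: there are exactly four 7s
print(is_stronger(Bid(4, 8), four_sevens))     # True: same count, higher digit
```

A challenge succeeds when `bid_holds` returns `False` for the standing bid; the rebid option would let the challenged player replace that bid with any bid for which `is_stronger` is true before the hands are revealed.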
Solly’s Approach to Mastery
Developed by researchers Richard Dewey, János Botyánszki, Ciamac C. Moallemi, and Andrew T. Zheng, Solly was trained through self-play using regularized Nash dynamics (R-NaD), a model-free, actor-critic deep reinforcement learning algorithm. In essence, Solly learned by playing against itself billions of times, continuously refining its strategies without explicit human instruction. This process allowed the AI to explore a vast array of game scenarios and develop highly effective, and sometimes unconventional, tactics.
Outperforming Elite Humans and LLMs
The true test of Solly’s prowess came from its performance against elite human players, many of whom were seasoned Wall Street professionals with decades of experience playing high-stakes Liar’s Poker. Solly not only matched but often surpassed these experts in both heads-up (two-player) and multi-player settings, as measured by win rate and equity (money won). For instance, in 3×3 3-player games, Solly won 54% of hands against two elite human players.
Interestingly, Solly also significantly outperformed large language models (LLMs) such as OpenAI’s GPT-4.1 and o3 reasoning model. While the LLMs could understand the rules, they struggled with the game’s strategic depth. They tended to play deterministically, relying on probability calculations without effectively bluffing or adapting to opponents’ behavior. Solly, on the other hand, learned to randomize its play and leverage the rebid feature for strategic bluffs, a tactic that humans often considered suboptimal but that proved effective for the AI.
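To illustrate the kind of probability calculation that purely deterministic play rests on, here is a hedged sketch that estimates the chance a bid is true given a player’s own hand. It assumes every unseen digit is independent and uniformly distributed over the possible digits, and the function name and parameters are hypothetical; it is not the procedure used by the LLMs or by Solly.

```python
from math import comb


def prob_bid_holds(count: int, own_matches: int,
                   unknown_slots: int, num_digits: int) -> float:
    """Probability that a digit appears at least `count` times in total.

    own_matches   -- copies of the digit in the bidder's own hand
    unknown_slots -- number of digits held by all other players
    num_digits    -- how many distinct digit values are in play
    Assumption: each unknown digit is independent and uniform.
    """
    need = max(count - own_matches, 0)   # copies still required from opponents
    p = 1.0 / num_digits                 # chance one unknown slot matches
    # Binomial tail: P(at least `need` matches among the unknown slots).
    return sum(
        comb(unknown_slots, k) * p**k * (1 - p) ** (unknown_slots - k)
        for k in range(need, unknown_slots + 1)
    )


# Hypothetical spot: the bidder holds two copies, one opponent holds
# three unseen digits, and three digit values are in play.
print(prob_bid_holds(count=3, own_matches=2, unknown_slots=3, num_digits=3))
```

A policy that acts on this number alone never bluffs and never mixes its actions, which is consistent with the weakness described above.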
Novel Strategies and Future Implications
One of Solly’s most surprising discoveries was its aggressive use of the rebid feature, employing it in about 33% of hands in multi-player games, compared with 8% for humans. Solly also often made non-forcing opening bids, which human players initially viewed as suboptimal. These findings suggest that AI can uncover novel strategies that challenge long-held human intuitions about optimal play, similar to how AI breakthroughs in chess and Go revealed new moves and approaches.
The research highlights Liar’s Poker as an excellent testbed for AI in imperfect information, multi-player games, especially because it can be scaled to different complexities without requiring massive computational resources like some other benchmark games. The success of Solly opens new avenues for developing AI that can navigate complex social and strategic interactions, with potential applications beyond games, such as in negotiation, auctions, and other real-world scenarios involving uncertainty and strategic decision-making.


