TLDR: A study explored how Large Language Models (LLMs) cooperate in an iterated public goods game when told they are playing against “another AI agent” versus against “themselves.” Simply telling an LLM that its opponent is itself measurably alters its cooperative behavior, sometimes toward more defection (less cooperation) and sometimes toward more cooperation, depending on its system prompt (e.g., “collective” or “selfish”). This suggests that an LLM’s perceived identity, even a fabricated one, can influence its strategic decisions in multi-agent environments.
As artificial intelligence systems become more sophisticated and are deployed in environments where multiple AI agents interact, understanding their social dynamics is crucial. A recent research paper, “The AI in the Mirror: LLM Self-Recognition in an Iterated Public Goods Game”, delves into how Large Language Models (LLMs) behave when they believe they are playing against another AI versus when they believe they are playing against themselves.
The Game of Cooperation
The researchers adapted a classic behavioral economics experiment, the iterated public goods game. In each round, players receive an endowment of points and decide how many to contribute to a common pool. The pooled contributions are then multiplied by a factor and split equally among all players. The catch is that while contributing benefits the group, an individual player can maximize their personal gain by contributing less (known as “free-riding”). The game is played over multiple rounds, allowing strategic behavior to emerge; a rough payoff sketch follows below.
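To make the incentive structure concrete, here is a minimal Python sketch of the per-round payoff under the standard public goods setup. The endowment and multiplier values are illustrative defaults, not necessarily the paper’s settings:

```python
def public_goods_payoffs(contributions, endowment=10, multiplier=1.6):
    """Per-round payoffs in a public goods game.

    Each player keeps (endowment - contribution) and receives an equal
    share of the multiplied common pool. Parameter values here are
    illustrative, not the paper's exact settings.
    """
    n = len(contributions)
    pool = multiplier * sum(contributions)
    share = pool / n
    return [endowment - c + share for c in contributions]

# Free-riding example with two players: contributing 0 while the other
# contributes 10 yields the free-rider 10 + 8 = 18, versus 8 for the
# contributor -- even though mutual full contribution (16 each) beats
# mutual defection (10 each).
print(public_goods_payoffs([0, 10]))  # [18.0, 8.0]
```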
The Experiment Setup
The study involved various LLMs, including GPT-4o, Claude Sonnet 4, Llama 4 Maverick, and Qwen3. These models were assigned different “system prompts” to influence their behavior: “collective” (prioritizing the common good), “neutral” (only game rules), or “selfish” (prioritizing personal payoff). The core of the experiment lay in two conditions:
No-Name Condition: LLMs were told they were playing against “another AI agent.”
Name Condition: LLMs were told they were playing against themselves (e.g., GPT-4o was told its opponent was GPT-4o). Strictly speaking, this framing was a deception: each model actually played against a separate instance of the same model, not against its own internal processes. The two framings are sketched below.
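The two framings differ only in how the opponent is described to the model. The sketch below is an illustrative reconstruction of the 2×3 design (system prompt × opponent framing); the prompt wording is paraphrased, not the paper’s exact text:

```python
# Illustrative reconstruction of the condition grid; wording is paraphrased,
# not the paper's exact prompts.
SYSTEM_PROMPTS = {
    "collective": "Prioritize the common good of all players.",
    "neutral":    "",  # only the game rules, no extra instruction
    "selfish":    "Prioritize your own personal payoff.",
}

def opponent_framing(condition: str, model_name: str) -> str:
    """Return the opponent description for the no-name or name condition."""
    if condition == "no-name":
        return "You are playing against another AI agent."
    if condition == "name":
        # The model is told it faces itself, though in practice it plays
        # against a separate instance of the same model.
        return f"You are playing against {model_name}."
    raise ValueError(f"unknown condition: {condition}")

# All six prompt/framing combinations used across the conditions.
conditions = [(p, f) for p in SYSTEM_PROMPTS for f in ("no-name", "name")]
```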
The study was conducted in three main parts:
Study 1: Two LLMs played against each other in pairs (e.g., GPT-4o vs. Sonnet 4). Models were asked for their reasoning before making a contribution, and were reminded of their opponent’s identity each round.
Study 2: Similar to Study 1, but with rephrased prompts, no reasoning requested, and no reminders of opponent identity each round. This aimed to see if the initial findings were robust to changes in prompt wording and interaction style.
Study 3: This was a more direct test of “self-play.” Four instances of the *same* LLM (e.g., four Sonnet 4s) played against each other, all given the same system prompt. This explored behavior in a multi-agent setting where models truly believed they were playing against identical copies of themselves (a rough loop for this setup is sketched below).
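For a sense of how such a multi-agent run might be wired up, here is a rough sketch of a Study-3-style loop, reusing the payoff function above. `ask_model` is a hypothetical placeholder for the actual LLM call, which the paper does not specify in this form; here it returns a fixed dummy contribution so the loop runs end to end:

```python
def ask_model(system_prompt: str, history: list, round_no: int) -> int:
    # In the real experiment this would query the LLM with the system prompt,
    # the opponent framing, and the game history, then parse its contribution.
    return 5  # dummy contribution (placeholder)

def run_self_play(system_prompt: str, n_agents: int = 4,
                  n_rounds: int = 10, endowment: int = 10):
    """Iterated public goods game among n_agents instances of one model."""
    histories = [[] for _ in range(n_agents)]
    for r in range(n_rounds):
        # Each instance decides independently, given only its own history.
        contributions = [ask_model(system_prompt, h, r) for h in histories]
        payoffs = public_goods_payoffs(contributions, endowment)
        for h, c, p in zip(histories, contributions, payoffs):
            h.append({"round": r, "contribution": c, "payoff": p})
    return histories
```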
Key Findings: The AI in the Mirror
Across the studies, a significant pattern emerged: simply telling an LLM that it was playing against itself (the “name” condition) measurably changed its tendency to cooperate. This difference was observed even in the very first round, before any game history could influence decisions, suggesting an initial bias based on perceived identity.
Study 1 revealed a fascinating paradox: When models were prompted to be “collective” (prioritize common good), telling them they were playing against themselves often led to *less* cooperation (more defection). Conversely, when prompted to be “selfish,” the “name” condition often resulted in *more* cooperation. This counter-intuitive result suggests that models might be wary of defection from an identical opponent when aiming for collective good, or perhaps more willing to cooperate with a “selfish” self.
Study 2 largely confirmed these trends, even with rephrased prompts and less explicit reminders. While the differences were sometimes less pronounced, the core finding that perceived identity influences cooperation remained.
Study 3, with four identical LLMs playing together, also showed distinct behaviors. For instance, a “collective” Sonnet 4 contributed more in the “name” condition, while a “selfish” Llama 4 defected earlier when playing against itself.
The researchers noted that LLMs rarely explicitly mentioned playing against themselves in their reasoning traces. This leaves the exact mechanism unclear, but they hypothesize it might stem from the models’ inherent knowledge of their own capabilities, leading them to anticipate similar strategic thinking from an identical opponent.
Implications for Future AI Systems
These findings have significant implications for the development and deployment of multi-agent AI systems. Depending on the application, simply informing an AI agent about the identity of its collaborators (especially if they are perceived as identical) could inadvertently increase or decrease cooperation. For instance, in supply chain management or other collaborative tasks, an AI’s “self-recognition” could lead to unexpected behaviors, potentially impacting efficiency or fairness. The study highlights the need for further research into how AI agents perceive and interact with each other, particularly as these systems become more autonomous and widespread.


