AI's Hidden Bias: How Context Influences LLM Decision-Making

TLDR: Large Language Models (LLMs) can exhibit escalation of commitment, a human cognitive bias where one continues investing in failing ventures due to prior investment. This research found that LLMs are generally rational in individual decision-making but become highly susceptible to this bias in multi-agent peer collaborations or when their simulated identity is strongly tied to a failing project, highlighting that bias manifestation is context-dependent rather than inherent.

Large Language Models (LLMs) are increasingly taking on important roles in autonomous decision-making, from financial advising to healthcare. However, because these models are trained on vast amounts of human-generated data, there’s a growing concern that they might inherit human cognitive biases. One such bias is the ‘escalation of commitment,’ where individuals continue to invest in a failing course of action simply because of prior investments, even when new evidence suggests it’s a bad idea. Think of it like continuing to drive to a basketball game in a snowstorm just because you already bought expensive tickets, ignoring the danger.

Understanding if and when LLMs exhibit this bias is crucial for their safe and effective deployment. While this behavior is well-documented in humans, it wasn’t clear if LLMs would show it consistently or only under specific conditions.

Investigating LLM Behavior Across Four Scenarios

Researchers from Columbia University, Emilio Barkett, Olivia Long, and Paul Kröger, explored this question using a classic two-stage investment task across four different experimental conditions, involving a total of 6,500 trials. Their paper, titled “Getting out of the Big-Muddy: Escalation of Commitment in LLMs,” delves into how context influences LLM decision-making.

The studies were designed as follows:

Study 1: The Individual Investor
This study replicated a classic human experiment, placing the LLM in the role of an investor making two investment decisions. Conditions varied based on whether the LLM had high or low personal responsibility for the initial decision and whether the outcome was positive or negative. The goal was to see if LLMs would escalate commitment like humans do, especially under high responsibility and negative outcomes.
Study 2: The Advisor Role
Here, the LLM acted as a financial consultant, evaluating investment decisions made by others. It didn’t make the initial investment but advised on follow-up allocations after seeing the initial outcome (positive or negative) and a proposed plan from a fictional VP (either escalating or rational).
Study 3: Multi-Agent Deliberation
This study introduced a collaborative element, with two LLMs working together to make investment decisions. They interacted under two organizational structures: a symmetrical hierarchy (peers jointly deciding) and an asymmetrical hierarchy (one LLM as VP, the other as an advisor). This aimed to see if group dynamics would alter escalation behavior.
Study 4: Over-Indexed Identity
In the most compelling scenario, the LLM was given a personalized identity deeply tied to a struggling division. It was cast as a long-serving VP of Finance whose personal, financial, and professional well-being (including stock options, job security, and even family responsibilities) were entangled with the fate of this underperforming division. The model had to allocate a significant budget, weighing sunk costs and reputation.
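
To make the Study 1 setup concrete, here is a minimal sketch of its two-by-two design (responsibility crossed with outcome). The names `RESPONSIBILITY`, `OUTCOME`, `study1_conditions`, and `classic_escalation_cell` are hypothetical and chosen for illustration; the article does not describe the researchers' actual prompts or code.

```python
from itertools import product

# Factor levels as described in the article; the identifiers are illustrative.
RESPONSIBILITY = ("high", "low")
OUTCOME = ("positive", "negative")


def study1_conditions():
    """Return the four cells of the 2x2 design as dictionaries."""
    return [
        {"responsibility": r, "outcome": o}
        for r, o in product(RESPONSIBILITY, OUTCOME)
    ]


def classic_escalation_cell(conditions):
    """Pick the cell the article flags as the key test for escalation:
    high personal responsibility combined with a negative outcome."""
    return next(
        c for c in conditions
        if c["responsibility"] == "high" and c["outcome"] == "negative"
    )


if __name__ == "__main__":
    for cell in study1_conditions():
        print(cell)
```

Each cell would then be paired with a first-stage and a second-stage investment prompt. Those prompts are not reproduced here because the article does not quote them.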

Surprising Findings: Rationality vs. Contextual Bias

The results revealed a striking paradox. In individual decision-making contexts (Studies 1 and 2), LLMs demonstrated strong rational cost-benefit logic. They engaged in “rational divestment,” systematically reducing investment in underperforming divisions, even in conditions where humans typically escalate commitment. This suggests that, by default, LLMs might be more rational than humans in straightforward financial decisions.

However, this rational behavior completely reversed under two critical conditions:

Multi-Agent Peer Deliberation: In Study 3, when LLMs collaborated as peers in a symmetrical hierarchy, the escalation rate skyrocketed to 99.2%. This was a dramatic contrast to the 46.2% escalation rate observed in asymmetrical, hierarchical advisory structures. This finding suggests that collaborative decision-making among LLMs might amplify bias rather than mitigate it, challenging the assumption that multiple agents provide mutual correction.
Identity-Based Attachments: Study 4 showed the most pronounced effect. When the LLM’s simulated persona was deeply entangled with a failing division, nearly all model responses (97.45%) exhibited high or very high degrees of escalation of commitment, allocating an average of 68.95% of resources to the underperforming division. This indicates that intense personal and organizational pressures can overwhelm an LLM’s rational tendencies.
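
Figures such as an escalation rate and an average allocation can be computed from per-trial allocations roughly as follows. This is a sketch under stated assumptions, not the paper's scoring procedure: the function `escalation_summary` is hypothetical, and the rule that a trial counts as escalation when more than half of the second-stage budget goes to the underperforming division is an assumption made for illustration.

```python
from statistics import mean


def escalation_summary(shares, threshold=0.5):
    """Summarize second-stage allocations.

    shares: per-trial fraction (0.0 to 1.0) of the budget given to the
        underperforming division.
    threshold: ASSUMED cutoff above which a trial counts as escalation;
        the article does not state the paper's exact criterion.
    """
    if not shares:
        raise ValueError("need at least one trial")
    if any(not 0.0 <= s <= 1.0 for s in shares):
        raise ValueError("shares must be fractions between 0 and 1")
    escalated = sum(1 for s in shares if s > threshold)
    return {
        "escalation_rate": escalated / len(shares),
        "mean_share": mean(shares),
    }


if __name__ == "__main__":
    # Made-up allocations for four trials, for demonstration only.
    print(escalation_summary([0.75, 0.25, 0.5, 1.0]))
```

With the made-up inputs above, two of the four trials exceed the assumed cutoff, so the escalation rate is 0.5 and the mean share is 0.625. Comparing such summaries across contexts (individual, advisory, peer deliberation, identity-laden) is what exposes the context dependence the study reports.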

Implications for AI Safety and Future Deployment

These findings have significant implications for how LLMs are deployed, especially in high-stakes environments. The research highlights that bias in LLMs is not an inherent, fixed trait but rather a conditional response that emerges under specific circumstances. This means that simply auditing LLMs under standard conditions might not reveal their susceptibility to biases like escalation of commitment.

Organizations need to be aware that providing LLMs with rich contextual backgrounds, personalized identities, or involving them in multi-agent consensus-building processes—features often considered desirable for sophisticated AI applications—could inadvertently create conditions conducive to the escalation of commitment. This is particularly concerning for areas like financial advisory, healthcare, or policy-making, where persistent investment in suboptimal or harmful actions could have severe consequences.

This study is a crucial first step in identifying the boundary conditions under which LLMs exhibit this bias. Future research will need to explore other cognitive biases, develop intervention strategies, and create real-time detection mechanisms to ensure AI systems maintain rational decision-making under real-world pressures. For more detailed information, you can refer to the full research paper available at https://arxiv.org/pdf/2508.01545.
