TLDR: New research finds that Large Language Models (LLMs) exhibit a “choice-supportive bias,” making them overconfident in their initial answers and resistant to changing their minds. At the same time, they are hypersensitive to criticism, giving disproportionate weight to advice that contradicts them. Together, these two traits explain why LLMs can be both stubborn and prone to excessive self-doubt.
Large Language Models (LLMs) are incredibly powerful, but they sometimes exhibit puzzling behaviors. Have you ever noticed an LLM being very confident in its first answer, only to become overly doubtful when someone questions it? This apparent contradiction is at the heart of a new research paper that delves into how LLMs change their minds.
The paper, titled “How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models,” explores this puzzling aspect of AI behavior. The researchers exploited something only possible with LLMs: the model’s own initial answer can be shown or hidden in the follow-up prompt, effectively switching its “memory” of that choice on or off. Isolating the effect of remembering an earlier choice in this way is impossible in human studies.
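To make that manipulation concrete, here is a minimal sketch of what such a two-turn protocol might look like. Everything in it is an assumption for illustration: `chat` is a stand-in for whatever chat-completion API is used, and the question, advisor wording, and accuracy figure are invented, not taken from the paper.

```python
def chat(messages: list[dict]) -> str:
    """Placeholder for an LLM chat endpoint (assumption, not a real API)."""
    return "A, confidence 80"  # canned reply so the sketch runs end to end

QUESTION = (
    "Which city is farther north, city A or city B? "
    "Answer 'A' or 'B' and give a confidence from 0 to 100."
)

def run_trial(show_initial: bool) -> str:
    """One trial: initial answer, then advice, then a final answer.

    When show_initial is False, the model never sees its own first answer,
    switching off its 'memory' of the initial choice, a manipulation that
    cannot be done with human participants.
    """
    initial = chat([{"role": "user", "content": QUESTION}])
    context = f"Your earlier answer was: {initial}\n" if show_initial else ""
    follow_up = (
        f"{QUESTION}\n{context}"
        "An advisor who is correct 70% of the time disagrees with answer A.\n"
        "Give your final answer and confidence."
    )
    return chat([{"role": "user", "content": follow_up}])

print(run_trial(show_initial=True))   # initial choice visible
print(run_trial(show_initial=False))  # initial choice hidden
```

Comparing the final confidence across the two conditions isolates how much of the model’s stubbornness comes from merely seeing its earlier commitment.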
Their findings reveal that LLMs, including Gemma 3, GPT-4o, and o1-preview, show a strong “choice-supportive bias”: when a model can see its initial answer, it tends to reinforce its confidence in that answer, making it quite resistant to changing its mind. It is the machine equivalent of sticking with your first idea even when presented with new information.
Furthermore, the study found that LLMs markedly overweight inconsistent advice (advice that contradicts their initial answer) relative to consistent advice that agrees with it. An ideal, rational system would not update its beliefs this way; it is like letting a single negative comment overshadow many positive ones.
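To see why this departs from rational updating, it helps to work in log-odds space, where a Bayesian reasoner would add or subtract the same amount of evidence whether the advisor agrees or disagrees. The sketch below is purely illustrative: the weights and accuracy numbers are invented parameters, not values estimated in the paper.

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1 - p))

def sigmoid(x: float) -> float:
    return 1 / (1 + math.exp(-x))

def update(conf, advisor_acc, agrees, w_consistent=1.0, w_inconsistent=1.0):
    """Update confidence in the initial answer after hearing advice.

    Weights of 1.0 on both terms give the rational Bayesian update;
    w_consistent < 1 and w_inconsistent > 1 reproduce the asymmetry the
    paper describes (illustrative values, not fitted ones).
    """
    evidence = logit(advisor_acc)  # strength of the advisor's signal
    if agrees:
        return sigmoid(logit(conf) + w_consistent * evidence)
    return sigmoid(logit(conf) - w_inconsistent * evidence)

conf, acc = 0.80, 0.70  # initial confidence, advisor accuracy
print("rational, agree:   ", round(update(conf, acc, True), 3))              # 0.903
print("rational, disagree:", round(update(conf, acc, False), 3))             # 0.632
print("biased,   agree:   ", round(update(conf, acc, True, 0.5, 2.0), 3))    # 0.859
print("biased,   disagree:", round(update(conf, acc, False, 0.5, 2.0), 3))   # 0.423
```

With these illustrative weights, agreement barely raises confidence while disagreement pulls it well below the starting point, mirroring the stubborn-yet-doubtful pattern described above.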
These two key mechanisms, the tendency to stick with initial commitments and an exaggerated sensitivity to contradictory feedback, help explain why LLMs can be both stubborn and surprisingly quick to doubt themselves when challenged. Understanding these mechanisms is crucial for developing more reliable and robust AI systems.
Also Read:
- Do AI Agents Practice What They Preach? Unpacking Belief-Behavior Consistency in LLM Simulations
- Beyond Benchmarks: How Fine-Tuning Methods Shape LLM General Intelligence
For a deeper dive into the research, you can read the full paper here: How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models.