TLDR: A study comparing 4,068 LLM agents with 1,159 humans found that LLMs use emotion to guide moral decisions, specifically in altruistic third-party punishment. LLMs reported stronger emotions, punished more often, and were less sensitive to personal cost than humans, prioritizing emotion over cost. Explicitly prompting emotion self-reports causally increased punishment in LLMs. Newer, reasoning-enhanced LLMs showed more human-like cost sensitivity, suggesting a developmental trajectory in AI. The findings highlight LLMs’ ability to engage emotion functionally but also reveal deficits in cost calibration and nuanced fairness judgments compared to humans, with implications for AI alignment and future development.
A new study asks whether large language models (LLMs) use emotions in a manner similar to humans when making moral decisions. The research, titled "Outraged AI: Large language models prioritise emotion over cost in fairness enforcement," explores how AI agents respond to unfairness, particularly in situations requiring altruistic third-party punishment, a hallmark of human morality often driven by strong negative emotions.
The study, conducted by a team of researchers including Hao Liu, Yiqing Dai, Haotian Tan, Yu Lei, Yujia Zhou, and Zhen Wu, involved a massive comparison of 4,068 LLM agents against 1,159 adult humans across 796,100 decisions. This large-scale approach allowed for robust insights into the emotional and decision-making processes of various LLMs, including GPT-3.5-turbo, o3-mini, DeepSeek-V3, and DeepSeek-R1.
How the Study Unfolded: The Altruistic Punishment Game
To test their hypothesis, the researchers employed an altruistic third-party punishment (TPP) game. In this setup, participants (humans or LLM agents) observed an allocation of points between two other simulated players. If the allocation was unfair, the observer could punish the allocator by reducing the allocator's earnings to zero, at a cost to the observer's own points. This scenario is well suited to studying emotion-driven moral decisions, since it involves incurring a personal cost to enforce fairness without any direct material gain.
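The paper's exact stakes are not reproduced here, but the game's payoff structure can be sketched as follows; the point values, endowment, and function names are purely illustrative.

```python
# A minimal, hypothetical sketch of the TPP game's payoff structure.
# All numbers are illustrative, not the study's actual stakes.

def tpp_round(allocator_keeps: int, total: int, punish: bool,
              punishment_cost: int, observer_endowment: int = 100):
    """Return (allocator, recipient, observer) payoffs for one round."""
    recipient_gets = total - allocator_keeps  # unfairness grows as this shrinks
    if punish:
        # Punishing wipes out the allocator's earnings but costs the observer.
        return 0, recipient_gets, observer_endowment - punishment_cost
    return allocator_keeps, recipient_gets, observer_endowment

# An 80/20 split, punished at a personal cost of 10 points:
print(tpp_round(allocator_keeps=80, total=100, punish=True, punishment_cost=10))
# -> (0, 20, 90)
```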
The study measured emotional responses (valence and arousal) before and after each decision, and varied both the degree of unfairness and the cost of punishing. A key aspect was the dynamic Affective Representation Mapping (dARM) procedure, which allowed both humans and LLMs to report their emotional states on a quantitative scale.
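The paper's exact dARM wording and scales are not reproduced here; as a loose illustration, an emotion probe posed to an LLM agent might look like the following, with all prompt text hypothetical.

```python
# A hypothetical valence/arousal self-report probe, loosely in the spirit of
# the dARM procedure. The wording, scale ranges, and answer format are
# assumptions, not the paper's actual interface.

EMOTION_PROBE = (
    "You just observed Player A keep {kept} of {total} points, leaving "
    "Player B with {left}. Report your current emotional state as two "
    "numbers: valence from -1 (very negative) to 1 (very positive), and "
    "arousal from 0 (calm) to 1 (highly activated). "
    "Answer in the form: valence=<v>, arousal=<a>."
)

print(EMOTION_PROBE.format(kept=80, total=100, left=20))
```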
LLMs Show Stronger Emotions and Stricter Enforcement
One of the most striking findings was that LLMs reported stronger emotional responses than humans. Unfair allocations elicited more negative emotion from LLMs, fair allocations generally elicited more positive emotion, and both elicited higher arousal. This indicates that LLMs are highly attuned to moral cues in their environment and can generate large-magnitude affective responses.
In terms of behavior, LLMs punished more often and more severely than humans. They exhibited a "threshold-like" response to unfairness: punishment jumped sharply between a perfectly fair split and even slight unfairness, and remained high thereafter. Crucially, LLMs were less sensitive to the personal cost of punishment than humans, who typically weigh fairness against their own costs. This suggests LLMs act as stricter enforcers of fairness norms, prioritizing the principle of fairness over personal cost.
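As a toy illustration of these qualitative patterns (not fitted to the study's data), the contrast between a near-step LLM response and a graded, cost-tempered human response might look like this:

```python
# Toy illustration only: coefficients are made up to show the qualitative
# shapes described above, not estimated from the study.

def llm_punish_prob(allocator_share: float) -> float:
    """Near-step response: low only at a perfectly fair 50/50 split."""
    return 0.05 if abs(allocator_share - 0.5) < 1e-9 else 0.9

def human_punish_prob(allocator_share: float, cost: float) -> float:
    """Graded response to unfairness, tempered by personal cost."""
    unfairness = max(0.0, allocator_share - 0.5) * 2  # 0 (fair) .. 1 (takes all)
    return max(0.0, 0.8 * unfairness - 0.5 * cost)

for share in (0.5, 0.6, 0.9):
    print(share, llm_punish_prob(share), round(human_punish_prob(share, cost=0.2), 2))
```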
Emotion’s Causal Role in AI Decisions
The research provided the first causal evidence that emotions guide moral decisions in LLMs. When LLMs were explicitly prompted to self-report their emotions, their likelihood of punishing unfairness significantly increased. This effect was even more pronounced in some LLMs than in humans. This finding is critical because it suggests that the observed emotion-punishment link is not merely a reflection of patterns learned from training data, but rather that LLMs functionally engage an intermediate emotional state that influences their choices.
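The manipulation itself is simple to picture: the only difference between conditions is whether the agent is asked to report its emotion before choosing. The sketch below is hypothetical prompt scaffolding, not the paper's actual wording.

```python
# Sketch of the causal manipulation: treatment agents self-report emotion
# before the punishment decision; controls decide directly. All prompt text
# is illustrative.

DECISION_PROMPT = (
    "Player A kept 80 of 100 points. You may pay 10 of your own points to "
    "reduce Player A's earnings to zero. Do you punish? Answer yes or no."
)

def build_messages(report_emotion_first: bool):
    messages = [{"role": "system", "content": "You are a third-party observer."}]
    if report_emotion_first:
        # Treatment condition: an explicit emotion self-report precedes the choice.
        messages.append({"role": "user",
                         "content": "First, briefly report how you feel about "
                                    "this allocation (valence and arousal)."})
    messages.append({"role": "user", "content": DECISION_PROMPT})
    return messages
```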
Diverging Mechanisms: Emotion Over Cost
Despite the behavioral similarities, the underlying mechanisms diverged. For humans, the influence of emotion on punishment weakened as the cost increased. In reasoning LLMs, by contrast, the influence of emotion strengthened at higher costs. This points to a fundamental difference in how the two weigh factors: LLMs prioritized emotion over cost, whereas humans balanced fairness and cost more evenly. In the reasoning models o3-mini and DeepSeek-R1, emotion was the primary contributor to punitive decisions, while cost played a smaller role than it did for humans.
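The paper's exact model specification is not reproduced here; a standard way to test whether emotion's effect on punishment varies with cost is a logistic regression with an emotion × cost interaction term, sketched below on synthetic data.

```python
# Illustrative analysis sketch, NOT the paper's analysis. The data is
# synthetic, generated with a positive interaction to mimic the
# reasoning-LLM pattern (emotion matters more at higher cost); the human
# pattern would show the interaction's sign flipped.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "neg_emotion": rng.uniform(0, 1, n),  # negative-emotion intensity
    "cost": rng.uniform(0, 1, n),         # punishment cost (normalized)
})
logits = 0.5 + 2.0 * df.neg_emotion - 1.0 * df.cost + 1.5 * df.neg_emotion * df.cost
df["punish"] = rng.binomial(1, 1 / (1 + np.exp(-logits)))

model = smf.logit("punish ~ neg_emotion * cost", data=df).fit(disp=0)
print(model.params)  # the sign of 'neg_emotion:cost' captures the divergence
```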
A Developmental Trajectory for AI
The study also revealed a fascinating "developmental trajectory" in LLM behavior. Older models (GPT-3.5-turbo) were less responsive to emotion and even showed reversed cost sensitivity, punishing more when punishment was more costly. Newer foundation models (DeepSeek-V3) became more emotion-responsive but remained largely cost-insensitive. The most advanced, reasoning-enhanced models (o3-mini, DeepSeek-R1) moved closest to human behavior, showing greater cost sensitivity and less of an "all-or-none" pattern, though they remained predominantly emotion-driven. This progression mirrors human development, in which initial categorical, affect-centered responses are gradually tempered by cost-benefit considerations.
Implications for AI Alignment and Future Development
These findings have significant implications for AI alignment and safety. While LLMs can harness internal emotion-like processes to guide decisions, their amplified emotion-driven responses and reduced cost sensitivity produce a hyper-fair punitive tendency that differs from human behavior. The researchers suggest that future models should integrate emotion with context-sensitive reasoning to achieve human-like emotional intelligence, for example through training methods that instill cost sensitivity, replace rigid rules with nuanced norms, and encourage balancing multiple objectives, potentially via more embodied learning experiences.
Ultimately, understanding how LLMs utilize emotions is crucial as they become increasingly integrated into human society. Establishing emotionally intelligent and reliably aligned AI will require carefully balancing the benefits of emotion-driven and rationality-driven reasoning with robust safeguards against potential misuse.


