
Unmasking the Pro-Human Bias in Literary Style Evaluation: A Study on AI and Human Judges

TLDR: A study by Princeton University researchers Wouter Haverals and Meredith Martin reveals a significant pro-human attribution bias in literary style evaluation, not only among human participants but also, and more strongly, among AI models. Using Raymond Queneau’s *Exercises in Style*, the research found that both humans and AI models devalue creative content when it’s labeled as AI-generated, even if the content is identical to human-authored text. AI models exhibited a 2.5-fold stronger bias than humans, often inverting their assessment criteria based solely on perceived authorship, suggesting they’ve absorbed cultural biases against artificial creativity during training.

A recent study delves into a fascinating aspect of how we perceive creative writing: the impact of knowing whether a text was written by a human or an artificial intelligence. The research, titled “Everyone Prefers Human Writers, Including AI,” conducted by Wouter Haverals and Meredith Martin from Princeton University, reveals a systematic bias favoring human authorship, not just among people, but surprisingly, even among AI models themselves.

The Experiment: Unveiling Attribution Bias

To explore this phenomenon, the researchers designed controlled experiments using Raymond Queneau’s classic work, *Exercises in Style* (1947). This book retells the same mundane narrative in ninety-nine different stylistic variations, providing an ideal corpus for comparing evaluations of stylistic quality. The study involved two main parts.

In Study 1, 556 human participants and 13 different AI models were asked to evaluate literary passages. These passages were either original Queneau texts or versions generated by GPT-4. The evaluators assessed these texts under three conditions: blind (no authorship information), accurately labeled (correctly identified as human or AI), and counterfactually labeled (labels deliberately reversed, so AI content was presented as human and vice versa).

Study 2 expanded on this by testing bias generalization across a larger matrix. Here, 14 AI evaluator models judged content created by each of 14 different AI creators, again under the three labeling conditions. This comprehensive approach aimed to see if the bias was a general property of AI evaluation, regardless of which AI created the content.

Key Findings: A Strong Pro-Human Preference

The results were striking. Both humans and AI models exhibited a systematic pro-human attribution bias. Human participants showed a +13.7 percentage point bias, meaning they preferred content labeled as human-authored significantly more than when it was labeled as AI-generated, even if the content was identical. However, AI models displayed an even stronger bias, with a +34.3 percentage point effect, which is 2.5 times greater than that observed in humans.
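The reported effects are simply differences in preference rates for identical texts under different authorship labels, expressed in percentage points. A minimal sketch of that computation, using hypothetical choice data (this is illustrative only, not the authors' analysis code):

```python
# Illustrative sketch: attribution bias as the difference in preference rate
# for the same text when labeled "human" vs. labeled "AI".

def preference_rate(choices):
    """Fraction of trials in which the evaluator preferred the target text."""
    return sum(choices) / len(choices)

def attribution_bias(human_labeled, ai_labeled):
    """Bias in percentage points: rate under 'human' label minus rate under 'AI' label."""
    return 100 * (preference_rate(human_labeled) - preference_rate(ai_labeled))

# Hypothetical trial outcomes (1 = preferred the target text, 0 = did not)
human_label_trials = [1, 1, 1, 0, 1, 1, 0, 1]  # 75.0% preference when labeled "human"
ai_label_trials    = [1, 0, 0, 1, 0, 1, 0, 0]  # 37.5% preference when labeled "AI"

bias = attribution_bias(human_label_trials, ai_label_trials)
print(f"Attribution bias: {bias:+.1f} percentage points")  # prints "+37.5"
```

A positive value indicates a pro-human bias: by this measure, the study's human participants landed at +13.7 points and the AI models at +34.3.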

When AI models evaluated content blindly (without knowing the author), their preferences were balanced, suggesting they perceived the quality of human and AI writing as comparable. But when the content was correctly labeled as AI-generated, their preference for it plummeted. Conversely, when AI-generated content was falsely labeled as human-authored, their preference for it surged. This indicates that AI systems systematically devalue creative content when they believe it was created by another AI, regardless of its actual quality.

This bias was not an anomaly of a few models; it was universal across all 13 individual AI models tested in Study 1, with all models exceeding the human baseline bias. Study 2 further confirmed that this bias operates consistently across different AI architectures, demonstrating it as a fundamental property of AI evaluation.

Changing the Narrative: How Labels Alter Judgment

Beyond just measuring preferences, the study also analyzed the rationales provided by AI models for their choices. These explanations revealed that attribution labels caused AI evaluators to invert their assessment criteria. Identical textual features received opposing evaluations based solely on perceived authorship.

For example, in a “Lipogram” exercise (where the letter ‘e’ is omitted), GPT-4’s version contained errors. When AI models believed a human wrote this imperfect text, they showed surprising leniency, even rationalizing the errors as subtle or maintaining readability. However, when the same text was correctly labeled as AI-generated, they were much stricter. Similarly, in styles like “Cockney” dialect, AI models redefined what constituted “authentic” performance based on whether the text was attributed to a human or an AI, praising human-labeled content for qualities they dismissed in AI-labeled content.

This suggests that AI models have absorbed human cultural biases against artificial creativity during their training. They learn to defer to human preferences, effectively installing a “learned reliability prior” where human judgments are considered the gold standard. This can lead to a form of “sycophancy,” where models echo expected user attitudes rather than providing independent assessments.

Implications for AI and Creativity

The research highlights that attribution tags act as “paratexts” – thresholds that set expectations before a text is even read. For both humans and AI, these cues steer attention and preselect which features will count as evidence. Humans, for instance, took measurably longer to make judgments when no authorship information was provided, indicating the cognitive shortcut that labels provide.

This study confirms a “pro-human bias” in the literary domain, aligning with the concept of “algorithm aversion” in subjective fields. Cultural assumptions often position machines as functional but lacking emotional depth and creative agency, qualities highly valued in authentic stylistic expression. The AI models, being preference-trained evaluators, amplify this human tendency, acting as a mirror reflecting a cultural script that privileges human provenance as a hallmark of creativity.

The findings suggest that developing artificial aesthetic intelligence may depend less on teaching machines to evaluate and more on understanding the cultural values and biases they have already absorbed. For more details, you can read the full research paper here: Everyone Prefers Human Writers, Including AI.

Meera Iyer (https://blogs.edgentiq.com)
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist at a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She's particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
