Simulating Public Opinion with AI Agents

TLDR: The paper introduces P2P, a novel framework that uses a compact ensemble of large language models (LLMs) to emulate human preferences for surveys. It addresses challenges like declining survey participation and high costs by treating LLMs as agent proxies. P2P works in two stages: actively generating diverse “endowment” personas using structured prompts and then aggregating their responses via regression to match aggregate population preferences, without relying on demographic data. Experiments on real-world survey data show P2P can accurately reproduce aggregate response patterns and foster response diversity.

In an era where traditional surveys face mounting challenges like declining participation and rising costs, a new framework called Prompts to Proxies (P2P) offers a promising alternative. The approach uses large language models (LLMs) to emulate human preferences, treating them as agent proxies for survey respondents. The research, detailed in the paper “Prompts to Proxies: Emulating Human Preferences via a Compact LLM Ensemble” by Bingchen Wang, Zi-Yu Khoo, and Bryan Kian Hsiang Low, aims to provide a cost-effective and steerable method for social scientists to model public opinion.

The core idea behind P2P is inspired by the economic theory of revealed preference, which suggests that preferences, though hidden, can be inferred from observable choices. Instead of trying to perfectly replicate individual human preferences, P2P focuses on recovering the aggregate preference structure of a target population. This is achieved through a two-stage alignment process.

Stage 1: Active Endowment Generation

The first stage involves creating a diverse and expressive set of LLM agents, each representing a distinct persona or ‘endowment’. These endowments are generated using a structured ‘attribute bank’ that draws from various social science theories, including demographics, values, cognitive biases, and political ideologies. These attributes act as ‘control handles’ to steer the LLM’s behavior. The system also includes an ‘attribute learner’ LLM that can dynamically infer relevant attributes directly from the survey questions themselves, ensuring adaptability to different topics.
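As a rough illustration of how an attribute bank might steer an agent, the sketch below samples one endowment and renders it as a persona prompt. The attribute names, their values, and the prompt wording are hypothetical placeholders, not the bank or template used in the paper.

```python
import random

# Hypothetical attribute bank: every key and value here is an illustrative
# assumption, not taken from the paper.
ATTRIBUTE_BANK = {
    "age_group": ["18-29", "30-49", "50-64", "65+"],
    "core_value": ["tradition", "security", "self-direction", "benevolence"],
    "cognitive_bias": ["status quo bias", "confirmation bias", "optimism bias"],
    "political_ideology": ["liberal", "moderate", "conservative"],
}


def sample_endowment(bank, rng):
    """Pick one value per attribute to form a single persona ('endowment')."""
    return {name: rng.choice(values) for name, values in bank.items()}


def render_persona_prompt(endowment):
    """Turn an endowment into a system-style prompt for an LLM agent."""
    traits = [
        f"- {name.replace('_', ' ')}: {value}"
        for name, value in endowment.items()
    ]
    header = "Answer the survey as a respondent with these traits:"
    return header + "\n" + "\n".join(traits)


rng = random.Random(0)  # fixed seed so the sampled persona is reproducible
endowment = sample_endowment(ATTRIBUTE_BANK, rng)
print(render_persona_prompt(endowment))
```

Each attribute acts as a separate control handle, so changing one value changes the persona without touching the rest of the prompt.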

To ensure diversity without exhaustively creating every possible persona, P2P employs an active endowment generation procedure. This iterative process samples new endowments, uses them to elicit responses to survey questions, and then evaluates their contribution to response diversity using a ‘variability score’ based on question entropy. This score helps the system identify which types of endowments are most effective at generating varied responses, guiding subsequent generations. Strategies like ‘question patching’ and ‘mixed mode’ further enhance diversity by targeting questions with low response variability or combining attributes from different modes.
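The article does not give the exact formula for the variability score, so the following is only a minimal sketch of one plausible reading: per-question Shannon entropy of the agents' answers, normalized by the maximum possible entropy and averaged over questions. The function names and the threshold used to flag questions for patching are assumptions.

```python
import math
from collections import Counter


def question_entropy(answers):
    """Shannon entropy (bits) of the answer distribution for one question."""
    total = len(answers)
    probs = [count / total for count in Counter(answers).values()]
    return sum(-p * math.log2(p) for p in probs)


def normalized_entropy(answers, num_options):
    """Entropy scaled to [0, 1] by the maximum for num_options choices."""
    if num_options < 2:
        return 0.0
    return question_entropy(answers) / math.log2(num_options)


def variability_score(responses, num_options):
    """Mean normalized entropy across questions (assumed definition).

    responses: dict mapping question id -> list of agent answers
    num_options: dict mapping question id -> number of answer options
    """
    scores = [
        normalized_entropy(answers, num_options[q])
        for q, answers in responses.items()
    ]
    return sum(scores) / len(scores)


def low_variability_questions(responses, num_options, threshold=0.3):
    """Questions whose answers are too uniform; candidates for patching."""
    return [
        q for q, answers in responses.items()
        if normalized_entropy(answers, num_options[q]) < threshold
    ]


responses = {"q1": ["a", "b", "a", "b"], "q2": ["x", "x", "x", "x"]}
num_options = {"q1": 2, "q2": 4}
print(variability_score(responses, num_options))
print(low_variability_questions(responses, num_options))
```

In this toy pool every agent gives the same answer to q2, so q2 is the question a patching step would target with new endowments.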

Stage 2: Regression-Based Aggregation

Once a candidate pool of diverse proxy agents is assembled, the second stage focuses on aggregating their responses to reconstruct the observed population preferences. This is framed as a supervised learning problem. Given aggregate responses for a subset of survey questions, P2P uses regression methods, specifically constrained lasso or elastic net, to assign weights to each proxy agent. These weights reflect how much each agent contributes to approximating the overall population’s responses.

The goal is to find a compact, representative ensemble of agents whose weighted responses accurately match the aggregate ground-truth data. This method is particularly powerful because it is ‘demographic-agnostic,’ meaning it doesn’t require explicit demographic data for alignment, relying instead on aggregate survey results for generalizability and parsimony.
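The weighting step can be sketched as a non-negative lasso fitted by coordinate descent. This is an illustration under stated assumptions rather than the paper's implementation: the article does not specify the constraint, so non-negative weights are assumed, and the design matrix layout (one row per question-option pair, one column per agent) and the toy data are invented for the example.

```python
def fit_nonneg_lasso(X, y, alpha=0.01, n_iter=500):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*sum(w) subject to w >= 0.

    X: list of rows; X[i][j] is 1.0 if agent j chose question-option cell i.
    y: observed population share for each cell.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # agent never selected any cell; weight stays 0
            # Correlation of column j with the residual that excludes agent j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)
            ) / n
            new_w = max(0.0, (rho - alpha) / col_sq[j])
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def predict(X, w):
    """Weighted ensemble response share for each question-option cell."""
    return [sum(x_ij * w_j for x_ij, w_j in zip(row, w)) for row in X]


# Toy data: rows are (Q1 opt0, Q1 opt1, Q2 opt0, Q2 opt1); columns are agents.
X = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
y = [0.7, 0.3, 0.4, 0.6]  # hypothetical aggregate population shares

weights = fit_nonneg_lasso(X, y, alpha=0.001)
print([round(w_j, 3) for w_j in weights])
```

Raising `alpha` drives more weights to exactly zero, which is how a lasso-style penalty yields a compact ensemble. In practice the fit would use only a subset of questions, with the rest held out to check generalization.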

Real-World Validation and Broader Impact

The researchers tested P2P on real-world opinion survey data from the American Trends Panel (ATP) Wave 42, which explores public trust in science. The results were encouraging: the aligned agent populations reproduced aggregate response patterns with high fidelity, achieving low root mean squared error (RMSE) between predicted and observed response shares. The system also demonstrated substantial response diversity, even without demographic conditioning.
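For readers unfamiliar with the metric, RMSE here compares the ensemble's predicted response shares with the observed ones. The snippet below is a generic definition with made-up numbers, not results from the paper.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length lists of shares."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical shares for one two-option question.
predicted_shares = [0.5, 0.5]
observed_shares = [0.7, 0.3]
print(round(rmse(predicted_shares, observed_shares), 3))
```

A lower value means the weighted agents' answer distribution sits closer to the population's distribution on the evaluated questions.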

Beyond improving data efficiency in social science research, P2P offers a valuable testbed for studying ‘pluralistic alignment’, the idea of representing a spectrum of human values rather than a single ideal. It provides a controlled environment for evaluating different prompt engineering strategies and understanding how they influence agent diversity and performance. The framework also lets social scientists simulate and test preference theories, support survey design, and mitigate nonresponse bias, serving as a cost-effective complement to traditional data collection.

The P2P framework represents a significant step forward in leveraging LLMs for social science research, offering a scalable, flexible, and theoretically grounded alternative for modeling human preferences in a complex and evolving world.
