
LLMs in Social Simulation: How AI Agents Mimic Human Group Dynamics

TLDR: A new study investigates how Large Language Model (LLM)-based multi-agent simulations can replicate human social dynamics like conformity, group polarization, and fragmentation in online forum settings. Researchers found that general-purpose LLMs, from small open models to GPT-4o, tend to conform to the majority, while models optimized for reasoning are more resistant to social influence and maintain diverse opinions. The findings suggest that model choice should align with the desired simulation outcome, whether it’s observing consensus or persistent dissent.

Recent advancements in Large Language Models (LLMs) are opening new doors for understanding complex human social interactions. A new research paper explores whether multi-agent simulations powered by LLMs can accurately reproduce core human social dynamics observed in online forums, such as conformity, group polarization, and fragmentation.

The study, titled “Towards Simulating Social Influence Dynamics with LLM-based Multi-agents,” was conducted by researchers from the Department of Information Management at National Sun Yat-sen University in Kaohsiung, Taiwan. The team included Hsien-Tsung Lin, Chan Hsu, Pei-Cing Huang, Pei-Xuan Shieh, Chan-Tung Ku, and Yihuang Kang. Their work investigates how different LLM scales and reasoning capabilities influence these social phenomena within a structured simulation framework.

Simulating Social Dynamics with AI Agents

The researchers designed a robust multi-agent conversational environment that mimics the asynchronous interaction patterns typical of Bulletin Board Systems (BBS) forums. In this setup, a central manager orchestrates message exchanges in a round-robin fashion, with each agent posting in sequence and all messages broadcast to every participant. Each agent was given a structured persona, including demographic attributes, communication style, and a fixed initial stance on a controversial topic, like whether governments should adopt stringent environmental policies. The interactions proceeded through five rounds of posting, allowing agents to reference and respond to previous messages, simulating a live forum thread.
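The paper does not publish its implementation, but the setup described above maps onto a simple orchestration loop. The sketch below is a minimal illustration assuming a generic `query_llm` backend; the `Agent` fields and the prompt wording are hypothetical stand-ins, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    persona: str   # demographic attributes + communication style
    stance: int    # 1 = Strongly Oppose ... 5 = Strongly Support

def query_llm(prompt: str) -> str:
    """Placeholder for a call to whichever LLM is under test."""
    raise NotImplementedError

def run_forum(agents: list[Agent], topic: str, rounds: int = 5) -> list[str]:
    """BBS-style thread: a central manager lets each agent post in
    round-robin order, and every post is broadcast to all participants
    via the shared transcript."""
    transcript: list[str] = []
    for _ in range(rounds):
        for agent in agents:  # fixed posting order each round
            prompt = (
                f"You are {agent.name}. Persona: {agent.persona}\n"
                f"Topic: {topic}\n"
                f"Your current stance (1=Strongly Oppose, 5=Strongly Support): "
                f"{agent.stance}\n"
                "Thread so far:\n" + "\n".join(transcript) + "\n"
                "Write your next forum post, referencing earlier messages."
            )
            transcript.append(f"{agent.name}: {query_llm(prompt)}")
        # (Re-eliciting each agent's stance after a round, used for the
        # metrics discussed below, is omitted here for brevity.)
    return transcript
```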

To ensure the reliability of their findings, each simulation setting was repeated 25 times, and the results were aggregated to observe overall patterns in conformity rates, polarization changes, and fragmentation. The study focused on three key social phenomena, measured as sketched in the code after this list:

  • Conformity: How individuals adjust their opinions to align with the majority view. In the simulation, a “conforming stance change” occurred when an agent’s shift in position brought it closer to the prevailing group stance.
  • Group Polarization: The tendency for initial moderate positions to become more extreme over time through interaction. Agent stances were tracked on a five-point scale from “Strongly Oppose” to “Strongly Support.”
  • Group Fragmentation: When participants split into distinct subgroups holding fundamentally opposing positions rather than converging on a consensus. This was measured by the balance between agents supporting and opposing the proposition.
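The paper defines the exact formulas; the functions below are only one plausible reading of the three descriptions above, assuming each agent's stance is recorded once per round on the 1–5 scale, with 3 treated as the neutral midpoint (an assumption of this sketch).

```python
def conformity_rate(trajectories: list[list[int]]) -> float:
    """Share of stance changes that move an agent closer to the group's
    mean stance at the time of the change ("conforming stance changes")."""
    changes = conforming = 0
    for t in range(1, len(trajectories[0])):
        group_mean = sum(traj[t - 1] for traj in trajectories) / len(trajectories)
        for traj in trajectories:
            if traj[t] != traj[t - 1]:
                changes += 1
                if abs(traj[t] - group_mean) < abs(traj[t - 1] - group_mean):
                    conforming += 1
    return conforming / changes if changes else 0.0

def polarization_change(trajectories: list[list[int]]) -> float:
    """Mean drift toward the extremes: average distance from the neutral
    midpoint (3) at the end of the run minus that distance at the start."""
    n = len(trajectories)
    start = sum(abs(traj[0] - 3) for traj in trajectories) / n
    end = sum(abs(traj[-1] - 3) for traj in trajectories) / n
    return end - start

def fragmentation(trajectories: list[list[int]]) -> float:
    """Balance between final supporters (stance > 3) and opponents
    (stance < 3): 1.0 is an even split, 0.0 is full consensus."""
    support = sum(1 for traj in trajectories if traj[-1] > 3)
    oppose = sum(1 for traj in trajectories if traj[-1] < 3)
    total = support + oppose
    return 2 * min(support, oppose) / total if total else 0.0
```

Here each trajectory is one agent's stance per round; averaging these three numbers over the 25 repeated runs per setting would yield the kind of aggregate figures reported below.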

Key Findings on Model Behavior

The experiments categorized LLMs into four groups based on their parameter scales, computational requirements, and reasoning features:

  • Group A (Smaller Models): Operable on a single GPU, balancing accessibility with linguistic competence (e.g., Qwen2.5-7B, Llama3.1-8B).
  • Group B (Mid-sized Models): Higher capacity but still feasible on limited computing resources (e.g., Qwen2.5-72B, Llama3.1-70B).
  • Group C (Proprietary LLMs): Widely adopted commercial models such as GPT-4o, Claude 3.5 Haiku, and Gemini 2.0 Flash.
  • Group D (Reasoning-Oriented Models): Architectures explicitly designed or fine-tuned for logical inference and reasoning (e.g., o1-mini, DeepSeek-R1).

The findings revealed interesting patterns in social alignment. Models in Groups A, B, and C generally showed moderate responsiveness to peer influence, with conformity rates typically between 10% and 20%. Notably, GPT-4o in Group C exhibited the highest conformity rate at 19.45%, suggesting that some larger generative models may be more susceptible to majority alignment.

In stark contrast, reasoning-oriented models in Group D displayed significantly lower conformity rates, with o1-mini showing just 3.13%. This indicates that models optimized for reasoning are better at maintaining their initial viewpoints under social pressure, likely owing to more consistent internal reasoning processes.

Regarding stance evolution, Groups A and B showed higher polarization changes and lower fragmentation, suggesting they were more open to external influence and tended to converge towards “support” or “strongly support” stances. This implies that smaller or mid-sized models with limited reasoning capabilities lean towards consensus. However, some models in these groups, such as Qwen2.5-72B and Qwen2.5-7B, still showed notable fragmentation, indicating that they can preserve dissent under certain conditions.

Group C models exhibited the lowest overall polarization change, suggesting stronger resilience against extreme stance shifts. While GPT-4o in this group showed low fragmentation, indicating convergence towards supportive stances, these advanced architectures generally maintained more consistent viewpoints.

Finally, Group D, the reasoning-focused models, consistently maintained a subset of agents in the “strongly oppose” category. This highlights that these models can hold firm adversarial stances even when the broader conversation trends are supportive. Fragmentation was also prominent in Group D, with a clear split between “strongly support” and “strongly oppose,” demonstrating that logic-centric designs retain diverse opinions and allow dissenting views to persist alongside majority positions.

Implications for AI and Social Science

The research demonstrates that LLM-based multi-agent simulations can effectively reproduce social phenomena like moderate conformity, group polarization, and persistent dissent. The stability and fragmentation observed in reasoning-focused models suggest their suitability for applications requiring stance durability or viewpoint heterogeneity, such as AI agents designed for deliberation or argumentation.

Conversely, mid-sized or large generative models appear more prone to aligning with the majority, especially when repeated interactions foster a perceived consensus. This implies that researchers aiming to simulate extreme stance shifts or group consensus might prefer LLMs with simpler generative capacities, while those studying persistent dissent or strongly defended positions might opt for more reasoning-focused LLMs.

In essence, the choice of LLM for social simulations should align with the specific research goal: whether to observe realistic opinion shifts and consensus formation or to maintain heterogeneity and allow contrarian stances to flourish. This study provides valuable insights for both computational social science and the development of agentic AI. You can read the full research paper here.

Rhea Bhattacharya (https://blogs.edgentiq.com)
Rhea Bhattacharya is an AI correspondent with a keen eye for cultural, social, and ethical trends in Generative AI. With a background in sociology and digital ethics, she delivers high-context stories that explore the intersection of AI with everyday lives, governance, and global equity. Her news coverage is analytical, human-centric, and always ahead of the curve. You can reach her at: [email protected]
