TLDR: A new study introduces an evaluation framework for Large Language Models (LLMs) simulating human personality. It finds that traditional psychological tests are inadequate and proposes individual and population-level analyses. The research demonstrates a “Scaling Law”: the more detailed and realistic the persona profile provided to an LLM, the more stable, identifiable, and human-consistent the simulated personality becomes. This suggests that rich persona descriptions are crucial for accurate AI personality simulation, outperforming model scale or complex fine-tuning, and offers a path to mitigate biases and improve realism in AI-driven social experiments.
Large Language Models (LLMs) are rapidly transforming various fields, and their potential to simulate human behavior in social experiments is a particularly exciting area. However, accurately evaluating how well these models emulate human personality has been a significant challenge. Traditional psychometric checks, such as confirmatory factor analysis (CFA) and construct validity testing, often fall short when applied to LLMs, leading to misinterpretations or premature conclusions about their capabilities.
A New Lens for AI Personality
A recent research paper, titled “Scaling Law in LLM Simulated Personality: More Detailed and Realistic Persona Profile Is All You Need,” introduces a novel, end-to-end evaluation framework designed specifically to assess LLMs’ ability to simulate human personality. This framework moves beyond traditional psychometric approaches, focusing instead on how LLM-simulated personalities move closer to human-like traits as conditions change. The core question is not whether an LLM ‘has’ a personality, but how effectively it can ‘simulate’ one through role-playing, guided by specific persona profiles.
Building Virtual Individuals and Populations
The research proposes a two-stage process. First, virtual persona profiles are created by combining real-world census data with LLM-driven generation. This means starting with basic demographic information (like age, gender, occupation) and then using an LLM to enrich these ‘skeletal’ profiles with detailed life experiences, interests, and behavioral characteristics. This process is intended to make the virtual personas both statistically realistic and diverse. Second, these detailed persona profiles are fed to an LLM, which role-plays each persona and answers a personality assessment: the IPIP-NEO-120 questionnaire, which measures the widely recognized Big Five personality traits.
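As a rough illustration of this two-stage setup, the Python sketch below assembles a role-play prompt from a persona and scores 1-to-5 answers for a single trait. The `Persona` fields, the prompt wording, the item ids, and the helper names are all assumptions made for illustration; they are not the paper's actual code or prompts, and no real LLM call is made.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Persona:
    # Stage 1a: skeletal demographics (in the study, drawn from census data).
    age: int
    gender: str
    occupation: str
    # Stage 1b: LLM-written enrichment (life experiences, interests, habits).
    details: list = field(default_factory=list)


def build_roleplay_prompt(persona, item_text):
    """Stage 2: ask the model to rate one questionnaire item in character.

    The wording here is hypothetical, not the prompt used in the paper.
    """
    lines = [
        f"You are a {persona.age}-year-old {persona.gender} "
        f"working as a {persona.occupation}.",
        *persona.details,
        f'Statement: "{item_text}"',
        "Answer with a number from 1 (very inaccurate) to 5 (very accurate).",
    ]
    return "\n".join(lines)


def score_trait(responses, reverse_keyed):
    """Average 1-5 answers for one trait, flipping reverse-keyed items.

    responses: dict mapping item id -> answer (1..5)
    reverse_keyed: set of item ids whose answers are scored as 6 - answer
    """
    return mean(
        6 - answer if item in reverse_keyed else answer
        for item, answer in responses.items()
    )
```

In a real pipeline, the prompt would be sent to a model once per questionnaire item, and the parsed answers would be grouped by trait before calling `score_trait`.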
Individual Personalities: Stability and Uniqueness
At the individual level, the study investigated two crucial aspects: personality stability and identifiability. Stability refers to whether an LLM-driven character maintains consistent personality traits over repeated assessments. Identifiability, on the other hand, checks whether different persona profiles consistently produce distinct individual characteristics. The findings were clear: the level of detail in the persona profile significantly affects both. More detailed and realistic profiles led to virtual characters exhibiting greater stability and more distinct personalities. Interestingly, the benefit was uneven. For personas that were already quite different from each other, adding more detail had a smaller impact on identifiability, while for personas that were initially quite similar, increased detail made a large difference. This points to a ‘marginal utility’ effect of persona detail.
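To make the two notions concrete, here is a minimal Python sketch of one way they could be measured. The metrics are assumptions chosen for illustration, not necessarily the ones used in the paper: instability is taken as the average per-trait standard deviation across repeated runs, and identifiability as the share of runs that sit closest to their own persona's average profile.

```python
import math
from statistics import mean, pstdev


def centroid(runs):
    """Average trait vector over a persona's repeated assessments."""
    return [mean(column) for column in zip(*runs)]


def instability(runs):
    """Mean per-trait standard deviation across runs (0 = perfectly stable)."""
    return mean(pstdev(column) for column in zip(*runs))


def identifiability(scores):
    """Fraction of runs whose nearest persona centroid is their own.

    scores: dict mapping persona name -> list of runs,
            each run being a list of trait scores (e.g. five Big Five values).
    Note: each run also contributes to its own centroid, so this simple
    version is slightly optimistic.
    """
    centroids = {name: centroid(runs) for name, runs in scores.items()}
    hits = 0
    total = 0
    for name, runs in scores.items():
        for run in runs:
            nearest = min(centroids, key=lambda n: math.dist(run, centroids[n]))
            hits += nearest == name
            total += 1
    return hits / total
```

Under these definitions, richer persona profiles would be expected to lower `instability` for each persona and raise `identifiability` across the set, with the largest gains for personas that start out similar.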
Population Trends: Mirroring Human Behavior
Extending the analysis to the population level, the researchers examined whether the age-related trajectories of the Big Five personality traits in virtual populations aligned with empirical human survey data. This is a rigorous benchmark, as these age-related personality patterns are known to be stable across different human populations. The study compared several persona generation strategies:
- Standard LLM-Generated Personas: Basic profiles showed significant deviations from human data, often exhibiting an “overly positive” bias.
- Bias Mitigation: Using “anti-alignment” prompts (e.g., instructing the model not to appear perfect) partially reduced this bias, bringing the curves closer to human baselines.
- Narrative Generation: Framing persona creation as a novel-writing task, encouraging the LLM to build complex, story-driven characters, led to a marked improvement in realism and alignment with human personality curves.
- Literary Personas: Using human-authored literary characters from sources like Wikidata as persona profiles yielded the most significant improvement, with LLM-generated personality distributions closely mirroring real human patterns.
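A population-level comparison of this kind can be sketched in a few lines of Python. The version below is an assumption about how one might do it, not the paper's procedure: it averages a trait score within age bands for the simulated population and then measures the mean absolute gap to a human reference curve. The band width and the gap metric are illustrative choices.

```python
from collections import defaultdict
from statistics import mean


def age_curve(records, band=10):
    """Mean trait score per age band.

    records: iterable of (age, score) pairs for one trait.
    Returns a dict mapping the band's starting age -> mean score.
    """
    buckets = defaultdict(list)
    for age, score in records:
        buckets[(age // band) * band].append(score)
    return {start: mean(values) for start, values in sorted(buckets.items())}


def curve_gap(simulated, human):
    """Mean absolute difference over the age bands present in both curves."""
    shared = simulated.keys() & human.keys()
    if not shared:
        raise ValueError("the two curves share no age bands")
    return mean(abs(simulated[band] - human[band]) for band in shared)
```

Running `curve_gap` once per trait and per persona strategy gives a simple table of how far each strategy's virtual population sits from the human baseline.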
The Scaling Law of LLM Personality
The most profound discovery of this research is the identification of a “Scaling Law in LLM Personality.” This law states that as persona profiles become increasingly detailed and realistic, the discrepancy between LLM-generated personality distributions and real human populations systematically diminishes. The study demonstrated a clear progression: from standard profiles to anti-alignment, then narrative-driven, and finally human-authored characters, the alignment with human personality curves consistently improved. This suggests that the quality of personality simulation is not primarily dependent on the LLM’s scale or complex fine-tuning, but rather on the richness and realism of the information provided in the persona profiles. In essence, “More Detailed and Realistic Persona Profile Is All You Need.”
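One simple way to check for such a progression is to order the persona strategies from least to most detailed and test whether the gap to human data shrinks at every step. The helper below is a hypothetical sketch of that check; the function name and the idea of a single scalar gap per strategy are assumptions, and no values from the paper are reproduced here.

```python
def improves_monotonically(gaps):
    """True if every gap is strictly smaller than the one before it.

    gaps: gap-to-human values ordered from the least detailed persona
          strategy to the most detailed one (smaller = closer to humans).
    """
    values = list(gaps)
    return all(later < earlier for earlier, later in zip(values, values[1:]))
```

The input would be one gap value per strategy, in the order standard, anti-alignment, narrative, literary.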
Implications and Ethical Considerations
This scaling law offers a clear and actionable path for improving LLM-based social simulations. It suggests that many challenges previously reported in LLM personality simulation—such as issues with diversity, bias, sycophancy, alienness, and generalization—can be largely attributed to insufficient persona detail and can be overcome by providing richer, more realistic profiles. The research also acknowledges important ethical risks, including the potential for social control and privacy infringement if large-scale, detailed persona simulations are deployed beyond research contexts. These concerns highlight the need for strong ethical safeguards and transparency in future AI development.
This work provides a framework for evaluating and improving the capacity of LLMs to simulate human personality, paving the way for more accurate and reliable social science experiments using AI. For more details, see the full research paper.


