TL;DR: This research paper analyzes 83 persona prompts from 27 articles to understand how large language models (LLMs) are used to generate user personas. Key findings show that LLMs primarily create single, concise, text-based personas that often include demographic data. While GPT models dominate, researchers are increasingly using multi-prompt strategies and integrating dynamic data, highlighting both opportunities and challenges for user representation.
User personas are fictional representations of user groups, built on real data, that help designers and stakeholders make informed decisions. Traditionally, human experts analyzed user data to create these detailed profiles. However, with the rise of artificial intelligence, particularly large language models (LLMs), the process of persona creation is evolving rapidly.
A recent research paper, titled “Using AI for User Representation: An Analysis of 83 Persona Prompts,” examines how researchers are currently leveraging LLMs for this purpose. Authored by Joni Salminen, Danial Amin, and Bernard J. Jansen, the study provides a comprehensive look at the prompting strategies used to generate these AI-driven personas.
Why Researchers Use AI for Personas
The study found that researchers employ LLMs to create, evaluate, and apply personas across a wide array of applications, ranging from educational tools for training counselors to proxies for understanding audiences and even aids for storytelling. While persona generation is the primary use case (81.48% of studies), LLMs are also used to predict persona behaviors and to evaluate persona quality. Notably, over half of the prompts request the persona output in a structured format, most commonly JSON, which is particularly useful for further data analysis.
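To make the structured-output pattern concrete, here is a minimal sketch of a persona prompt that requests JSON, plus a validation step. This is an illustrative example, not a prompt from the paper; the field names and the `parse_persona` helper are invented for demonstration.

```python
import json

# Hypothetical prompt in the spirit of the JSON-output prompts the paper surveys.
# The requested keys are illustrative, not taken from any specific study.
PERSONA_PROMPT = """
Based on the user data below, generate ONE persona as a JSON object with the
keys "name", "age", "occupation", "goals", and "frustrations". Return only
valid JSON, with no extra commentary.

User data:
{user_data}
"""

def parse_persona(llm_response: str) -> dict:
    """Check that the model actually returned the requested structure."""
    persona = json.loads(llm_response)
    required = {"name", "age", "occupation", "goals", "frustrations"}
    missing = required - persona.keys()
    if missing:
        raise ValueError(f"Persona response is missing fields: {missing}")
    return persona
```

A structured format like this is what makes downstream analysis (counting attributes, comparing personas across runs) straightforward, which likely explains its popularity in the surveyed studies.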
How Prompts Shape AI-Generated Personas
The research highlights that GPT models are overwhelmingly dominant in persona generation, appearing in over 76% of all model instances. The complexity of prompts varies significantly, from simple one-liners to intricate, multi-stage systems that guide the LLM through a complete persona generation process. On average, researchers use about three prompts per study, with some employing as many as 12. A notable trend is the dynamic insertion of data or variables into prompts, occurring in nearly three out of four cases. This allows for the integration of real user data directly into the AI-driven persona creation process, moving towards what some call “computational personas.”
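The dynamic-insertion pattern described above can be sketched with simple string templating. The segment fields, product name, and word limit below are invented for illustration; the point is only that real user data flows into the prompt rather than being hand-summarized first.

```python
from string import Template

# Illustrative template showing dynamic insertion of user data into a persona
# prompt, the pattern the paper observed in roughly three out of four cases.
SEGMENT_PROMPT = Template(
    "You are generating a user persona for the $product team.\n"
    "The persona must reflect this behavioral segment:\n"
    "- average sessions per week: $sessions\n"
    "- top feature used: $top_feature\n"
    "Describe the persona in under $max_words words."
)

def build_prompt(segment: dict, product: str, max_words: int = 150) -> str:
    """Interpolate a user-data segment into the prompt template."""
    return SEGMENT_PROMPT.substitute(
        product=product,
        sessions=segment["sessions"],
        top_feature=segment["top_feature"],
        max_words=max_words,
    )
```

In a multi-prompt pipeline, the output of one such call (e.g. a clustering summary) would feed the next prompt, which is what makes these “computational persona” systems harder to evaluate as a whole.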
Characteristics of AI-Generated Personas
When it comes to the output, personas are predominantly generated as text and numbers, with image generation surprisingly infrequent. Most prompts specify how many personas to generate, often requesting a single persona, which deviates from the traditional goal of representing diverse user populations. Researchers also frequently constrain the length of the persona output, often aiming for concise descriptions, which contrasts with the traditional emphasis on rich, detailed persona narratives.
Demographic information, such as age, name, and occupation, is the most common type of data included in AI-generated personas, appearing in nearly 78% of the prompt entries. Other traditional user representation information, including behaviors, attitudes, and contextual details, is also reasonably prevalent, suggesting that established practices from classic persona development are being carried over into the new technological environment.
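The mix of information types the paper found most common can be pictured as a simple schema: demographics as the core, with behaviors, attitudes, and context layered on. The class below is a hypothetical sketch of such a schema, not a structure defined in the paper.

```python
from dataclasses import dataclass, field

# Illustrative persona schema mirroring the information types the study found
# most prevalent: demographics first, then behaviors, attitudes, and context.
@dataclass
class Persona:
    name: str
    age: int
    occupation: str
    behaviors: list = field(default_factory=list)   # e.g. usage patterns
    attitudes: list = field(default_factory=list)   # e.g. preferences, concerns
    context: str = ""                               # situational details

    def is_demographics_only(self) -> bool:
        """True if the persona carries no behavioral or attitudinal depth."""
        return not (self.behaviors or self.attitudes or self.context)
```

A check like `is_demographics_only` hints at one way to audit generated personas: a persona that is all demographics and no behavior is exactly the thin representation the persona literature warns against.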
Implications and Future Directions
The study points out that while LLMs offer opportunities for faster and more efficient persona creation, they also introduce new challenges. The practice of integrating data directly into prompts, while powerful, can reduce transparency and human oversight. The chaining of multiple prompts, though increasing sophistication, makes evaluating the overall system more complex. The prevalence of single persona generation and the heavy reliance on GPT models without extensive cross-model comparison also raise questions about the diversity and optimal quality of the generated personas.
The authors recommend that to maintain the ‘data-driven’ principle, primary user data should always be included in persona prompts. They also suggest that researchers familiarize themselves with established persona theory to ensure that AI-generated personas are not just technically sound but also empathetic, representative, and truly useful for design and decision-making processes.