Crafting Digital Lives: How AI Generates Realistic Smartphone Usage Data

TL;DR: This research explores using Large Language Models (LLMs) such as ChatGPT-4o to create synthetic smartphone usage data, as a way around the difficulty of collecting real-world logs. The study tested four prompt strategies and found that detailed prompts improved data structure. LLMs can generate plausible data for some aspects of usage, but capturing the nuance of human behavior, such as sleep patterns and app diversity, remains a challenge. The results point to the need for use-case-specific evaluation metrics and further research.

Collecting data on how people use their smartphones offers valuable insights into human behavior and interaction with technology. This information can help us understand mobile device usage, infer behaviors, and even design personalized digital interventions. However, gathering large-scale, real-world smartphone usage logs is difficult. Collection is costly and raises significant privacy concerns, while unrepresentative user samples and biases such as non-response can skew results, making it hard to get an accurate picture.

This is where Large Language Models (LLMs), such as OpenAI’s ChatGPT, come into play, offering a novel approach to generating synthetic smartphone usage data. These AI models are trained on vast amounts of text, including narratives about technology use, which enables them to produce human-like descriptions of user behavior. The approach promises low-cost, low-latency data and the potential for broader generalization, which is particularly useful for generating hypotheses and conducting pilot studies.

A recent case study investigated how four different prompt strategies influenced the quality of smartphone usage data generated by the ChatGPT-4o model. The researchers aimed to understand the feasibility of using LLMs for this purpose and to provide insights into effective prompt design and data quality measures. The study focused on two key factors for prompt design: the level of detail provided (whether describing a user persona or expected result characteristics) and the inclusion of seed data (providing an initial real usage example).
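The two design factors can be pictured as a two-by-two grid of prompt variants. The Python sketch below is only an illustration of that grid: the task wording, the persona text, the CSV column names, and the helper names `build_prompt` and `prompt_grid` are all hypothetical and are not the prompts used in the study.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Hypothetical wording; the study's actual prompts are not reproduced here.
BASE_TASK = (
    "Generate one day of smartphone app usage logs as CSV "
    "with the columns app,start_time,end_time."
)
PERSONA_DETAIL = (
    "The user is an office worker who commutes by train, "
    "uses messaging apps often, and sleeps at night."
)


def build_prompt(detailed: bool, seed_rows: Optional[List[str]]) -> str:
    """Assemble one prompt variant from the two design factors."""
    parts = [BASE_TASK]
    if detailed:
        parts.append(PERSONA_DETAIL)
    if seed_rows:
        parts.append("Example of real usage:\n" + "\n".join(seed_rows))
    return "\n\n".join(parts)


def prompt_grid(seed_rows: List[str]) -> Dict[Tuple[bool, bool], str]:
    """Return all four (detailed, with_seed) combinations."""
    return {
        (detailed, with_seed): build_prompt(
            detailed, seed_rows if with_seed else None
        )
        for detailed, with_seed in product([False, True], repeat=2)
    }
```

Keying the grid by the pair `(detailed, with_seed)` keeps the two factors separate, so the effect of each can be compared independently.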

The generated synthetic data was evaluated on two main levels: structural compliance and behavioral realism. Structural compliance checked if the data conformed to expected formats, like correct variables and timestamps. Behavioral realism assessed how well the synthetic usage patterns aligned with real-world behaviors, considering aspects like app session lengths, compatibility with circadian rhythms (sleep patterns), and app variety.
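A structural compliance check of this kind can be sketched in a few lines of Python. The column names, the timestamp format, and the function name `check_structure` are assumptions made for illustration; the study's actual schema and checks may differ.

```python
import csv
import io
from datetime import datetime
from typing import List

# Assumed schema and timestamp format, for illustration only.
EXPECTED_COLUMNS = ["app", "start_time", "end_time"]
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def check_structure(csv_text: str) -> List[str]:
    """Return a list of structural problems; an empty list means compliant."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected columns: {reader.fieldnames}"]

    problems = []
    # Data rows start on line 2, after the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            start = datetime.strptime(row["start_time"], TIME_FORMAT)
            end = datetime.strptime(row["end_time"], TIME_FORMAT)
        except (ValueError, TypeError):
            problems.append(f"line {line_no}: unparseable timestamp")
            continue
        if end < start:
            problems.append(f"line {line_no}: session ends before it starts")
    return problems
```

A check like this only says whether the output is well-formed. It says nothing about whether the usage it describes looks human, which is what the behavioral realism criteria address.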

The findings suggest that using LLMs to create structured and behaviorally plausible smartphone use datasets is indeed feasible for certain applications, especially when detailed prompts are used. The self-prompting strategies, which involved the LLM itself elaborating on an initial prompt to create a more detailed one, consistently produced structurally compliant datasets. This highlights the potential for non-expert users to generate useful data with these models.
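The self-prompting idea can be sketched as a two-step call. In the Python sketch below, the `llm` parameter stands for any function that sends text to a model and returns its reply. It is a placeholder rather than a real client API, and the wording of the elaboration request is an assumption.

```python
from typing import Callable


def self_prompt(initial_prompt: str, llm: Callable[[str], str]) -> str:
    """Ask the model to elaborate the prompt, then run the elaborated prompt."""
    # Step 1: the model rewrites the short request into a detailed one.
    elaboration_request = (
        "Rewrite the following request as a more detailed prompt, "
        "spelling out the expected format and content:\n\n" + initial_prompt
    )
    detailed_prompt = llm(elaboration_request)
    # Step 2: the detailed prompt is used to generate the data.
    return llm(detailed_prompt)
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider and makes it easy to test with a stub.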

However, mimicking the full complexity of real human behavior remains a significant challenge. None of the tested prompt strategies fully satisfied all behavioral realism criteria. For instance, while one detailed prompt strategy (P4) with seed data generated plausible app usage content and session duration distributions, it failed to produce any long inactivity intervals, which conflicts with typical human sleep cycles. This indicates that structural accuracy or even realism in one metric doesn’t guarantee it across all aspects of human behavior.
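One simple way to test for a sleep-like pause is to measure the longest gap between consecutive sessions. The Python sketch below assumes sessions are given as `(start, end)` datetime pairs. The five-hour threshold is an arbitrary illustration, not a value taken from the study.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Session = Tuple[datetime, datetime]


def longest_gap(sessions: List[Session]) -> timedelta:
    """Longest stretch with no app in use between the first and last session."""
    if len(sessions) < 2:
        return timedelta(0)
    ordered = sorted(sessions)
    longest = timedelta(0)
    latest_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > latest_end:
            longest = max(longest, start - latest_end)
        # Track the latest end seen so overlapping sessions are handled.
        latest_end = max(latest_end, end)
    return longest


def has_sleep_like_gap(sessions: List[Session], min_hours: float = 5.0) -> bool:
    """True if some inactivity interval is at least min_hours long (assumed threshold)."""
    return longest_gap(sessions) >= timedelta(hours=min_hours)
```

Gaps are measured only between logged sessions, so a log that covers a single calendar day would need the time before its first session and after its last to be considered separately.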

The study also observed that synthetic datasets generally had lower app variety compared to real data, particularly when no seed data was provided. This suggests a limitation of LLMs in generating highly diversified usage patterns. A trade-off between novelty and fidelity was also noted: prompts with seed data accurately reproduced the most used applications from the seed but offered less variety, while prompts without seed data introduced a broader mix of apps but with less fidelity to specific usage patterns. The ideal balance here depends on the intended use of the synthetic data; for example, hypothesis generation might benefit from novelty, while modeling a specific user might prioritize fidelity.
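Both sides of this trade-off can be expressed as simple numbers: the count of distinct apps for variety, and the share of the seed's most used apps that reappear among the synthetic data's most used apps for fidelity. The Python sketch below is one possible formulation. The function names and the choice of `k` are assumptions, not the study's metrics.

```python
from collections import Counter
from typing import List


def app_variety(apps: List[str]) -> int:
    """Number of distinct apps appearing in a usage log."""
    return len(set(apps))


def top_k_overlap(seed_apps: List[str], synthetic_apps: List[str], k: int = 3) -> float:
    """Share of the seed's k most used apps that are also in the synthetic top k."""
    seed_top = {app for app, _ in Counter(seed_apps).most_common(k)}
    synth_top = {app for app, _ in Counter(synthetic_apps).most_common(k)}
    if not seed_top:
        return 0.0
    return len(seed_top & synth_top) / len(seed_top)
```

A high overlap with low variety would match the seeded behavior described above, while low overlap with high variety would match the unseeded case.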

Future research directions include experimenting with larger and more diverse seed datasets, potentially including multiple users or atypical usage days to encourage more varied simulations. Exploring other LLMs, including open-source or specialized synthetic data generators, is also crucial. Ultimately, refining evaluation metrics that are specific to the use case will be critical for determining the true usefulness of LLM-generated smartphone usage data.

For more detailed information, you can read the full research paper here.
