TLDR: This research introduces Individualized Cognitive Simulation (ICS) to evaluate how large language models (LLMs) can mimic the unique thought processes and writing styles of specific authors. By testing different cognitive representations like linguistic features, concept mappings, and author profiles, the study found that combining conceptual and linguistic features is most effective. It also highlights that LLMs are better at replicating surface-level linguistic style than deeper narrative structure, suggesting a need for new approaches to achieve more faithful cognitive simulation.
Large Language Models (LLMs) have become incredibly adept at mimicking human-like text, from engaging in role-play to generating creative content. However, a new research paper delves into a more profound challenge: can these AI models truly simulate the unique thought processes of specific individuals? This is the core question behind Individualized Cognitive Simulation (ICS), a novel task introduced by researchers Tianyi Zhang, Xiaolin Zhou, Yunzhe Wang, Erik Cambria, David Traum, and Rui Mao.
The paper, titled “Individualized Cognitive Simulation in Large Language Models: Evaluating Different Cognitive Representation Methods,” explores how LLMs can be guided to approximate an author’s cognitive and stylistic processes in narrative continuation. While LLMs can convincingly reproduce surface-level human behavior, their ability to simulate deeper, individualized cognitive patterns has remained largely unexplored.
To address this gap, the researchers developed a unique evaluation framework. They created a dataset using recently published novels (published after the LLMs’ training cut-off dates) to prevent any data leakage. This dataset allowed them to test how well seven different off-the-shelf LLMs could emulate the authorial style of original texts when provided with various cognitive representations.
Exploring Cognitive Representations
The study investigated three main families of cognitive representations:
- Linguistic Features: These capture an author’s direct writing habits, including vocabulary, sentence structure, semantic themes, and pragmatic tone. Think of them as the author’s unique linguistic fingerprint.
- Concept Mappings: Grounded in conceptual metaphor theory, these reflect how individuals understand abstract ideas through concrete domains (e.g., “time is money”). They aim to capture deeper, often unconscious, structures of meaning-making.
- Author Profiles: This includes biographical and psychological dimensions such as persona (age, nationality, writing habits), background (education, life events), and personality traits (Big Five/OCEAN model). These offer a higher-level view of factors influencing writing style.
The researchers designed eleven experimental conditions, testing each single feature as well as various multi-feature combinations. This systematic approach allowed them to compare the influence of individual cognitive dimensions and their interactions.
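The condition design can be sketched in code. The snippet below is a minimal illustration, not the paper's actual setup: the feature names, descriptions, and prompt wording are all assumptions, and a full enumeration over five features yields 31 combinations, of which the authors selected eleven.

```python
from itertools import combinations

# Hypothetical cognitive representations for one author; the paper's
# actual feature extraction and prompt templates are not reproduced here.
FEATURES = {
    "linguistic": "Vocabulary, sentence structure, semantic themes, pragmatic tone.",
    "concept_mapping": "Conceptual metaphors the author relies on, e.g. TIME IS MONEY.",
    "persona": "Age, nationality, writing habits.",
    "background": "Education, formative life events.",
    "personality": "Big Five (OCEAN) trait summary.",
}

def build_conditions(feature_names):
    """Enumerate every single feature plus all multi-feature combinations."""
    conditions = []
    for r in range(1, len(feature_names) + 1):
        conditions.extend(combinations(feature_names, r))
    return conditions

def build_prompt(condition, excerpt):
    """Assemble a narrative-continuation prompt conditioned on the chosen features."""
    blocks = [f"{name}: {FEATURES[name]}" for name in condition]
    return (
        "Continue the following novel excerpt in the original author's style.\n"
        "Author representation:\n" + "\n".join(blocks)
        + "\n\nExcerpt:\n" + excerpt
    )

conditions = build_conditions(list(FEATURES))
```

A study would then sample one prompt per (condition, excerpt) pair and send it to each model under test.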
Evaluation and Key Findings
The evaluation involved both LLM-based automatic metrics and human judgment. LLMs were used to assess linguistic style and narrative structure, while human evaluators (Literature/English majors) provided blinded ratings on linguistic style fidelity, narrative structure preservation, and overall authorial authenticity.
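A blinded human evaluation of this kind reduces to averaging rater scores per condition and dimension. The sketch below uses invented ratings and an assumed 1-to-5 scale; the paper's actual rubric and aggregation are not shown here.

```python
from statistics import mean

# Hypothetical blinded ratings (1-5) keyed by (condition, dimension);
# the dimensions mirror the three human-judged criteria described above.
ratings = {
    ("concept+linguistic", "style"):        [5, 4, 5],
    ("concept+linguistic", "structure"):    [3, 2, 3],
    ("concept+linguistic", "authenticity"): [4, 4, 5],
    ("profile", "style"):        [3, 3, 2],
    ("profile", "structure"):    [2, 2, 2],
    ("profile", "authenticity"): [2, 3, 2],
}

def condition_scores(ratings):
    """Average raters per (condition, dimension), then add an overall mean."""
    by_condition = {}
    for (cond, dim), vals in ratings.items():
        by_condition.setdefault(cond, {})[dim] = mean(vals)
    for dims in by_condition.values():
        dims["overall"] = mean(dims.values())
    return by_condition

scores = condition_scores(ratings)
```

With these toy numbers, the concept-plus-linguistic condition outscores the profile condition overall, matching the direction of the paper's reported result.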
The results were compelling: the combination of concept mappings and linguistic features consistently achieved the highest overall performance in both automatic and human evaluations. This synergy suggests that while linguistic cues capture surface-level stylistic habits, concept mappings provide deeper insight into how authors conceptualize and structure meaning, offering a complementary pathway to more faithful style emulation.
In contrast, profile-based features (persona, background, personality traits) yielded limited improvements and sometimes even degraded performance when combined with other features. This highlights the difficulty of translating high-level biographical information into effective narrative generation signals.
A significant finding was that LLMs are more effective at mimicking linguistic style than narrative structure. While linguistic ratings were relatively high, structural similarity scores remained uniformly low across all settings, indicating that models struggle with the deeper cognitive aspects of story organization and event progression.
Interestingly, there was a discrepancy in how LLMs and human evaluators perceived the ‘Profile’ feature. LLMs tended to rate its linguistic quality higher than human judges, suggesting potential blind spots in automated linguistic assessment for profile-conditioned content.
Model Performance and Future Directions
Among the tested models, Google’s Gemini Pro 1.5 performed best overall, excelling in linguistic style, while Llama-3.2 3B Instruct achieved the highest structural similarity. The study also observed that scaling up models (larger parameter counts) generally improved surface-level linguistic style but did not necessarily improve deeper structural similarity. This suggests that simply making models larger isn’t enough for advanced cognitive simulation.
The research concludes that these findings provide a crucial foundation for developing AI systems that can truly adapt to individual ways of thinking and expression. To advance beyond mere stylistic mimicry towards more faithful cognitive simulation, new training, prompting, and decoding methods are needed, rather than just increasing model size. For more details, you can read the full research paper here.