TLDR: A new research paper introduces HeTLM, a heterogeneity-aware training approach that helps language models capture subjective user browsing behaviors. It finds that smaller LLMs, when pre-trained on browsing data with a page-level tokenizer, can outperform larger, general-purpose LLMs. HeTLM clusters users with similar browsing patterns and trains a dedicated, smaller model for each cluster, leading to improved average performance, reduced variance across users, and better prediction of outcomes like purchases.
In the rapidly evolving world of artificial intelligence, Large Language Models (LLMs) are often seen as versatile tools capable of understanding and adapting to a wide array of user behaviors and preferences. However, a recent research paper from Adobe Research challenges this perception, especially when it comes to the highly subjective and unique ways users interact with websites and applications. The paper, titled “Subjective Behaviors and Preferences in LLM: Language of Browsing,” introduces a novel approach called HeTLM (Heterogeneity aware Training of Language Model) to better align AI models with individual user browsing patterns.
The core idea behind this research, conducted by Sai Sundaresan, Harshita Chopra, Atanu R. Sinha, Koustava Goswami, Nagasai Saketh Naidu, Raghav Karan, and Anushka, revolves around the concept of the “language of browsing.” This refers to the unique sequence of pages a user visits, forming a kind of personal, unstructured language. Unlike natural language, which has clear grammar and rules, browsing behavior is idiosyncratic and varies greatly from person to person. The researchers question whether a single, large LLM can truly capture these diverse and subjective behaviors effectively.
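To make this concrete, here is a toy sketch (with hypothetical page names) of how a browsing session can be read as a “sentence” in this language:

```python
# A toy illustration of the "language of browsing": each session is a
# "sentence" whose words are the pages a user visits, in order.
# Page names below are hypothetical.
sessions = [
    ["home", "search", "product_A", "product_A_reviews", "add_to_cart"],
    ["home", "deals", "product_B", "home", "product_C"],
]

# Unlike natural language, there is no shared grammar: two users can
# produce very different "sentences" over the same vocabulary of pages.
for session in sessions:
    print(" -> ".join(session))
```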
The Challenge of User Heterogeneity
Traditional LLMs, even with extensive pre-training or fine-tuning, often aim for high average performance across all users. However, this can mask significant variations in performance at the individual user level. If a model performs well on average but poorly for many specific users, its ability to truly align with user preferences is weak. This is particularly critical for online businesses that rely on understanding and predicting user actions, such as adding items to a cart or making a purchase.
Introducing HeTLM: A Tailored Approach
To address this challenge, the researchers developed HeTLM. This innovative framework is designed to recognize and adapt to the inherent heterogeneity in user browsing behaviors. Instead of training one large model for everyone, HeTLM clusters users based on their browsing patterns and then trains smaller, specialized language models for each identified cluster. This allows each mini-LLM to become an expert in the browsing “language” of its specific user group.
The HeTLM architecture integrates embedding-based clustering and fine-tuning in an endogenous manner: the clustering informs the fine-tuning, and the fine-tuning in turn guides the clustering. It uses an Actor-Critic framework in which an “Encoder” turns user sessions into numerical representations, a “Selector” (the Actor) assigns users to clusters, and multiple “Predictors” (the Critics), small LLMs, are each fine-tuned on one cluster’s data. This iterative process refines both the cluster assignments and the specialized models, ensuring better alignment with user-specific patterns.
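A highly simplified sketch of this alternating loop is shown below; it is an illustration of the idea rather than the paper’s implementation, and the helpers `score_fit` (how well a cluster’s LM explains a user’s sessions) and `fine_tune` (an update of that LM on the cluster’s data) are hypothetical placeholders.

```python
# Illustrative sketch of heterogeneity-aware training (not the paper's code):
# alternate between routing users to the cluster whose small LM fits them best
# and fine-tuning each cluster's LM on its assigned users' sessions.
def train_hetlm(users, cluster_lms, score_fit, fine_tune, rounds=5):
    """users: list of (user_id, sessions); cluster_lms: one small LM per cluster."""
    # Arbitrary initial assignment of users to clusters.
    assignments = {uid: i % len(cluster_lms) for i, (uid, _) in enumerate(users)}

    for _ in range(rounds):
        # Selector (Actor) step: reassign each user to the best-fitting cluster.
        for uid, sessions in users:
            scores = [score_fit(lm, sessions) for lm in cluster_lms]
            assignments[uid] = max(range(len(cluster_lms)), key=scores.__getitem__)

        # Predictor (Critic) step: fine-tune each cluster's LM on its users' data.
        for k, lm in enumerate(cluster_lms):
            cluster_data = [s for uid, sessions in users
                            if assignments[uid] == k for s in sessions]
            fine_tune(lm, cluster_data)

    return cluster_lms, assignments
```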
Key Findings and Advantages
The research yielded several remarkable findings:
- Small LMs Outperform Large LMs: Surprisingly, small language models (such as OPT-350M, QWEN-2.5-500M, and SmolLM2-360M), when pre-trained with a custom page-level tokenizer on browsing data, significantly outperformed much larger, general-purpose LLMs (such as GPT-4o, Llama-3-8B, Mistral-7B, and Gemma-7B) fine-tuned on the same data. This suggests that specialized training and tokenization matter more than sheer model size for this task; a minimal sketch of a page-level tokenizer appears after this list.
- Improved Alignment with HeTLM: HeTLM, with its cluster-specific models, demonstrated superior average performance in page generation and outcome prediction compared to a single, larger LLM from the same family. Crucially, it also achieved a lower variance in performance across users, indicating better alignment with individual user behaviors.
- Better Outcome Prediction: The cluster-wise trained HeTLM models also performed exceptionally well in predicting important business outcomes like “add to cart” and “purchase,” which are vital for online firms.
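As noted above, the small models were pre-trained with a page-level tokenizer. Below is a minimal sketch of what such a tokenizer could look like, assuming each distinct page maps to exactly one token (the class and page names are hypothetical, not the paper’s implementation):

```python
# Minimal sketch of a page-level tokenizer: every whole page becomes one token,
# instead of a URL or page name being split into subword pieces.
class PageLevelTokenizer:
    def __init__(self, pages):
        # Reserve 0 for padding and 1 for unseen pages.
        self.page_to_id = {"<pad>": 0, "<unk>": 1}
        for page in pages:
            self.page_to_id.setdefault(page, len(self.page_to_id))
        self.id_to_page = {i: p for p, i in self.page_to_id.items()}

    def encode(self, session):
        return [self.page_to_id.get(page, 1) for page in session]

    def decode(self, token_ids):
        return [self.id_to_page.get(i, "<unk>") for i in token_ids]

# Hypothetical usage: the vocabulary is the set of pages on the site.
tok = PageLevelTokenizer(["home", "search", "product_A", "add_to_cart", "checkout"])
print(tok.encode(["home", "product_A", "add_to_cart", "order_confirmation"]))  # unseen page -> <unk>
```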
The paper highlights that the “language of browsing” is a unique data type that requires a tailored approach. By acknowledging and addressing the heterogeneity of user behaviors, HeTLM offers a path to more accurate and personalized predictions. This has significant implications for various business applications, including predictive targeting (forecasting product interest), predictive journey mapping (understanding future page sequences), user segmentation, and personalized recommendations.
While the research acknowledges limitations such as the use of public datasets and the need for further scalability analysis, it firmly establishes the value of heterogeneity-aware training for LLMs in understanding subjective user behaviors. For more technical details, refer to the full research paper, “Subjective Behaviors and Preferences in LLM: Language of Browsing.”
Future Directions
The authors suggest that future work could involve applying HeTLM to larger and more diverse browsing datasets, exploring its effectiveness in real-world predictive business tasks, and comparing it with conventional models. The findings also open doors for further investigation into the optimal balance between model size and specialization for different types of subjective user data.


