TLDR: A new research paper introduces HeTLM, a heterogeneity-aware training approach that helps language models capture subjective user browsing behaviors. It finds that smaller LLMs, when pre-trained on browsing data with a page-level tokenizer, can outperform larger, general-purpose LLMs. HeTLM clusters users with similar browsing patterns and trains a dedicated, smaller model for each cluster, leading to improved average performance, reduced variance across users, and better prediction of outcomes like purchases.
In the rapidly evolving world of artificial intelligence, Large Language Models (LLMs) are often seen as versatile tools capable of understanding and adapting to a wide array of user behaviors and preferences. However, a recent research paper from Adobe Research challenges this perception, especially when it comes to the highly subjective and unique ways users interact with websites and applications. The paper, titled “Subjective Behaviors and Preferences in LLM: Language of Browsing,” introduces a novel approach called HeTLM (Heterogeneity aware Training of Language Model) to better align AI models with individual user browsing patterns.
The core idea behind this research, conducted by Sai Sundaresan, Harshita Chopra, Atanu R. Sinha, Koustava Goswami, Nagasai Saketh Naidu, Raghav Karan, and Anushka, revolves around the concept of the “language of browsing.” This refers to the unique sequence of pages a user visits, forming a kind of personal, unstructured language. Unlike natural language, which has clear grammar and rules, browsing behavior is idiosyncratic and varies greatly from person to person. The researchers question whether a single, large LLM can truly capture these diverse and subjective behaviors effectively.
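To make this concrete, here is a toy sketch (with hypothetical page names) of how a browsing session can be read as a “sentence” in this language:

```python
# A toy illustration of the "language of browsing": each session is a
# "sentence" whose words are the pages a user visits, in order.
# Page names below are hypothetical.
sessions = [
    ["home", "search", "product_A", "product_A_reviews", "add_to_cart"],
    ["home", "deals", "product_B", "home", "product_C"],
]

# Unlike natural language, there is no shared grammar: two users can
# produce very different "sentences" over the same vocabulary of pages.
for session in sessions:
    print(" -> ".join(session))
```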
The Challenge of User Heterogeneity
Traditional LLMs, even with extensive pre-training or fine-tuning, often aim for high average performance across all users. However, this can mask significant variations in performance at the individual user level. If a model performs well on average but poorly for many specific users, its ability to truly align with user preferences is weak. This is particularly critical for online businesses that rely on understanding and predicting user actions, such as adding items to a cart or making a purchase.
Introducing HeTLM: A Tailored Approach
To address this challenge, the researchers developed HeTLM. This innovative framework is designed to recognize and adapt to the inherent heterogeneity in user browsing behaviors. Instead of training one large model for everyone, HeTLM clusters users based on their browsing patterns and then trains smaller, specialized language models for each identified cluster. This allows each mini-LLM to become an expert in the browsing “language” of its specific user group.
The HeTLM architecture integrates embedding-based clustering and fine-tuning in an endogenous manner: the clustering informs the fine-tuning, and the fine-tuning in turn guides the clustering. It uses an Actor-Critic framework in which an “Encoder” turns user sessions into numerical representations, a “Selector” (the Actor) assigns users to clusters, and multiple “Predictors” (the Critics), small LLMs, are each fine-tuned on one cluster’s data. This iterative process refines both the cluster assignments and the specialized models, ensuring better alignment with user-specific patterns.
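A highly simplified sketch of this alternating loop is shown below; it is an illustration of the idea rather than the paper’s implementation, and the helpers `score_fit` (how well a cluster’s LM explains a user’s sessions) and `fine_tune` (an update of that LM on the cluster’s data) are hypothetical placeholders.

```python
# Illustrative sketch of heterogeneity-aware training (not the paper's code):
# alternate between routing users to the cluster whose small LM fits them best
# and fine-tuning each cluster's LM on its assigned users' sessions.
def train_hetlm(users, cluster_lms, score_fit, fine_tune, rounds=5):
    """users: list of (user_id, sessions); cluster_lms: one small LM per cluster."""
    # Arbitrary initial assignment of users to clusters.
    assignments = {uid: i % len(cluster_lms) for i, (uid, _) in enumerate(users)}

    for _ in range(rounds):
        # Selector (Actor) step: reassign each user to the best-fitting cluster.
        for uid, sessions in users:
            scores = [score_fit(lm, sessions) for lm in cluster_lms]
            assignments[uid] = max(range(len(cluster_lms)), key=scores.__getitem__)

        # Predictor (Critic) step: fine-tune each cluster's LM on its users' data.
        for k, lm in enumerate(cluster_lms):
            cluster_data = [s for uid, sessions in users
                            if assignments[uid] == k for s in sessions]
            fine_tune(lm, cluster_data)

    return cluster_lms, assignments
```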
Key Findings and Advantages
The research yielded several remarkable findings:
- Small LMs Outperform Large LMs: Surprisingly, small language models (such as OPT-350M, QWEN-2.5-500M, and SmolLM2-360M), when pre-trained with a custom page-level tokenizer on browsing data, significantly outperformed much larger, general-purpose LLMs (such as GPT-4o, Llama-3-8B, Mistral-7B, and Gemma-7B) fine-tuned on the same data. This suggests that specialized training and tokenization matter more than sheer model size for this task; a minimal sketch of a page-level tokenizer appears after this list.
- Improved Alignment with HeTLM: HeTLM, with its cluster-specific models, demonstrated superior average performance in page generation and outcome prediction compared to a single, larger LLM from the same family. Crucially, it also achieved a lower variance in performance across users, indicating better alignment with individual user behaviors.
- Better Outcome Prediction: The cluster-wise trained HeTLM models also performed exceptionally well in predicting important business outcomes like “add to cart” and “purchase,” which are vital for online firms.
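As noted above, the small models were pre-trained with a page-level tokenizer. Below is a minimal sketch of what such a tokenizer could look like, assuming each distinct page maps to exactly one token (the class and page names are hypothetical, not the paper’s implementation):

```python
# Minimal sketch of a page-level tokenizer: every whole page becomes one token,
# instead of a URL or page name being split into subword pieces.
class PageLevelTokenizer:
    def __init__(self, pages):
        # Reserve 0 for padding and 1 for unseen pages.
        self.page_to_id = {"<pad>": 0, "<unk>": 1}
        for page in pages:
            self.page_to_id.setdefault(page, len(self.page_to_id))
        self.id_to_page = {i: p for p, i in self.page_to_id.items()}

    def encode(self, session):
        return [self.page_to_id.get(page, 1) for page in session]

    def decode(self, token_ids):
        return [self.id_to_page.get(i, "<unk>") for i in token_ids]

# Hypothetical usage: the vocabulary is the set of pages on the site.
tok = PageLevelTokenizer(["home", "search", "product_A", "add_to_cart", "checkout"])
print(tok.encode(["home", "product_A", "add_to_cart", "order_confirmation"]))  # unseen page -> <unk>
```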
The paper highlights that the “language of browsing” is a unique data type that requires a tailored approach. By acknowledging and addressing the heterogeneity of user behaviors, HeTLM offers a path to more accurate and personalized predictions. This has significant implications for various business applications, including predictive targeting (forecasting product interest), predictive journey mapping (understanding future page sequences), user segmentation, and personalized recommendations.
While the research acknowledges limitations such as the use of public datasets and the need for further scalability analysis, it firmly establishes the value of heterogeneity-aware training for LLMs in understanding subjective user behaviors. For more technical details, refer to the full research paper, “Subjective Behaviors and Preferences in LLM: Language of Browsing.”
Future Directions
The authors suggest that future work could involve applying HeTLM to larger and more diverse browsing datasets, exploring its effectiveness in real-world predictive business tasks, and comparing it with conventional models. The findings also open doors for further investigation into the optimal balance between model size and specialization for different types of subjective user data.


