Unlocking User Preferences: Simulating Watch Histories for Video Personalization

TLDR: A new research paper introduces HIPPO-VIDEO, a dataset created using an LLM-based simulator to generate realistic user watch histories for personalized video highlighting. The paper also proposes HiPHer, a method that leverages these histories to predict user-specific video highlights, outperforming traditional generic or query-based approaches by better capturing complex user preferences.

In today’s digital age, the sheer volume of video content available is overwhelming. From educational tutorials to entertainment, users are constantly searching for relevant information. However, what one person finds important in a video might be completely different from another’s preference. This highlights a critical need for personalized video highlighting, a task that aims to identify and present the most relevant segments of a video tailored to an individual user’s interests.

Traditional video summarization and highlight detection methods often fall short in this regard. They typically rely on generic approaches or simple text queries, which fail to capture the complex and evolving nature of human preferences. Imagine trying to summarize a long documentary for someone interested in historical facts versus someone focused on cinematic techniques – a one-size-fits-all approach simply doesn’t work.

To address this challenge, researchers Jeongeun Lee, Youngjae Yu, and Dongha Lee from Yonsei University have introduced a new dataset called HIPPO-VIDEO. It is designed specifically for personalized video highlighting and was built with a user simulator based on Large Language Models (LLMs). The simulator generates realistic ‘watch histories’ that reflect diverse user preferences, sidestepping the privacy concerns and resource costs of collecting real user data.

The HIPPO-VIDEO dataset is substantial, comprising 2,040 pairs of (watch history, saliency score), encompassing a total of 20,400 videos across 170 semantic categories. Each watch history consists of 10 videos, providing a rich context for understanding user interests. The LLM-based simulator mimics real user behavior by iteratively updating preferences as it ‘watches’ videos. This process involves initializing user profiles with specific topics and intents, retrieving video candidates (either related videos or new search queries), engaging with videos by selecting the most and least preferred ones, and then dynamically updating its long-term preferences based on these interactions.
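
The loop described above (initialize a profile, retrieve candidates, pick the most and least preferred video, update preferences) can be sketched roughly as follows. This is a minimal illustration, not the authors' simulator: the names `SimulatedUser`, `retrieve_candidates`, `pick_most_and_least`, and `simulate_watch_history` are hypothetical, and a simple tag-overlap heuristic stands in for the LLM calls the paper relies on.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedUser:
    topic: str                                    # initial topic of interest
    intent: str                                   # initial viewing intent
    likes: set = field(default_factory=set)       # long-term preferences (tags)
    dislikes: set = field(default_factory=set)    # tags the user has rejected
    history: list = field(default_factory=list)   # watched video ids, in order


def retrieve_candidates(user, catalog, k=3):
    """Stand-in for related-video lookup or a new search query (assumption)."""
    unwatched = [v for v in catalog if v["id"] not in user.history]
    return unwatched[:k]


def pick_most_and_least(user, candidates):
    """Stand-in for the LLM's choice: rank candidates by tag overlap (assumption)."""
    def appeal(video):
        tags = set(video["tags"])
        return len(tags & (user.likes | {user.topic})) - len(tags & user.dislikes)

    ranked = sorted(candidates, key=appeal, reverse=True)
    return ranked[0], ranked[-1]


def simulate_watch_history(user, catalog, length=10):
    """Build a watch history of `length` videos, updating preferences each step."""
    while len(user.history) < length:
        candidates = retrieve_candidates(user, catalog)
        if not candidates:
            break  # catalog exhausted before reaching the requested length
        most, least = pick_most_and_least(user, candidates)
        user.history.append(most["id"])
        user.likes |= set(most["tags"])
        if least is not most:
            # Tags seen only in the rejected video count against future picks.
            user.dislikes |= set(least["tags"]) - user.likes
    return user.history
```

In a real simulator each stand-in would be replaced by a prompt to the language model; the structure of the loop is the part this sketch is meant to convey.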

After simulating a watch history, the last video in the sequence becomes the ‘target video’ for saliency annotation. The simulator then assigns a relevance score from 1 to 10 to each segment of this target video. These scores are determined by integrating the simulator’s final long-term preferences and its personal reviews of the video, ensuring that the highlights are truly aligned with the inferred user interests.
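
To make the shape of one dataset entry concrete, here is a small sketch of a record that pairs a 10-video watch history with per-segment scores for the target video. The class name `HighlightSample` and its fields are illustrative assumptions, not the dataset's actual schema; only the constraints it checks (10 videos per history, last video as target, scores from 1 to 10) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HighlightSample:
    watch_history: tuple   # ordered video ids; the last one is the target video
    segment_scores: tuple  # one relevance score (1 to 10) per target-video segment

    def __post_init__(self):
        if len(self.watch_history) != 10:
            raise ValueError("a watch history is expected to contain 10 videos")
        if any(not 1 <= score <= 10 for score in self.segment_scores):
            raise ValueError("segment scores must lie between 1 and 10")

    @property
    def target_video(self):
        """The video whose segments carry the saliency annotation."""
        return self.watch_history[-1]

    def top_segments(self, k=3):
        """Indices of the k highest-scoring segments, earlier segments first on ties."""
        order = sorted(
            range(len(self.segment_scores)),
            key=lambda i: (-self.segment_scores[i], i),
        )
        return order[:k]
```

Selecting the top-scoring segments in this way is one straightforward route from saliency scores to a highlight reel for that simulated user.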

To validate the realism and reliability of HIPPO-VIDEO, the researchers conducted human verification studies. Human annotators assessed the plausibility of the simulator’s generated queries and video selections: 97.56% of the queries were deemed reasonable, and the simulator’s video choices matched human selections in over 71% of cases. In a further test, AI models such as GPT-4 were asked to tell simulated watch histories from real ones. They reached only 40% accuracy on this binary classification, below the 50% expected from random guessing, which suggests the simulated histories are hard to distinguish from real ones. These results support the dataset’s use as a proxy for real-world user behavior.

Alongside the dataset, the researchers also propose a method called HiPHer (History-Driven Preference-Aware Video Highlighter). HiPHer leverages these personalized watch histories to predict segment-wise saliency scores. By deriving a global preference embedding from the watch history and using cross-attention to guide segment representations, HiPHer significantly outperforms existing generic and query-based approaches in experiments. This demonstrates the power of incorporating detailed user histories for more effective and user-centric video highlighting in practical scenarios.
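
The general idea of history-guided scoring can be illustrated with a toy example. The sketch below is an assumption-heavy simplification rather than HiPHer itself: it uses mean pooling for the global preference embedding, single-head attention with no learned projections, a residual sum to form the guided segment representation, and a sigmoid over a dot product as the score. The function names are hypothetical.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors into one global embedding."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query, keys, values):
    """Scaled dot-product attention of one query over the history embeddings."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in keys])
    width = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]


def score_segments(segments, history):
    """Return one saliency score in (0, 1) per segment of the target video."""
    preference = mean_pool(history)  # global preference embedding (assumed pooling)
    scores = []
    for segment in segments:
        context = cross_attend(segment, history, history)
        guided = [s + c for s, c in zip(segment, context)]  # residual combination
        scores.append(1.0 / (1.0 + math.exp(-dot(guided, preference))))
    return scores
```

With segment and history embeddings in the same vector space, segments that point in the same direction as the user's history receive higher scores than segments that point away from it. A trained model would replace the fixed operations here with learned projections and a supervised scoring head.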

The findings from this research emphasize the critical role of history-driven preference modeling for personalized video experiences. By moving beyond simple queries or generic summaries, HIPPO-VIDEO and HiPHer pave the way for more intelligent and user-adaptive video content delivery systems. For more technical details, you can refer to the full research paper: HIPPO-VIDEO: Simulating Watch Histories with Large Language Models for Personalized Video Highlighting.
