
A New Era for Recommender Systems: Meta’s Foundation-Expert Paradigm

TLDR: The paper introduces the Foundation-Expert Paradigm and HyperCast infrastructure for deploying hyperscale recommender systems. A large Foundation Model learns general user knowledge and generates “target-aware embeddings,” which are then used by smaller, specialized “Expert” models for specific recommendation tasks. This decoupled approach improves performance, efficiency, and developer agility, and is deployed at Meta serving billions of daily requests.

The world of online content and e-commerce relies heavily on recommender systems – the algorithms that suggest what you might like to watch, buy, or read next. While the concept of “scaling laws” promises significant improvements by making these models larger, actually deploying these massive, or “hyperscale,” models in real-world production environments has been a major hurdle.

Unlike fields such as natural language processing or computer vision, where large Foundation Models (FMs) are already common, recommender systems face unique challenges. These include the need to learn continuously from ever-changing online data, adapt to many different recommendation platforms (each with its own specific goals and data), and meet strict demands for speed and computational efficiency.

Introducing the Foundation-Expert Paradigm

To overcome these obstacles, researchers at Meta Platforms, Inc. have proposed and successfully deployed a novel framework called the Foundation-Expert Paradigm. This approach fundamentally changes how hyperscale recommendation FMs are developed and deployed.

At its core, the paradigm involves two main components:

  • A Central Foundation Model (FM): This is the “brain” of the system. It’s a large, powerful model trained on a vast amount of user data collected over time, across different platforms, and from various types of content (like text, images, and videos). Its goal is to learn broad, general knowledge about user behavior and preferences. Crucially, this FM generates what are called “target-aware embeddings” for each potential item to be recommended. Unlike older methods that might create a general profile of a user, these embeddings dynamically capture a user’s specific interest in a particular item, given their interaction history.

  • Lightweight “Expert” Models: These are smaller, specialized models designed for specific recommendation tasks or platforms (e.g., a video recommendation expert, a shopping expert). They receive the target-aware embeddings from the FM as input. Because the heavy lifting of learning general knowledge is handled by the FM, these expert models can be much smaller and focus solely on fine-tuning for their specific local data distributions and optimization goals. This significantly reduces their computational needs and allows for much faster development and iteration.

This decoupled approach means that the complex, resource-intensive FM can be trained and improved independently, while the more agile expert models can be rapidly updated to adapt to new trends or specific platform requirements.
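To make the idea of a "target-aware embedding" concrete, here is a minimal illustrative sketch (not Meta's actual implementation): instead of compressing a user into one static profile vector, the representation is computed fresh for each candidate item by attending over the user's history, conditioned on that item. All names and dimensions below are hypothetical.

```python
import numpy as np

def target_aware_embedding(history: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Summarize a user's interaction history with attention conditioned on
    the target item. `history` is (num_events, dim); `target` is (dim,)."""
    # Score each past interaction by its relevance to this specific target.
    scores = history @ target / np.sqrt(target.shape[0])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()          # softmax over history events
    return weights @ history          # (dim,) embedding specific to this target

rng = np.random.default_rng(0)
history = rng.normal(size=(50, 16))               # 50 past interactions, 16-dim
item_a, item_b = rng.normal(size=16), rng.normal(size=16)

# A static profile would average the history once; the target-aware version
# yields a different embedding for each candidate item.
emb_a = target_aware_embedding(history, item_a)
emb_b = target_aware_embedding(history, item_b)
```

The key property is that `emb_a` and `emb_b` differ even though the user's history is the same, which is what lets a downstream expert read the user's interest in a *particular* item directly from the feature.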

HyperCast: The Enabling Infrastructure

To make this innovative paradigm a reality, Meta built HyperCast, a production-grade infrastructure system. HyperCast re-engineers the entire process of training, serving, logging, and iterating on these models. Key features of HyperCast include:

  • Decoupled Training: The FM and expert models can be trained completely independently, as the FM’s knowledge is “materialized” (saved) as features for the expert models. This eliminates direct dependencies and speeds up development.

  • High Freshness: Both the FM and expert models are trained continuously on online streaming data, allowing for model updates within minutes and data-to-trainer latency of about 30 minutes. This ensures recommendations are always based on the latest user interactions.

  • Multi-tier Inference: HyperCast handles different serving needs (online FM serving for real-time embeddings, offline FM logging for training data, and online expert serving) with specialized, optimized tiers, maximizing hardware efficiency.

  • Agile Development: The decoupled architecture allows developers to rapidly experiment with expert models without needing to retrain the massive FM, significantly boosting development speed.
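The decoupling described above hinges on "materialization": the FM's outputs are logged into a feature store keyed by user and item, and experts train and serve against those logged features without ever calling the FM directly. The following is a hypothetical sketch of that contract; the names (`FeatureStore`, `log_fm_embedding`, `expert_score`) are illustrative, not HyperCast APIs.

```python
from typing import Dict, List, Tuple

# A toy feature store: (user_id, item_id) -> materialized FM embedding.
FeatureStore = Dict[Tuple[str, str], List[float]]

def log_fm_embedding(store: FeatureStore, user: str, item: str,
                     embedding: List[float]) -> None:
    """Offline FM logging tier: persist the target-aware embedding as a
    feature that expert models can consume later."""
    store[(user, item)] = embedding

def expert_score(store: FeatureStore, user: str, item: str,
                 weights: List[float]) -> float:
    """A lightweight expert: a task-specific linear head over the
    materialized FM embedding. Iterating on it never retrains the FM."""
    emb = store[(user, item)]
    return sum(w * x for w, x in zip(weights, emb))

store: FeatureStore = {}
log_fm_embedding(store, "u1", "video42", [0.2, -0.5, 0.7])
score = expert_score(store, "u1", "video42", [1.0, 0.5, 2.0])
```

Because the expert only depends on the stored features, a developer can swap in a new expert head and retrain it against the same logged embeddings, which is the source of the agility the article describes.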


Real-World Impact and Results

The Foundation-Expert paradigm, powered by HyperCast, has been deployed across several core recommendation surfaces at Meta, serving tens of billions of user requests daily. The results have been highly positive:

  • Significant Performance Gains: Online A/B tests showed statistically significant improvements in user experience metrics, including engagement and consumption, compared to Meta’s previous one-stage production system.

  • High Knowledge Transfer: The target-aware embeddings proved highly effective, ensuring that performance improvements in the FM were efficiently transferred to the expert models, outperforming traditional user embeddings.

  • Generalization: The FM demonstrated a powerful ability to generalize, providing benefits to expert tasks even when it wasn’t explicitly trained on those specific objectives.

  • Efficiency: The system maintained infrastructure efficiency, with end-to-end serving latency and CPU performance remaining neutral despite the increased complexity.

This work represents a significant milestone, being the first successful deployment of a Foundation-Expert paradigm at such a massive scale. It offers a proven, compute-efficient, and developer-friendly blueprint for realizing the full potential of scaling laws in industrial recommender systems. For more technical details, you can read the full research paper here.

Dev Sundaram
Dev Sundaram is an investigative tech journalist with a nose for exclusives and leaks. With stints in cybersecurity and enterprise AI reporting, Dev thrives on breaking big stories—product launches, funding rounds, regulatory shifts—and giving them context. He believes journalism should push the AI industry toward transparency and accountability, especially as Generative AI becomes mainstream. You can reach out to him at: [email protected]
