Boosting eBay Sales: How AI Learns to Recommend Better Keyphrases for Advertisers

TLDR: eBay researchers developed LLMDistill4Ads, a new system that uses Large Language Models (LLMs) to improve keyphrase recommendations for sellers. By distilling knowledge from an LLM ‘teacher’ into a faster ‘student’ model, the system overcomes biases in traditional click data. This approach led to a significant increase in sales (51.26%) and return on advertising expenditure (38.69%) in real-world tests, demonstrating the power of LLMs in generating more relevant and effective advertising suggestions.

In the bustling world of e-commerce, helping sellers effectively advertise their products is key to success. At eBay, a crucial part of this involves recommending the right keyphrases for sellers to bid on, ensuring their items appear prominently in search results. However, this seemingly straightforward task is fraught with challenges, primarily due to biases inherent in traditional data sources like customer clicks.

The Problem with Click Data

Historically, models that recommend keyphrases have relied heavily on click and sales data. If a keyphrase leads to many clicks or sales for an item, it is considered relevant. While this seems logical, it presents a significant problem: click data is good at identifying what is relevant (positive signals) but poor at identifying what is irrelevant (negative signals). A lack of clicks doesn’t necessarily mean irrelevance; it could simply mean the item wasn’t shown prominently due to existing search rankings or other biases. These effects are known as ‘missing-not-at-random’ conditions and ‘sample selection bias’. Essentially, the data itself is biased because buyers only interact with what they see, and what they see is already filtered and ranked.

Adding to this complexity is the ‘middleman bias’ at eBay. When a seller proposes a keyphrase, it first goes through eBay Search’s relevance filter. Only keyphrases deemed relevant by Search proceed to an auction and potentially generate clicks. This means the training data for keyphrase recommendations never sees keyphrases that Search considers irrelevant, even if Advertising initially produced them. This further skews the data, making it difficult to train models that can accurately identify truly irrelevant keyphrases.
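The filtering effect described above can be illustrated with a toy sketch. Everything in it is hypothetical: the candidate pairs, the `passes_search` gate, and the click outcomes are invented for illustration and do not reflect eBay's actual systems or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    item: str
    keyphrase: str
    truly_relevant: bool  # ground truth, never visible in the click log
    passes_search: bool   # hypothetical stand-in for the Search relevance filter
    clicked: bool         # only meaningful if the pair reached an auction


def build_click_log(candidates):
    """Keep only pairs that passed the gate and label them by click."""
    return [
        (c.item, c.keyphrase, int(c.clicked))
        for c in candidates
        if c.passes_search
    ]


candidates = [
    Candidate("iphone 13 case", "iphone 13 case", True, True, True),
    # Relevant, shown, but not clicked: looks like a negative in the log.
    Candidate("iphone 13 case", "phone cover", True, True, False),
    # Truly irrelevant, but filtered out before it could be logged at all.
    Candidate("iphone 13 case", "samsung charger", False, False, False),
]

click_log = build_click_log(candidates)
for row in click_log:
    print(row)
```

In this toy log, the only "negative" example is a relevant keyphrase that simply went unclicked, while the genuinely irrelevant pair never appears. A model trained on such a log has no reliable examples of irrelevance to learn from.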

Introducing LLMDistill4Ads: A Smarter Approach

To overcome these data challenges, eBay researchers developed a novel system called LLMDistill4Ads. The system uses a two-step distillation process that leverages Large Language Models (LLMs) to mimic human judgment and ‘debias’ keyphrase recommendation. The core idea is to distill knowledge from a powerful LLM ‘teacher’ into an intermediate ‘assistant’ model, and then from the assistant into a more efficient ‘student’ model.

How It Works: A Three-Part Harmony

The LLMDistill4Ads system works by combining different data sources and model architectures:

1. Diverse Data Sources: Beyond traditional click data, the system incorporates two crucial new signals:

  • Search Relevance (SR) Labels: These are scores from eBay’s internal Search Relevance model, which are less prone to the biases found in click data.
  • LLM Labels: A powerful LLM, Mixtral 8x7B, acts as a ‘judge’ to determine whether an item and keyphrase pair is relevant. The LLM was prompted to give simple ‘yes’ or ‘no’ answers, effectively acting as a proxy for human judgment and providing a less biased view of relevance.

2. The Cross-Encoder Assistant: This model acts as an intermediary. It’s a sophisticated transformer model that can jointly analyze both the item’s title/category and the keyphrase. It was fine-tuned on the vast amount of LLM-generated labels. Its role is to learn the intricate relationships between items and keyphrases as judged by the LLM, and then provide ‘soft scores’ (like a confidence level) for these relationships.

3. The Bi-Encoder Student: This is the final model responsible for generating keyphrase recommendations. Unlike the cross-encoder, the bi-encoder processes items and keyphrases independently, making it much faster for real-time recommendations. The magic happens in its training: it learns from a combination of the traditional click data, the Search Relevance labels, and crucially, the ‘soft scores’ provided by the cross-encoder assistant. This multi-task training approach, especially using a technique called Pearson correlation loss, helps the bi-encoder mimic the nuanced judgments of the LLM teacher, even though it’s a simpler, faster model.
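The distillation step can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the loss takes the common form 1 − r, where r is the Pearson correlation between the student's cosine similarities and the assistant's soft scores. The embeddings and teacher scores are made-up toy values, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        raise ValueError("Pearson correlation is undefined for constant input")
    return cov / (std_x * std_y)


def pearson_loss(student_scores, teacher_scores):
    """Assumed form of the loss: 0 when perfectly correlated, 2 when inverted."""
    return 1.0 - pearson(student_scores, teacher_scores)


# Toy bi-encoder outputs: one item embedding, three keyphrase embeddings.
item_vec = [1.0, 0.0]
keyphrase_vecs = [[1.0, 0.1], [0.5, 0.5], [0.0, 1.0]]

# Toy soft scores, standing in for the cross-encoder assistant's output.
teacher_scores = [0.95, 0.60, 0.05]

student_scores = [cosine(item_vec, k) for k in keyphrase_vecs]
loss = pearson_loss(student_scores, teacher_scores)
print(f"distillation loss: {loss:.4f}")
```

A correlation-based loss only asks the student to score keyphrases in a way that rises and falls linearly with the teacher's scores; it does not require the cosine similarities to match the soft scores in absolute value.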

The researchers also employed a clever technique called Matryoshka embeddings, which allows the bi-encoder to generate representations of varying sizes. This significantly speeds up the process of finding the closest keyphrases for an item without sacrificing accuracy.
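As a rough sketch of how variable-size representations are used at retrieval time, the example below truncates embeddings to their leading dimensions, re-normalizes them, and ranks keyphrases by cosine similarity. The vectors are hand-made toy values and the helper names are hypothetical; a real Matryoshka-trained model is what makes the leading dimensions informative on their own.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k(query, corpus, dim, k):
    """Rank corpus entries by cosine similarity using only `dim` dimensions."""
    q = truncate_and_normalize(query, dim)
    scored = [
        (dot(q, truncate_and_normalize(vec, dim)), name)
        for name, vec in corpus.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy 4-dimensional embeddings (illustrative values only).
item_embedding = [0.9, 0.1, 0.3, 0.1]
keyphrase_embeddings = {
    "phone case": [0.8, 0.2, 0.4, 0.0],
    "laptop bag": [0.1, 0.9, 0.2, 0.3],
    "phone cover": [0.7, 0.3, 0.1, 0.5],
}

small = top_k(item_embedding, keyphrase_embeddings, dim=2, k=2)
full = top_k(item_embedding, keyphrase_embeddings, dim=4, k=2)
print(small, full)
```

Shorter vectors mean fewer multiplications per comparison and a smaller index, which is where the speed-up in nearest-neighbor search comes from.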

Real-World Impact and Future Outlook

The LLMDistill4Ads system was put to the test in a real-world A/B experiment on eBay’s platform. While the new model didn’t show a statistically significant increase in clicks, it delivered a 51.26% increase in Gross Merchandise Volume Bought (sales) and a 38.69% improvement in Return on Ad Spend (ROAS). This indicates that the new system recommends higher-quality, more relevant keyphrases that lead to actual purchases, making advertising campaigns more effective for sellers.

This research highlights that relying solely on click data for training recommendation models in e-commerce is insufficient. By incorporating LLM-generated signals and using a sophisticated knowledge distillation process, even smaller, more efficient models can achieve superior performance and drive significant business outcomes. The full research paper can be found here: LLMDistill4Ads: Using Cross-Encoders to Distill from LLM Signals for Advertiser Keyphrase Recommendations at eBay.

Nikhil Patel
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
