Wayfair Uses AI to Summarize Customer Reviews, Boosting Shopper Engagement

TLDR: Wayfair developed a scalable LLM-based system that combines aspect-based sentiment analysis with guided summarization to create concise, interpretable product review summaries. The system extracts aspect-sentiment pairs, consolidates them, selects representative reviews, and then uses an LLM to generate summaries. An A/B test showed significant improvements in Add to Cart Rate, Conversion Rate, and bounce rate. Wayfair has also open-sourced a large dataset of reviews and summaries to support further research.

Online shopping platforms are awash with customer reviews, offering valuable insights but often overwhelming shoppers with their sheer volume and repetitive nature. Imagine trying to decide on a new sofa and having to sift through thousands of reviews to find out if it’s comfortable, stylish, or durable. This challenge is precisely what researchers at Wayfair set out to tackle with their new system for generating concise and interpretable product review summaries.

The team developed a scalable system that leverages large language models (LLMs) to combine aspect-based sentiment analysis (ABSA) with guided summarization. In simpler terms, this means the system can pinpoint specific features of a product (like “comfort” or “delivery”) mentioned in reviews, understand the sentiment expressed towards them (positive, negative, or mixed), and then use this information to create a summary that directly addresses what customers care about most.

How the System Works: A Step-by-Step Approach

The Wayfair system operates as a four-stage pipeline:

1. Aspect Extraction: First, the system processes individual customer reviews to identify up to five key product aspects and their associated sentiment. For example, from a review like “Love the modern look of this sofa, but it arrived late,” it might extract “Style: Positive” and “Delivery: Negative.”

2. Aspect Consolidation: Natural language can be messy, with many ways to describe the same thing. This stage takes fine-grained or specific aspect terms (like “shipping speed” or “value for money”) and maps them to broader, more canonical forms (like “delivery” or “price”). This ensures consistency and clarity across all summaries.

3. Aspect-Based Review Selection: For each product, the system identifies the five most frequently mentioned aspects. It then samples representative reviews that discuss these aspects with their corresponding sentiments. To keep the input to the summarizer manageable, at most 200 reviews are used per product, balancing comprehensive coverage against efficiency.

4. Aspect-Guided Summarization: Finally, using the consolidated aspects and selected reviews, an LLM (specifically Gemini 1.5 Flash in this case) is guided to generate a concise, coherent product-level summary. These summaries are designed to be around 300–500 characters long, focusing on the most frequent product aspects and accurately reflecting customer feedback, thereby mitigating issues like hallucination or factual inconsistency often seen in generic LLM summaries.
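To make stages 2 through 4 more concrete, here is a minimal Python sketch. It is not Wayfair's implementation: the CANONICAL mapping, the function names, the review dictionary layout, and the uniform random sampling are all assumptions made for illustration. The sketch stops at building a prompt and does not call any LLM API.

```python
import random
from collections import Counter

# Hypothetical mapping from fine-grained terms to canonical aspects.
# The article does not describe how the real consolidation is done.
CANONICAL = {
    "shipping speed": "delivery",
    "value for money": "price",
    "modern look": "style",
}


def consolidate(aspect):
    """Stage 2: map a raw aspect term to its canonical form."""
    key = aspect.strip().lower()
    return CANONICAL.get(key, key)


def canonical_aspects(review):
    """Return the set of canonical aspects mentioned in one review."""
    return {consolidate(aspect) for aspect, _sentiment in review["aspects"]}


def top_aspects(reviews, k=5):
    """Stage 3a: the k aspects mentioned in the most reviews."""
    counts = Counter()
    for review in reviews:
        # Sorting keeps the counting order deterministic across runs.
        counts.update(sorted(canonical_aspects(review)))
    return [aspect for aspect, _count in counts.most_common(k)]


def select_reviews(reviews, aspects, max_reviews=200, seed=0):
    """Stage 3b: keep reviews that mention a top aspect, capped at max_reviews.

    Uniform random sampling is an assumption; the article only says the
    sampled reviews are "representative".
    """
    wanted = set(aspects)
    candidates = [r for r in reviews if wanted & canonical_aspects(r)]
    if len(candidates) > max_reviews:
        candidates = random.Random(seed).sample(candidates, max_reviews)
    return candidates


def build_prompt(aspects, selected):
    """Stage 4: assemble an aspect-guided prompt (wording is illustrative)."""
    lines = [
        "Summarize the customer reviews below in 300-500 characters.",
        "Focus on these aspects: " + ", ".join(aspects) + ".",
        "Only state what the reviews support.",
        "",
    ]
    lines += [f"- {review['text']}" for review in selected]
    return "\n".join(lines)


# Made-up sample data in the shape stage 1 is assumed to produce.
REVIEWS = [
    {"text": "Love the modern look of this sofa, but it arrived late.",
     "aspects": [("modern look", "negative" if False else "positive"),
                 ("shipping speed", "negative")]},
    {"text": "Great value for money and fast delivery.",
     "aspects": [("value for money", "positive"), ("delivery", "positive")]},
    {"text": "Looks stylish in our living room.",
     "aspects": [("style", "positive")]},
]

aspects = top_aspects(REVIEWS, k=2)
prompt = build_prompt(aspects, select_reviews(REVIEWS, aspects))
```

Passing the top aspects into the prompt explicitly is what makes the summarization "guided": the model is told what to cover instead of being left to choose.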

Real-World Impact and Evaluation

The effectiveness of this system wasn’t just theoretical. Wayfair conducted both offline and online evaluations. Offline, human experts manually reviewed generated summaries for factual consistency and alignment with extracted aspects. The results were impressive, with 84% of summaries having no errors, and only minor issues in another 11%.

The true test came with a large-scale online A/B test conducted over three weeks in March 2025. On the Wayfair e-commerce platform, a control group saw standard product pages, while a treatment group saw pages with the new aspect-guided summaries and clickable aspects for filtering reviews. The results were statistically significant and positive:

The “Add to Cart Rate” (ATCR) increased by 0.3%.
The “Conversion Rate” (CVR) increased by 0.5%.
The customer-level bounce rate decreased by 0.13%, indicating better user engagement.

No negative impacts were observed on revenue or page speed, confirming the system’s practical value.

Deployment and Data Sharing

The system is deployed in a real-time production environment, automatically generating and updating summaries. For new products, a summary is created once 10 reviews are accumulated. For existing products, summaries are refreshed when new reviews reach 10% of the existing count, ensuring they remain current with evolving customer feedback.
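The update rule described above can be sketched as a small predicate. This is an illustration under stated assumptions: the function and parameter names are hypothetical, and "existing count" is read here as the number of reviews the product had when its summary was last generated.

```python
def needs_summary_update(reviews_at_last_summary, current_reviews,
                         min_reviews=10, refresh_percent=10):
    """Return True if a summary should be created or refreshed.

    reviews_at_last_summary: review count when the summary was last
        generated, or 0 if the product has no summary yet (assumed convention).
    current_reviews: review count now.
    """
    if reviews_at_last_summary == 0:
        # New product: wait until enough reviews have accumulated.
        return current_reviews >= min_reviews
    new_reviews = current_reviews - reviews_at_last_summary
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return new_reviews * 100 >= refresh_percent * reviews_at_last_summary
```

Under this reading, a product summarized at 200 reviews would be refreshed once it reaches 220.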

To further support research in this area, Wayfair has open-sourced a dataset of 11.8 million anonymized customer reviews covering 92,000 products. This dataset includes extracted aspects and generated summaries, providing a valuable resource for future development in aspect-guided review summarization. You can find more details about this research in the full paper: End-to-End Aspect-Guided Review Summarization at Scale.

This innovative approach by Wayfair demonstrates how LLMs can be effectively harnessed to enhance customer experience on e-commerce platforms, making product information more accessible and trustworthy.
