TLDR: LookSync is a large-scale visual product search system that connects AI-generated fashion looks with real-world products. It uses a multi-stage pipeline involving AI models like CLIP and LLMs to generate queries, vectorize images, retrieve candidates from a 12 million product catalog, and rerank them for visual and semantic similarity. The system serves over 350,000 AI looks daily, ensuring users can shop for items that closely match their AI-created styles, even if exact products don’t exist.
In the rapidly evolving world of fashion, where generative AI creates stunning virtual looks and avatars, a new challenge has emerged: finding real-world products that closely match these AI-generated styles. To address this, a system called LookSync has been developed and deployed at internet scale, designed to bridge the gap between AI creativity and tangible retail inventory.
LookSync is an end-to-end product search system that ensures AI-generated fashion looks presented to users are matched with the most visually and semantically similar products from a vast indexed catalog. This system is already serving over 350,000 AI Looks daily, covering diverse product categories across global markets, with access to more than 12 million products.
How LookSync Works: A Multi-Stage Pipeline
The core of LookSync’s innovation lies in its sophisticated four-component search pipeline:
Query Generation: When an AI-generated look is provided, reference images are extracted and fed into a large language model (LLM). This LLM generates detailed search queries that describe the products being worn in the AI look. For instance, it can describe an ‘outermost_topwear’ as a “Men’s charcoal grey polo shirt, solid, button-down collar, long sleeves, straight hem, casualwear.”
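The query-generation step can be sketched as a prompt plus a structured response. The prompt wording, slot names (beyond the article's `outermost_topwear` example), and JSON schema below are hypothetical illustrations, not the paper's actual prompt:

```python
# Hypothetical sketch of the query-generation step. The slot list and
# prompt text are assumptions; only 'outermost_topwear' and the example
# query come from the article.
import json

SLOTS = ["outermost_topwear", "bottomwear", "footwear", "accessories"]

PROMPT_TEMPLATE = (
    "You are a fashion search assistant. For the attached look image, "
    "write one retail search query per garment slot.\n"
    "Slots: {slots}\n"
    "Return JSON mapping each slot to a query covering color, pattern, "
    "collar/neckline, sleeve length, hem, and occasion."
)

def build_query_prompt(slots=SLOTS):
    """Build the text prompt sent to the LLM alongside reference images."""
    return PROMPT_TEMPLATE.format(slots=", ".join(slots))

# A response in the shape the article's example implies:
example_response = {
    "outermost_topwear": "Men's charcoal grey polo shirt, solid, "
                         "button-down collar, long sleeves, straight hem, "
                         "casualwear"
}

print(build_query_prompt())
print(json.dumps(example_response, indent=2))
```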
Vectorization: Each generated query is then processed by the CLIP (Contrastive Language–Image Pre-Training) model. Specifically, the ViT-H/14 variant, trained on the LAION-2B dataset, is used. CLIP is a multimodal model that aligns text and images in the same vector space, allowing direct comparison between query embeddings and product image embeddings, which are also generated by CLIP during ingestion.
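The shared vector space is what makes text-to-image comparison a single similarity computation. The toy 3-d vectors below stand in for real CLIP ViT-H/14 embeddings (which are high-dimensional and produced by the model itself); only the cosine-similarity mechanics are faithful:

```python
# Why a shared text-image space enables direct comparison: a text query
# embedding is scored against product image embeddings by cosine
# similarity. The vectors are illustrative stand-ins, not CLIP outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

query_vec = [0.9, 0.1, 0.0]            # "CLIP text embedding" of the query
product_vecs = {                        # "CLIP image embeddings" at ingestion
    "grey_polo": [0.85, 0.15, 0.05],
    "red_dress": [0.05, 0.90, 0.40],
}

scores = {pid: cosine(query_vec, v) for pid, v in product_vecs.items()}
best = max(scores, key=scores.get)
print(best)  # the polo scores higher than the dress
```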
Candidate Retrieval: Using these embeddings, the system queries a vector database to find the closest matching products. After initial retrieval, highly similar products are deduplicated, and hard filters (like brand, size, or price) can be applied based on user preferences.
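The retrieval step can be sketched as nearest-neighbour search followed by deduplication and hard filtering. In production this runs against a vector database over roughly 12 million products; the in-memory catalog, threshold values, and filter fields below are illustrative assumptions:

```python
# Sketch of candidate retrieval over a tiny in-memory "catalog":
# brute-force nearest-neighbour scoring, then near-duplicate removal,
# then hard filters (price shown; brand/size would work the same way).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

catalog = [
    {"id": "p1", "brand": "Acme", "price": 29.0, "vec": [0.90, 0.10]},
    {"id": "p2", "brand": "Acme", "price": 31.0, "vec": [0.89, 0.11]},  # near-dup of p1
    {"id": "p3", "brand": "Zen",  "price": 80.0, "vec": [0.20, 0.95]},
]

def retrieve(query_vec, k=10, max_price=None, dedup_threshold=0.999):
    # 1. Score and sort every product (a vector DB does this approximately).
    ranked = sorted(catalog, key=lambda p: cosine(query_vec, p["vec"]),
                    reverse=True)
    # 2. Deduplicate: skip items too similar to one already kept.
    kept = []
    for p in ranked:
        if any(cosine(p["vec"], q["vec"]) > dedup_threshold for q in kept):
            continue
        kept.append(p)
    # 3. Apply hard filters from user preferences.
    if max_price is not None:
        kept = [p for p in kept if p["price"] <= max_price]
    return kept[:k]

results = retrieve([1.0, 0.1], max_price=50.0)
print([p["id"] for p in results])  # p2 deduped away, p3 filtered by price
```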
Reranking: The top candidate products for each group are then passed to an LLM for reranking. This step refines the search results so that the products most visually and semantically similar to the AI look appear at the top. If the LLM-based reranker encounters issues, a fallback mechanism uses segmentation models (Meta's SAM 2 and Microsoft's Florence) to segment individual products from the AI look, embeds each segment with CLIP, and reranks candidates by cosine similarity.
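The fallback path reduces to sorting candidates by similarity to the segmented garment crop. The embeddings below are toy stand-ins for the CLIP outputs that segmentation models would feed into this step:

```python
# Sketch of the fallback reranker: the garment crop (segmented from the
# look by SAM 2 / Florence in the paper) is embedded with CLIP, and
# candidates are reordered by cosine similarity to that crop embedding.
# All vectors here are illustrative, not real model outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def rerank_by_crop(crop_embedding, candidates):
    """Reorder candidate products by similarity to the segmented garment."""
    return sorted(candidates,
                  key=lambda c: cosine(crop_embedding, c["vec"]),
                  reverse=True)

crop_vec = [0.7, 0.7]              # "CLIP embedding" of the segmented crop
candidates = [
    {"id": "a", "vec": [0.10, 0.99]},
    {"id": "b", "vec": [0.72, 0.69]},
]
order = [c["id"] for c in rerank_by_crop(crop_vec, candidates)]
print(order)  # "b" aligns with the crop and moves to the top
```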
Scale and Performance
The LookSync system operates at massive scale, indexing approximately 12 million products from various retailers across geographies including India, the USA, and Japan. These products span a wide range of categories, from topwear and bottomwear to accessories and footwear. The system continuously ingests new products and updates existing ones in real time, maintaining an average end-to-end latency of under 1 second for online search requests.
Extensive experiments were conducted to refine product search accuracy, testing various embedding models such as CLIP, FashionCLIP, Fashion-SigLIP, and DINOv2. While models like DINOv2 and Fashion-SigLIP showed strengths in fine-grained aspects like color and pattern detection, CLIP consistently emerged as the most reliable performer, balancing visual and semantic matching effectively. Human judges evaluated recommendation quality using mean opinion scores (MOS), considering factors like color match, fit, sleeve type, fabric type, pattern, and overall look, with CLIP consistently outperforming the other models.
Glance AI Integration
The Product Search System is seamlessly integrated with the Glance AI App. Users can create avatars from their selfies and ‘try on’ AI-generated looks. For each AI-generated look, a shop icon is displayed, allowing users to view visually and semantically similar products from the catalog. Clicking on a product redirects users to a product display page with details like price, stock, and size charts, and a ‘buy now’ option leads to affiliate pages for purchase.
LookSync represents a significant advancement in e-commerce, moving beyond traditional metadata-driven search to handle complex scenarios where exact catalog matches for AI-generated outfits may not exist. By leveraging deep visual embeddings, scalable vector search, and intelligent reranking, it provides relevant alternatives in real-time, enhancing the immersive, AI-driven shopping experience for users globally. You can learn more about this innovative system by reading the full research paper: LookSync: Large-Scale Visual Product Search System for AI-Generated Fashion Looks.