Building a Smart Event Assistant: Adobe's Human-in-the-Loop Approach to AI Concierge

TLDR: This research paper details the development and evaluation of ‘Summit Concierge,’ an AI assistant for Adobe Summit. It addresses challenges like data scarcity, quality assurance, and rapid deployment by employing a human-in-the-loop development paradigm. The system uses both structured and unstructured data sources, with LLM-assisted and human-validated methods for question generation and comprehensive evaluation strategies. The real-world deployment demonstrated improved user experience and reduced operational burden, showcasing the effectiveness of agile, feedback-driven AI development.

Generative AI assistants are rapidly transforming how enterprises operate, promising significant boosts in productivity, easier access to information, and improved user experiences. A recent paper from Adobe Inc. delves into the development and evaluation of one such assistant, named Summit Concierge, specifically designed for the Adobe Summit event.

Summit Concierge is a specialized AI assistant built to handle a wide array of event-related questions, from session recommendations and speaker details to venue logistics and agenda searches. It operates under real-world constraints, including limited historical data (a ‘cold-start’ scenario), the need for high-quality, accurate responses, and a tight deadline for deployment. To tackle these challenges, the Adobe team adopted a ‘human-in-the-loop’ development approach, combining advanced prompt engineering, retrieval-augmented generation (RAG), and continuous human validation.

The paper highlights that an agile, feedback-driven development process is crucial for creating scalable and reliable AI assistants, especially when starting with minimal data. The core contributions of their work include a human-in-the-loop workflow for quality assurance, innovative techniques to overcome data sparsity in cold-start situations, and valuable insights gained from its real-world deployment at scale.

How Summit Concierge Works

The system architecture of Summit Concierge is designed to efficiently answer two main types of user queries: general information and specific detail-oriented questions. General queries, often about event logistics or FAQs, are answered using unstructured content like the Adobe Summit guidebook. More specific queries, such as ‘who is {speaker_name}’, tap into structured databases like a knowledge graph or SQL warehouse.

When a user submits a query, it first goes through an autocomplete module and a query rewriting component that uses chat history for clarity. An intent detection module then routes the rewritten query. If it’s an unstructured query, a RAG module retrieves relevant documents to form a response. For structured queries, a natural language-to-SQL (NL2SQL) module generates a database query. The results from either path are then synthesized into the final answer.
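The routing flow described above can be sketched in a few lines of Python. This is only an illustrative outline, not the system's actual code: the keyword list, the function names, and the placeholder answer functions are all assumptions, and a simple keyword match stands in for the real intent detection module.

```python
from typing import List

# Hypothetical keyword list; the real system uses an intent detection module.
STRUCTURED_KEYWORDS = ("who is", "speaker", "session", "sponsor")


def rewrite_query(query: str, history: List[str]) -> str:
    """Stand-in for query rewriting: attach the previous turn as context."""
    if history:
        return f"{query} (context: {history[-1]})"
    return query


def detect_intent(query: str) -> str:
    """Stand-in for intent detection: keyword match instead of a model."""
    lowered = query.lower()
    if any(keyword in lowered for keyword in STRUCTURED_KEYWORDS):
        return "structured"
    return "unstructured"


def answer_with_rag(query: str) -> str:
    """Placeholder for the RAG path over unstructured documents."""
    return f"[RAG answer for: {query}]"


def answer_with_nl2sql(query: str) -> str:
    """Placeholder for the NL2SQL path over structured tables."""
    return f"[NL2SQL answer for: {query}]"


def handle_query(query: str, history: List[str]) -> str:
    """Rewrite the query, detect its intent, and route it to one path."""
    rewritten = rewrite_query(query, history)
    if detect_intent(rewritten) == "structured":
        return answer_with_nl2sql(rewritten)
    return answer_with_rag(rewritten)
```

In this sketch, a query such as "Who is Jane Doe?" would take the NL2SQL path, while "Where can I park?" would fall through to the RAG path.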

The structured data includes information on sessions, speakers, and sponsors, collected via an event management tool called RainFocus. This data is transformed into relational database tables and updated regularly. For personalized queries, like ‘Where is my next session?’, the system queries the RainFocus API directly to ensure privacy and real-time accuracy of attendee schedules.
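To make the structured path concrete, here is a minimal sketch of the kind of SQL an NL2SQL module might produce for a speaker question. The table names, columns, and sample rows are hypothetical and are not taken from the paper or from RainFocus; an in-memory SQLite database stands in for the real warehouse.

```python
import sqlite3

# Hypothetical schema and sample rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE speakers (
        speaker_id INTEGER PRIMARY KEY,
        name TEXT,
        title TEXT
    );
    CREATE TABLE sessions (
        session_id INTEGER PRIMARY KEY,
        title TEXT,
        room TEXT,
        speaker_id INTEGER REFERENCES speakers(speaker_id)
    );
    INSERT INTO speakers VALUES (1, 'Jane Doe', 'Director of Product');
    INSERT INTO sessions VALUES (10, 'Intro to Personalization', 'Room A', 1);
    """
)


def sessions_for_speaker(connection: sqlite3.Connection, speaker_name: str):
    """Return (session title, room) rows for one speaker.

    The SQL below is the sort of query an NL2SQL step could generate for
    "what sessions is {speaker_name} presenting?". The name is passed as a
    bound parameter rather than interpolated into the SQL string.
    """
    sql = """
        SELECT s.title, s.room
        FROM sessions AS s
        JOIN speakers AS p ON p.speaker_id = s.speaker_id
        WHERE p.name = ?
        ORDER BY s.title
    """
    return connection.execute(sql, (speaker_name,)).fetchall()
```

Binding the speaker name as a parameter is a design choice of this sketch; it avoids building SQL from raw user text.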

The unstructured data primarily comes from the ‘ABC Guide’, a comprehensive internal document covering event logistics, venue maps, FAQs, and operational protocols. This guide, originally for support staff, was repurposed for the Concierge. Additional unstructured content includes Adobe product summaries and dynamically generated ‘live-authored’ content based on attendee interactions and feedback during the event.

Generating Questions for Training and Evaluation

A significant part of the human-in-the-loop paradigm involved generating high-quality questions for both evaluating the AI assistant and populating its autocomplete suggestions. For structured data, product managers and marketing teams provided initial ‘seed’ questions, which an LLM then expanded into diverse paraphrases that mimic concise, mobile-typing styles. The team also used a tool called SQLSynth, which generates natural language questions from a database schema, so that all generated questions are ‘in-scope’ and diverse.
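As a rough illustration of how seed questions can be multiplied into a larger in-scope set, the sketch below fills seed templates with values from the data. This is plain template filling, not the LLM paraphrasing or SQLSynth described above; the function name and the sample names are assumptions made for the example.

```python
from itertools import product
from typing import Dict, List


def expand_templates(
    templates: List[str], slot_values: Dict[str, List[str]]
) -> List[str]:
    """Fill each seed template with every combination of known slot values."""
    questions = []
    for template in templates:
        # Only use the slots that actually appear in this template.
        slots = [name for name in slot_values if "{" + name + "}" in template]
        for combo in product(*(slot_values[name] for name in slots)):
            questions.append(template.format(**dict(zip(slots, combo))))
    return questions


# Hypothetical seed template and values.
seed_templates = ["who is {speaker_name}"]
values = {"speaker_name": ["Jane Doe", "John Roe"]}
generated = expand_templates(seed_templates, values)
```

Because every slot value comes from the underlying tables, each generated question is answerable from the data by construction, which is the property the ‘in-scope’ requirement is after.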

For unstructured data, LLM-assisted, human-in-the-loop strategies were used to create questions for evaluation, autocomplete, and follow-up prompts. Text passages were extracted from the ABC Guide, and an LLM generated concise, answerable questions, which human reviewers then vetted for fluency and accuracy. This process ensured that the generated questions were diverse, high-quality, and reflective of real user behavior, especially in a mobile context.

Evaluating Performance

The evaluation of Summit Concierge was multi-faceted. For unstructured data, the team used correctness-based scoring, side-by-side comparisons of different model responses, and brand compliance checks. Each method relied on LLM judges that used chain-of-thought reasoning and were calibrated against human judgments, with human reviewers performing the final validation.

For structured data, templated queries were used with corresponding ‘gold SQL’ queries to retrieve key facts. An LLM would then compare these facts with the assistant’s generated response, with human annotators reviewing uncertain cases. This significantly reduced manual annotation effort.
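The gold-SQL check can be sketched as follows. This is a simplified stand-in under stated assumptions: a case-insensitive substring match replaces the LLM comparison, the three verdict labels are invented for the example, and the table and rows are hypothetical.

```python
import sqlite3
from typing import List

# Hypothetical table and row, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sessions (session_id INTEGER PRIMARY KEY, title TEXT, room TEXT);
    INSERT INTO sessions VALUES (10, 'Intro to Personalization', 'Room A');
    """
)


def gold_facts(connection: sqlite3.Connection, gold_sql: str) -> List[str]:
    """Run the gold SQL and flatten the result rows into a list of key facts."""
    rows = connection.execute(gold_sql).fetchall()
    return [str(value) for row in rows for value in row]


def judge_response(response: str, facts: List[str]) -> str:
    """Compare the assistant's response with the gold facts.

    Returns "pass" if every fact appears, "fail" if none do, and
    "needs_review" otherwise, so that only uncertain cases reach a human.
    """
    if not facts:
        return "needs_review"
    lowered = response.lower()
    found = sum(fact.lower() in lowered for fact in facts)
    if found == len(facts):
        return "pass"
    if found == 0:
        return "fail"
    return "needs_review"


facts = gold_facts(conn, "SELECT title, room FROM sessions WHERE session_id = 10")
verdict = judge_response("Intro to Personalization is in Room A.", facts)
```

Routing only the partial matches to annotators mirrors the idea in the text: clear passes and clear failures are settled automatically, and human effort goes to the ambiguous middle.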

The autocomplete system was evaluated based on the ratio of relevant completions and the average number of keystrokes saved for users. Multi-turn conversations, which often involve ambiguous or context-dependent queries, were evaluated using a reasoning-oriented LLM for prompt rewriting, ensuring user intent was maintained across turns.
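The two autocomplete metrics can be computed along these lines. The exact definitions used in the paper are not given here, so this sketch assumes that relevance is a per-suggestion yes/no label and that keystrokes saved is the length of the accepted completion minus the length of the prefix the user typed.

```python
from typing import List, Tuple


def relevance_ratio(labels: List[bool]) -> float:
    """Share of suggested completions that were labeled relevant."""
    if not labels:
        return 0.0
    return sum(labels) / len(labels)


def average_keystrokes_saved(events: List[Tuple[str, str]]) -> float:
    """Mean keystrokes saved over (typed_prefix, accepted_completion) pairs.

    Assumes each saved keystroke corresponds to one character the user
    did not have to type.
    """
    if not events:
        return 0.0
    saved = [max(len(completion) - len(prefix), 0) for prefix, completion in events]
    return sum(saved) / len(saved)


# Hypothetical labels and acceptance events.
ratio = relevance_ratio([True, True, False, True])
avg_saved = average_keystrokes_saved(
    [("where", "where is registration"), ("wifi", "wifi password")]
)
```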


Real-World Impact

During the Adobe Summit, an internal annotation effort was conducted, with testers submitting queries and providing feedback. Out of 624 interactions reviewed, nearly 70% required no further action, indicating good performance. The remaining actionable errors highlighted areas for targeted improvements, leading to a reduction in incorrectly routed ‘out-of-scope’ queries from 4% to 3%.

The experience with Summit Concierge demonstrated the practical benefits of combining scalable LLM capabilities with lightweight human oversight. This approach led to an improved user experience for attendees and reduced operational overhead for event staff. The methodology is seen as generalizable to other enterprise domains requiring event support, internal knowledge access, or customer service.
