TLDR: AI Knowledge Assist is a system that automatically builds knowledge bases for conversational AI agents in contact centers. It extracts question-answer pairs from historical customer-agent conversations, deduplicates them, and recommends representative pairs. Using a fine-tuned LLaMA-3.1-8B model, it achieves over 90% accuracy, effectively solving the “cold start” problem for companies without existing knowledge bases and enabling immediate deployment of RAG-powered chatbots.
The rapid advancement of Large Language Models (LLMs) has opened new doors for conversational AI systems, particularly in customer service through Retrieval Augmented Generation (RAG) techniques. However, a significant hurdle for many companies, especially contact centers, is the absence of a dedicated, company-specific knowledge base. This ‘cold start’ problem prevents the effective integration of AI chatbots, as they lack the necessary information to answer customer inquiries.
Addressing this challenge, a new system called AI Knowledge Assist has been introduced. Developed by researchers from Dialpad Inc., including Md Tahmid Rahman Laskar, Julien Bouvier Tremblay, Xue-Yong Fu, Cheng Chen, and Shashi Bhushan TN, this system offers an automated solution for building comprehensive knowledge bases. It achieves this by extracting valuable knowledge in the form of question-answer (QA) pairs directly from historical customer-agent conversations.
How AI Knowledge Assist Works
The AI Knowledge Assist system operates as a three-stage pipeline:
1. Knowledge Extraction from Transcripts: The initial step involves analyzing historical call transcripts. An LLM is prompted to identify and extract information-seeking questions from customers and the corresponding answers provided by agents. Since these are often voice transcripts, the LLM also rewrites the QA pairs to ensure they are clear and understandable without needing the full conversation context. This process transforms raw conversational data into structured knowledge.
2. Clustering for Deduplication: Once QA pairs are extracted, redundancy is a common issue, as similar questions and answers may appear across multiple conversations. To manage this, the system employs a clustering algorithm. It measures the semantic similarity between the questions using embeddings and groups semantically similar QA pairs together. This step is crucial for deduplication, preventing the knowledge base from becoming cluttered with redundant entries.
3. Recommending Representative QA Pairs: In the final stage, an LLM processes each cluster of QA pairs. Its task is to select one or more representative QA pairs that best summarize the information within that cluster. This serves a dual purpose: further deduplication and filtering, and recommending well-formed, informative QA pairs for inclusion in the final knowledge base. These recommended pairs can either be automatically added or sent for review by a knowledge manager before final integration.
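Stages 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the article does not name the embedding model or clustering algorithm, so the sketch substitutes a toy bag-of-words vector for real embeddings and a greedy threshold clustering for the actual algorithm. The LLM-based recommendation step is replaced by a simple placeholder heuristic (keep the pair with the longest answer). All names here (QAPair, embed, cluster_by_question, pick_representative, build_knowledge_base) and the 0.6 threshold are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_by_question(pairs, threshold=0.6):
    """Greedy clustering (an assumption): a pair joins the first cluster whose
    seed question is at least `threshold` similar, else it starts a new one."""
    clusters = []  # each item: (seed_embedding, [QAPair, ...])
    for pair in pairs:
        vec = embed(pair.question)
        for seed_vec, members in clusters:
            if cosine(vec, seed_vec) >= threshold:
                members.append(pair)
                break
        else:
            clusters.append((vec, [pair]))
    return [members for _, members in clusters]


def pick_representative(cluster):
    """Placeholder for the LLM recommendation step: keep the longest answer."""
    return max(cluster, key=lambda pair: len(pair.answer))


def build_knowledge_base(pairs, threshold=0.6):
    """Deduplicate by clustering, then keep one representative per cluster."""
    return [pick_representative(c) for c in cluster_by_question(pairs, threshold)]


if __name__ == "__main__":
    extracted = [
        QAPair("How do I reset my password?",
               "Use the reset link on the login page."),
        QAPair("How can I reset my password?",
               "Click 'Forgot password' on the login page and follow the emailed link."),
        QAPair("What are your support hours?",
               "Support is available 9am to 5pm on weekdays."),
    ]
    for entry in build_knowledge_base(extracted):
        print(entry.question, "->", entry.answer)
```

In this sketch the two password questions fall into one cluster because their wording overlaps heavily, while the support-hours question forms its own cluster. A production system would swap in a real embedding model and let an LLM choose or rewrite the representative pair, as the article describes.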
Performance and Impact
The research paper reports strong results for the AI Knowledge Assist system, obtained by fine-tuning a lightweight LLM, LLaMA-3.1-8B, on internal company data. Empirical evaluations across 20 companies showed an accuracy of over 90% in extracting information-seeking QA pairs. The authors argue that this level of accuracy closes the cold start gap in contact centers, since a usable knowledge base can be built from existing transcripts and RAG-powered chatbots can be deployed right away.
The fine-tuned model, referred to as Knowledge-Assist-8B-SFT, significantly outperformed larger closed-source LLMs like GPT-4o-Mini and various Gemini Flash models in knowledge extraction and representative QA pair recommendation. Human evaluations further validated these findings, with human experts preferring the QA pairs recommended by the Knowledge-Assist-8B-SFT model in a higher percentage of cases and approving more of its recommendations for the knowledge base.
Real-World Deployment and Future
The system is designed for real-world deployment, utilizing Kubeflow on the Google Vertex AI Platform. This setup allows for the automatic execution of the entire pipeline, from data processing to model inference. A key feature is its ability to self-update by continuously processing new call transcripts. This ensures the knowledge base remains current, adapting to new customer issues and product changes over time.
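The self-updating behavior can be illustrated with a small sketch. It assumes, hypothetically, that each transcript carries a unique ID and that the whole extraction pipeline is available as a single callable. It is plain Python and does not use the Kubeflow or Vertex AI APIs; scheduling and orchestration are out of scope. The names update_knowledge_base, processed_ids, and run_pipeline are illustrative only.

```python
from typing import Callable, Iterable


def update_knowledge_base(
    transcripts: Iterable[tuple[str, str]],            # (transcript_id, text)
    processed_ids: set[str],                           # IDs handled in earlier runs
    run_pipeline: Callable[[list[str]], list[tuple[str, str]]],
    knowledge_base: list[tuple[str, str]],             # (question, answer) entries
) -> int:
    """Run the pipeline only on transcripts not seen before.

    Returns the number of newly processed transcripts.
    """
    new_items = [(tid, text) for tid, text in transcripts if tid not in processed_ids]
    if not new_items:
        return 0

    new_pairs = run_pipeline([text for _, text in new_items])
    for pair in new_pairs:
        # Exact-match guard only; semantic deduplication is the pipeline's job.
        if pair not in knowledge_base:
            knowledge_base.append(pair)

    processed_ids.update(tid for tid, _ in new_items)
    return len(new_items)
```

Calling this function on a schedule with the latest batch of transcripts would leave previously processed calls untouched and append only newly recommended pairs.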
By automating the creation and maintenance of knowledge bases, AI Knowledge Assist aims to improve both customer experience and agent performance. It allows conversational AI agents to resolve customer issues more effectively and efficiently, even for companies starting without an existing knowledge base. More details are available in the full research paper.


