
Enhancing Customer Service: A Multi-Agent System to Combat AI Hallucinations

TLDR: This research paper introduces a multi-agent system that integrates Large Language Models (LLMs) with fuzzy logic to mitigate the risk of hallucinations in customer service interactions, specifically for SMS requests. The system decomposes message processing into tasks handled by specialized agents (e.g., Orchestration, Renewal, Evaluator, LLM, Validator, Router, Expert Agents). It employs fuzzy logic to assess confidence in understanding and uses cross-validation and rule-based comparisons to detect and address hallucinated information, aiming to improve accuracy and reliability in AI-driven customer support.

Large Language Models (LLMs) are transforming customer service by enabling systems to understand and respond to customer requests more effectively. However, a significant challenge remains: the risk of ‘hallucination,’ where LLMs generate incorrect or fictitious information as facts. This issue can lead to serious consequences, as seen in recent legal cases where companies were held accountable for false advice provided by their chatbots.

To address this critical problem, a new research paper introduces a multi-agent system designed to handle customer requests, particularly those sent via SMS, while actively working to reduce the risk of LLM hallucinations. The system integrates LLM-based agents with fuzzy logic, a method of reasoning that deals with approximate rather than precise values, to enhance its ability to detect and mitigate these errors.

How the System Works: A Multi-Agent Approach

The proposed architecture breaks down the complex task of processing customer messages into smaller, manageable sub-problems, each handled by a specialized intelligent agent. This modular design allows for different AI technologies, such as LLMs, parsing techniques, and fuzzy logic, to be used where they are most effective.

When a customer sends an SMS, it first goes through an Incoming SMS service. This service authenticates the user and places the message into an event hub. An Orchestration Agent then takes over, dynamically creating specific services to match the message’s attributes and dispatching it to the appropriate Orchestration Worker Agent, such as a Renewal Agent for prescription renewal requests.
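The dispatch step described above can be sketched in a few lines. This is an illustrative outline only, assuming simple class names (`OrchestrationAgent`, `RenewalAgent`) and an intent tag as the routing key; the paper's actual service creation and event-hub mechanics are more involved.

```python
# Hypothetical sketch: the Orchestration Agent inspects a message attribute
# (here, a simple intent tag) and dispatches to a matching worker agent.
# All class and method names are illustrative assumptions.

class RenewalAgent:
    def handle(self, message: str) -> str:
        return f"renewal flow started for: {message}"

class DefaultAgent:
    def handle(self, message: str) -> str:
        return "please call support"

class OrchestrationAgent:
    def __init__(self) -> None:
        # Map message attributes to specialized Orchestration Worker Agents.
        self.workers = {"renewal": RenewalAgent()}
        self.fallback = DefaultAgent()

    def dispatch(self, intent: str, message: str) -> str:
        worker = self.workers.get(intent, self.fallback)
        return worker.handle(message)

orchestrator = OrchestrationAgent()
result = orchestrator.dispatch("renewal", "Please renew prescription #123")
```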

The Renewal Agent uses a combination of regular expressions and fuzzy logic to interpret the message. It identifies keywords (like ‘renew’ or ‘stop’) and calculates a ‘degree of confidence’ – a fuzzy variable indicating how well it understood the message. If the message is straightforward and fully understood, it’s processed directly. However, if there’s any ambiguity or unmatched words, the system proceeds to further validation steps.
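One simple way to realize the 'degree of confidence' is as the fraction of words in the SMS that match known patterns, so a fully matched message scores 1.0 and unmatched words pull the score down. The keyword list, patterns, and scoring below are assumptions for illustration, not the paper's actual rule set.

```python
import re

# Illustrative patterns the rule-based Renewal Agent might recognize.
KEYWORDS = re.compile(r"\b(renew|stop|refill|prescription)\b", re.IGNORECASE)
RX_NUMBER = re.compile(r"#?\d{3,}")  # e.g. a prescription number

def degree_of_confidence(sms: str) -> float:
    """Fuzzy confidence: share of words matched by a known pattern."""
    words = sms.split()
    if not words:
        return 0.0
    matched = sum(1 for w in words if KEYWORDS.search(w) or RX_NUMBER.search(w))
    return matched / len(words)

# A fully matched message (confidence 1.0) is processed directly;
# lower values trigger the further validation steps described below.
conf = degree_of_confidence("renew prescription #10452")
```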

An Arbitrator Agent then steps in, forwarding messages to an Evaluator Agent. The Evaluator Agent uses fuzzy rules, considering both the ‘degree of confidence’ from the Renewal Agent and a ‘customer importance’ score (derived from customer history), to decide the next action. If confidence is low, the message might be sent to an LLM Agent for deeper interpretation or, in some cases, the customer might be prompted to call support.
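The Evaluator Agent's decision logic might look like the following sketch, which collapses the fuzzy rule base into crisp thresholds for readability. The specific thresholds and action names are assumptions; the paper uses genuine fuzzy rules rather than hard cutoffs.

```python
# Hypothetical simplification of the Evaluator Agent's rules: combine the
# Renewal Agent's degree of confidence with a customer-importance score
# (both in [0, 1]) to pick the next action. Thresholds are assumptions.

def evaluate(confidence: float, importance: float) -> str:
    if confidence >= 0.8:
        return "process_directly"        # message fully understood
    if confidence >= 0.4:
        return "send_to_llm_agent"       # ambiguous: deeper interpretation
    if importance >= 0.7:
        return "send_to_llm_agent"       # low confidence, high-value customer
    return "ask_customer_to_call"        # low confidence: escalate to human

action = evaluate(0.9, 0.2)
```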

The LLM Agent, powered by models like Gemini or ChatGPT, extracts keywords, complaints, and requests from the message. This is where hallucination risk is highest, so a crucial component, the Validator Agent, comes into play. The Validator Agent compares the keywords extracted by the LLM with those identified by the more reliable, rule-based Renewal Agent. If discrepancies are found, indicating a potential hallucination, the LLM’s response is flagged or even discarded. For complaints and requests, the system uses a cross-validation technique, where one LLM evaluates the output of another to ensure accuracy.
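The Validator Agent's keyword comparison can be sketched as a simple set-overlap check: if the keywords the LLM extracted diverge too far from those found by the rule-based Renewal Agent, the response is flagged as a possible hallucination. The overlap metric and threshold below are illustrative assumptions.

```python
# Illustrative sketch of the Validator Agent's keyword cross-check.
# llm_keywords come from the LLM Agent; rule_keywords from the rule-based
# Renewal Agent, which serves as the trusted reference.

def validate_keywords(llm_keywords: set[str], rule_keywords: set[str],
                      min_overlap: float = 0.5) -> bool:
    """Return True if the LLM output agrees with the rule-based parse."""
    if not rule_keywords:
        return True  # nothing reliable to cross-check against
    overlap = len(llm_keywords & rule_keywords) / len(rule_keywords)
    return overlap >= min_overlap  # threshold is an assumption

# Agreement passes; disagreement flags a potential hallucination.
ok = validate_keywords({"renew", "prescription"}, {"renew", "prescription"})
flagged = not validate_keywords({"cancel", "refund"}, {"renew", "prescription"})
```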

Finally, a Router Agent directs validated requests and complaints to specialized Expert Agents, such as a Pharmacist Agent, Store Management Agent, Scheduling Agent, or Complaint Department Agent. These expert agents are equipped to handle specific types of queries, some even using tools to book appointments or retrieve information.


Initial Findings and Future Outlook

Initial tests with sample SMS messages showed that the system successfully extracted relevant keywords and identified instances of hallucination, applying appropriate mitigation strategies. While a comprehensive assessment requires deployment in a real-world production environment, this proof of concept demonstrates a promising approach to building more reliable and trustworthy LLM-powered customer service systems.

This innovative multi-agent architecture, detailed in the paper “Using multi-agent architecture to mitigate the risk of LLM hallucinations,” offers a robust framework for businesses looking to leverage the power of AI in customer interactions while minimizing the inherent risks of large language models.

Nikhil Patel
Nikhil Patel is a tech analyst and AI news reporter who brings a practitioner's perspective to every article. With prior experience working at an AI startup, he decodes the business mechanics behind product innovations, funding trends, and partnerships in the GenAI space. Nikhil's insights are sharp, forward-looking, and trusted by insiders and newcomers alike. You can reach him at: [email protected]
