Enhancing AI Agent Tool Selection with Knowledge Graphs for Enterprise Tasks

TLDR: This research introduces a Knowledge Graph (KG)-based framework and an Ensemble of Ego Graphs (EEG) algorithm to improve how AI agents select tools for complex, multi-step tasks in enterprise environments. By modeling semantic relationships and functional dependencies between tools, the method significantly outperforms traditional similarity-based approaches, especially for queries requiring sequential tool composition. A new synthetic dataset of enterprise-specific queries was also developed for evaluation.

In the rapidly evolving landscape of artificial intelligence, AI agents are becoming increasingly vital for automating complex tasks. A core challenge for these agents, especially in large enterprise settings, is efficiently selecting the right tools from a vast and often interconnected array. Traditional methods, which primarily rely on matching user queries with tool descriptions, often fall short when dealing with multi-step requests or tools with hidden dependencies.

Researchers at SAP Labs, Palo Alto, have introduced a novel solution to this problem: a Knowledge Graph (KG)-based framework designed to significantly improve tool retrieval. Their paper, titled “Planning Agents on an Ego-Trip: Leveraging Hybrid Ego-Graph Ensembles for Improved Tool Retrieval in Enterprise Task Planning,” addresses the critical need for more accurate and contextual tool selection in complex business environments. The authors, Sahil Bansal, Sai Shruthi Sistla, Aarti Arikatala, and Sebastian Schreiber, highlight that while AI systems excel at breaking down tasks, their ability to discover and utilize the right tools remains underexplored.

The Challenge of Tool Retrieval in Enterprises

Enterprise environments are characterized by thousands of specialized tools, many with undocumented interdependencies. Current retrieval methods, such as vector-based similarity search, frequently miss relevant tools, leading to fragmented and inefficient task execution. This limitation is particularly critical for general-purpose AI planning systems, where effective tool discovery is the foundation for successful task decomposition and execution.

A Knowledge Graph-Based Solution

The proposed framework tackles these limitations by creating a structured semantic representation of enterprise tools using semi-structured data from tool descriptions and metadata. This forms a knowledge graph that captures the intricate relationships between tools, entities, and parameters. This KG-enhanced mechanism uses neighborhood expansion to uncover implicit connections that traditional methods might overlook.

Key Contributions of the Research

The research makes four significant contributions:

First, it proposes a method to extract and model tool dependencies, which is crucial for understanding how tools relate to each other, even when explicit dependencies are not stated.

Second, it introduces the Ensemble of Ego Graphs (EEG) algorithm. This algorithm leverages an ensemble of 1-hop ego graphs extracted from the overall tool graph. It uses a hybrid node matching and neighborhood expansion technique to significantly boost tool retrieval performance.
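To make the idea of an ego-graph ensemble concrete, here is a minimal sketch. It assumes the tool graph is available as a plain adjacency map and that the ensemble is formed by taking the union of the 1-hop ego graphs of each entry node. The article does not spell out how the paper combines ego graphs, so the union, the function names, and the toy graph below are all illustrative assumptions.

```python
def ego_graph(adjacency, center):
    """Return the 1-hop ego graph of `center`: the node plus its direct neighbors."""
    return {center} | adjacency.get(center, set())


def ensemble_of_ego_graphs(adjacency, entry_nodes):
    """Union the 1-hop ego graphs of every entry node into one candidate set.

    Taking the union is an assumption made for this sketch.
    """
    candidates = set()
    for node in entry_nodes:
        candidates |= ego_graph(adjacency, node)
    return candidates


# Hypothetical toy graph mixing tool nodes and one entity node (customer_id).
adjacency = {
    "lookup_customer": {"customer_id"},
    "customer_id": {"lookup_customer", "create_invoice"},
    "create_invoice": {"customer_id", "send_email"},
    "send_email": {"create_invoice"},
    "archive_report": set(),
}

candidates = ensemble_of_ego_graphs(adjacency, ["customer_id", "send_email"])
```

In this toy graph the two entry nodes pull in `lookup_customer` and `create_invoice` as neighbors, while the unconnected `archive_report` stays out of the candidate set.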

Third, motivated by an analysis of real enterprise user queries, the researchers defined six distinct query classes. They developed a novel pipeline for generating multi-step, multi-intent queries aligned with these classes, ensuring the generated queries are coherent and contextually relevant.

Finally, the EEG algorithm’s retrieval efficacy was evaluated on these complex user queries using a specialized metric called CompleteRecall, demonstrating substantial improvements over existing baseline approaches.

How the System Works: Methodology

The system operates in two main phases: an offline phase for building the Knowledge Graph and an online phase for retrieving tools. In the offline phase, tool documentation is processed to extract relational triples (subject, predicate, object), guided by a predefined ontology. These triples are then normalized to ensure consistency and populate a graph database.
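The offline phase can be pictured with a short sketch: raw triples are normalized, filtered against an ontology, and loaded into a graph. Everything specific here is an assumption for illustration, including the predicate names, the normalization rule, and the use of an in-memory dictionary in place of a graph database.

```python
from collections import defaultdict

# Hypothetical ontology: the only predicates this sketch accepts.
ALLOWED_PREDICATES = {"requires", "produces", "operates_on"}


def normalize(term: str) -> str:
    """Lowercase, trim, and turn whitespace or hyphens into single underscores."""
    return "_".join(term.strip().lower().replace("-", " ").split())


def build_graph(raw_triples):
    """Build an undirected adjacency map from ontology-conformant triples."""
    graph = defaultdict(set)
    for subj, pred, obj in raw_triples:
        s, p, o = normalize(subj), normalize(pred), normalize(obj)
        if p not in ALLOWED_PREDICATES:
            continue  # drop triples whose predicate is outside the ontology
        graph[s].add((p, o))
        graph[o].add((p, s))  # store the reverse link so lookups work both ways
    return dict(graph)


raw = [
    ("Create Invoice", "requires", "Customer ID"),
    ("create-invoice", "Requires", "customer id"),  # same triple after normalization
    ("Lookup Customer", "produces", "Customer ID"),
    ("Create Invoice", "is cool", "n/a"),           # rejected: not in the ontology
]
graph = build_graph(raw)
```

Normalization is what lets the two spellings of the invoice triple collapse into a single edge, which keeps the graph consistent when the same tool is described in slightly different ways.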

In the online phase, when a user query is received, the KG-based retrieval algorithm identifies entry points in the tool graph through both semantic (meaning-based) and textual (keyword-based) matching. A one-hop neighborhood expansion then enriches the candidate tool set by including all tools directly connected to the identified entry points. Finally, a re-ranking model sorts the retrieved tools to present the most relevant ones to the user.
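The online phase can be sketched end to end under some simplifying assumptions. Below, a token-overlap score stands in for textual matching, a cosine over term counts stands in for a real embedding model, and the final re-ranking simply reuses the hybrid score. The tool names, descriptions, edges, and the `alpha` weight are all hypothetical.

```python
import math
from collections import Counter

TOOLS = {  # hypothetical tool descriptions
    "lookup_customer": "find a customer record by name",
    "create_invoice": "create an invoice for a customer",
    "send_email": "send an email message to a recipient",
}
EDGES = {  # hypothetical direct tool-to-tool links taken from the graph
    "lookup_customer": {"create_invoice"},
    "create_invoice": {"lookup_customer", "send_email"},
    "send_email": {"create_invoice"},
}


def tokens(text):
    return text.lower().split()


def lexical_score(query, description):
    """Jaccard overlap of token sets, a simple keyword-style match."""
    q, d = set(tokens(query)), set(tokens(description))
    return len(q & d) / len(q | d) if q | d else 0.0


def semantic_score(query, description):
    """Cosine over term counts; a placeholder for embedding similarity."""
    q, d = Counter(tokens(query)), Counter(tokens(description))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0


def hybrid_score(query, description, alpha=0.5):
    """Weighted mix of the two scores; alpha is an assumed tuning knob."""
    return alpha * semantic_score(query, description) + (1 - alpha) * lexical_score(query, description)


def retrieve(query, n_entry=1, k=3):
    scores = {name: hybrid_score(query, desc) for name, desc in TOOLS.items()}
    entry = sorted(scores, key=scores.get, reverse=True)[:n_entry]  # entry points
    candidates = set(entry)
    for name in entry:
        candidates |= EDGES.get(name, set())  # one-hop neighborhood expansion
    # Stand-in re-ranker: order the candidates by the same hybrid score.
    return sorted(candidates, key=lambda n: scores[n], reverse=True)[:k]


result = retrieve("create an invoice and send it")
```

Note that `lookup_customer` shares no words with the query, so a pure similarity search would score it at zero. It reaches the candidate set only because it is a direct neighbor of the entry point `create_invoice`.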

Dataset Generation and Evaluation

To rigorously evaluate their graph-based method, the researchers developed a custom dataset of synthetic user queries, as existing benchmarks did not meet the specific requirements of enterprise use cases. This dataset includes six distinct query classes: Single-Intent, Multi-Intent, Explicit Multi-Step, Implicit Multi-Step, Conditional Multi-Step, and Information Retrieval + Multi-Intent. The synthetic queries were generated through a pipeline that identifies logical tool chains, generates natural-sounding queries, and validates their fidelity.
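One plausible way to identify logical tool chains is to link tools whose outputs feed another tool's inputs and then enumerate short paths. The article does not describe how the paper's pipeline finds chains, so the sketch below is only an illustration of that assumption, with hypothetical tool names and input/output sets.

```python
def tool_chains(produces, consumes, max_len=3):
    """Enumerate ordered chains where each tool's output feeds the next tool's input."""
    # successors[a] lists the tools that consume at least one thing a produces.
    successors = {
        a: [b for b in consumes if b != a and produces.get(a, set()) & consumes[b]]
        for a in produces
    }
    chains = []

    def extend(chain):
        if len(chain) >= 2:
            chains.append(tuple(chain))  # every prefix of length >= 2 is a chain
        if len(chain) == max_len:
            return
        for nxt in successors.get(chain[-1], []):
            if nxt not in chain:  # avoid revisiting a tool within one chain
                extend(chain + [nxt])

    for start in produces:
        extend([start])
    return chains


# Hypothetical inputs and outputs for three tools.
produces = {
    "lookup_customer": {"customer_id"},
    "create_invoice": {"invoice_id"},
    "send_email": set(),
}
consumes = {
    "lookup_customer": set(),
    "create_invoice": {"customer_id"},
    "send_email": {"invoice_id"},
}
chains = tool_chains(produces, consumes)
```

Each chain could then seed a multi-step query, for example asking to look up a customer, invoice them, and email the result.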

The experimental results showed that the knowledge graph (KG)-based approach outperformed semantic, lexical, and hybrid retrieval methods. It achieved a micro-average CompleteRecall score of 91.85% at k=10, compared to 89.26% for the strongest non-KG baseline, a gain of about 2.6 percentage points. The largest gains were observed in the conditional multi-step and implicit multi-step query categories, highlighting the KG’s ability to model semantic relationships and uncover hidden connections between tools.
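The article does not define CompleteRecall, so the sketch below works from an assumed reading: a query counts as a hit only if every tool it requires appears in the top k results. Under that assumption, the micro-average pools all queries together, while a macro-average would weight each query class equally. The class names and toy results are hypothetical.

```python
def complete_recall_at_k(results, k):
    """Share of queries whose required tools ALL appear in the top-k retrieved list.

    `results` is a list of (ranked_retrieved_tools, required_tools) pairs.
    This definition is an assumption, not taken from the article.
    """
    hits = sum(1 for retrieved, required in results if set(required) <= set(retrieved[:k]))
    return hits / len(results) if results else 0.0


# Hypothetical results grouped by query class.
by_class = {
    "single_intent": [
        (["a", "b"], {"a"}),
        (["c", "a"], {"c"}),
        (["b", "c"], {"b"}),
    ],
    "implicit_multi_step": [
        (["a", "b", "c"], {"a", "c"}),  # hit: both required tools retrieved
        (["a", "b", "c"], {"a", "d"}),  # miss: "d" was never retrieved
    ],
}

k = 3
pooled = [pair for pairs in by_class.values() for pair in pairs]
micro = complete_recall_at_k(pooled, k)  # every query weighted equally
macro = sum(complete_recall_at_k(p, k) for p in by_class.values()) / len(by_class)
```

The all-or-nothing condition is what makes the metric strict for multi-step queries: retrieving one of two required tools earns no credit.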

Looking Ahead

While the approach demonstrates strong results, the authors acknowledge limitations, such as the reliance on the quality of the underlying knowledge graph and potential challenges in domains with sparsely described tools. Future work includes implementing a triple validation step to enhance graph quality, making the dataset publicly available, exploring graph embedding techniques, and incorporating tool response information to further improve the tool graph’s utility. This research marks a promising direction for enhancing AI agent capabilities in complex enterprise environments. For more details, see the full research paper.
