TLDR: This research paper introduces an autonomous FinOps agent designed to optimize IT infrastructure and costs. It addresses the challenge of fragmented billing data from multiple cloud providers by leveraging Agentic AI, a unified GraphQL schema, and an NL2GraphQL layer. The agent employs a multi-agent architecture (Planning, Data Retrieval, Analysis) to interpret natural language queries, gather data, and generate optimization recommendations. Evaluations show that advanced LLMs enable the agent to perform complex FinOps tasks effectively, highlighting the importance of tool recognition and domain-specific architectural choices over raw model size.
In the rapidly evolving landscape of cloud computing, managing IT infrastructure costs has become a significant challenge for many organizations. The ease of deploying and scaling resources, combined with dynamic pricing models from various cloud providers, has led to unprecedented complexity in financial oversight. This complexity gave birth to FinOps, a discipline that merges Finance and Operations to maximize cloud business value through collaborative financial accountability.
FinOps operates through three iterative phases: Inform (Visibility & Allocation), Optimize (Rates & Usage), and Operate (Continuous Improvement & Operations). However, practitioners often face a fundamental hurdle: billing data arrives in diverse formats and metrics from multiple cloud providers and internal systems. This fragmentation makes it hard to synthesize actionable insights and to make the timely decisions that effective cost management depends on.
A recent research paper, "FinOps Agent – A Use-Case for IT Infrastructure and Cost Optimization," by Ngoc Phuoc An Vo, Manish Kesarwani, Ruchi Mahindru, and Chandrasekhar Narayanaswami, proposes leveraging autonomous, goal-driven AI agents for FinOps automation. The authors built a FinOps agent designed to simulate a realistic end-to-end industry process, from retrieving data across various sources to consolidating, analyzing, and generating optimization recommendations.
The Power of Agentic AI in FinOps
Agentic AI introduces autonomous agents capable of goal-directed behavior, strategic planning, and adaptive learning. Unlike traditional systems that merely react to direct prompts, these agents can formulate multi-step plans, learn from feedback, execute actions through tool integration, retrieve data, maintain memory, and adapt strategies based on observed conditions. This capability is particularly valuable for FinOps, enabling proactive optimization and detailed reactive analysis.
For instance, an AI agent could continuously monitor cloud spending, autonomously investigate irregularities, correlate costs with business events, and execute approved optimizations. If an unusual spike in compute costs is detected, the agent can trace it to specific services, analyze recent deployments, identify inefficient resource configurations, evaluate optimization strategies, and either present detailed recommendations or implement corrective actions, all while maintaining an audit trail.
Addressing Data Fragmentation with GraphQL
A core challenge in FinOps is the fragmented nature of data across multiple cloud providers and vendor platforms, each with distinct metrics and access patterns. To overcome this, the FinOps agent utilizes GraphQL, a query language for APIs. GraphQL allows the agent to federate these disparate data sources through a single, schema-driven endpoint. This means the agent can request precisely the data it needs in a single query, reducing data over-fetching, lowering latency, and supporting time-sensitive FinOps decisions.
The paper highlights the creation of a unified GraphQL schema that abstracts data from systems like IBM Turbonomic (for resource utilization) and IBM Apptio (for spending anomalies). This schema acts as a common language, enabling the agent to access and integrate data seamlessly, even when underlying systems use different terminologies.
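To make the idea of a common schema concrete, here is a minimal Python sketch of normalizing two vendor payloads into one record type. It is an illustration only: the field names (`entityName`, `cpuUtilPct`, `resource`, `monthly_cost_usd`, `anomaly`), the `UnifiedResource` type, and the `merge_sources` helper are all hypothetical and do not reflect the real Turbonomic or Apptio APIs or the schema used in the paper.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vendor payloads; the field names are invented for illustration.
turbonomic_row = {"entityName": "vm-42", "cpuUtilPct": 7.5}
apptio_row = {"resource": "vm-42", "monthly_cost_usd": 310.0, "anomaly": True}


@dataclass
class UnifiedResource:
    """One record in the assumed unified schema."""
    resource_id: str
    cpu_utilization_pct: Optional[float] = None
    monthly_cost_usd: Optional[float] = None
    cost_anomaly: bool = False


def merge_sources(utilization_rows, cost_rows):
    """Join both sources on resource id, renaming fields to the unified names."""
    unified = {}
    for row in utilization_rows:
        rid = row["entityName"]
        record = unified.setdefault(rid, UnifiedResource(rid))
        record.cpu_utilization_pct = row["cpuUtilPct"]
    for row in cost_rows:
        rid = row["resource"]
        record = unified.setdefault(rid, UnifiedResource(rid))
        record.monthly_cost_usd = row["monthly_cost_usd"]
        record.cost_anomaly = row["anomaly"]
    return unified


merged = merge_sources([turbonomic_row], [apptio_row])
print(merged["vm-42"])
```

In a GraphQL deployment this renaming would typically live in the resolvers behind the schema, so the agent only ever sees the unified field names.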
Natural Language to GraphQL (NL2GraphQL)
To make the agent accessible and intuitive, an NL2GraphQL layer is implemented. This layer enables the FinOps agent to convert natural language queries from users into executable GraphQL queries. Instead of relying on fixed templates, the agent uses its large language model (LLM)-based reasoning to dynamically compose queries against the unified schema. This allows FinOps practitioners to ask complex questions in plain English, and the agent translates them into precise data retrieval operations.
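The last step of such a layer can be sketched in Python as follows. This is a deliberate simplification, not the paper's method: it assumes the LLM has already turned the user's question into a small structured intent (hard-coded here), and it only shows how that intent could be checked against the schema and rendered as a GraphQL query string. The schema slice, field names, and the `build_query` helper are hypothetical.

```python
import json

# Hypothetical slice of the unified schema: root field -> allowed sub-fields.
SCHEMA = {
    "resources": {"id", "cpuUtilizationPct", "monthlyCostUsd"},
    "anomalies": {"id", "service", "amountUsd", "detectedAt"},
}


def build_query(root, fields, filters=None):
    """Render a GraphQL query string, rejecting names the schema does not define."""
    if root not in SCHEMA:
        raise ValueError(f"unknown root field: {root}")
    unknown = set(fields) - SCHEMA[root]
    if unknown:
        raise ValueError(f"unknown fields on {root}: {sorted(unknown)}")
    args = ""
    if filters:
        # json.dumps yields double-quoted strings, numbers, and true/false,
        # which are also valid GraphQL literals for these simple value types.
        rendered = ", ".join(f"{key}: {json.dumps(value)}" for key, value in filters.items())
        args = f"({rendered})"
    return f"query {{ {root}{args} {{ {' '.join(fields)} }} }}"


# Question: "What does each AWS resource cost per month?"
# In the real agent an LLM would produce this intent; here it is hard-coded.
intent = {"root": "resources", "fields": ["id", "monthlyCostUsd"], "filters": {"provider": "aws"}}
query = build_query(intent["root"], intent["fields"], intent["filters"])
print(query)
```

Validating against the schema before execution is one way to catch hallucinated field names early, whichever way the query text itself is generated.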
A Multi-Agent System for Comprehensive Optimization
FinOps tasks are inherently complex, requiring coordination across multiple data sources, sophisticated analysis, and domain-specific reasoning. To handle this, the researchers implemented a multi-agent architecture comprising three specialized agents:
- Planning Agent: Interprets the user’s natural language query, formulates an execution plan, and orchestrates the workflow.
- Data Retrieval Agent: Executes data collection tasks by invoking appropriate GraphQL tools and interfacing with multiple vendor APIs.
- Analysis Agent: Synthesizes data from disparate sources, performs cross-system correlation, identifies optimization opportunities, and generates actionable recommendations.
This separation of concerns allows each agent to specialize in its domain, enhancing system flexibility and scalability.
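The hand-off between the three agents can be sketched in Python as a simple pipeline. Everything here is an assumption made for illustration: the tool names, the canned data, the fixed plan that stands in for LLM planning, and the 10% CPU threshold are hypothetical and are not taken from the paper.

```python
# Hypothetical tool registry: tool name -> callable returning canned rows.
TOOLS = {
    "get_utilization": lambda: [
        {"id": "vm-1", "cpu_pct": 4.0},
        {"id": "vm-2", "cpu_pct": 71.0},
    ],
    "get_costs": lambda: [
        {"id": "vm-1", "cost_usd": 220.0},
        {"id": "vm-2", "cost_usd": 180.0},
    ],
}


def planning_agent(question):
    """Stand-in for LLM planning: returns an ordered list of tool names."""
    return ["get_utilization", "get_costs"]


def data_retrieval_agent(plan):
    """Run each planned tool and collect its rows under the tool name."""
    return {tool: TOOLS[tool]() for tool in plan}


def analysis_agent(data, cpu_threshold=10.0):
    """Correlate utilization with cost and flag low-usage resources."""
    costs = {row["id"]: row["cost_usd"] for row in data["get_costs"]}
    return [
        f"Consider downsizing {row['id']} "
        f"(CPU {row['cpu_pct']}%, ${costs[row['id']]:.2f}/month)"
        for row in data["get_utilization"]
        if row["cpu_pct"] < cpu_threshold
    ]


def run(question):
    plan = planning_agent(question)
    data = data_retrieval_agent(plan)
    return analysis_agent(data)


recommendations = run("Which machines are underused but still expensive?")
print(recommendations)
```

Because each stage only depends on the previous stage's output, any one of them can be swapped out (for example, a different planner or a new data source) without touching the others.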
Evaluating Performance
The FinOps agent was evaluated using several state-of-the-art language models, including proprietary models like GPT-4o and open-source models like Granite-3.1-8b. The evaluation framework measured various aspects of performance, including execution time, computational efficiency, planning accuracy, plan execution accuracy, data consolidation accuracy, recommendation accuracy, tool recognition latency, and task completion rate.
The results showed that models like GPT-4o and GPT-4o-mini demonstrated superior performance across most metrics, achieving high planning and data consolidation accuracy. They were also quick to recognize available tools, which proved to be a crucial early indicator of overall success. Interestingly, the study found that model size does not necessarily predict performance, suggesting that domain-specific training and architectural choices are more critical for specialized FinOps applications.
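Two of the metrics listed above can be sketched in Python to show what such an evaluation might compute. The definitions are assumptions, since this summary does not give the paper's exact formulas: planning accuracy is taken here as the share of runs whose plan exactly matches a reference plan, and task completion rate as the share of runs that finished. The run log is invented for illustration.

```python
# Invented run log; "expected" is a hypothetical reference plan per task.
runs = [
    {"plan": ["get_costs", "get_utilization"], "expected": ["get_costs", "get_utilization"], "completed": True},
    {"plan": ["get_costs"], "expected": ["get_costs", "get_utilization"], "completed": False},
    {"plan": ["get_utilization"], "expected": ["get_utilization"], "completed": True},
    {"plan": ["get_costs"], "expected": ["get_costs"], "completed": False},
]


def planning_accuracy(run_log):
    """Assumed definition: fraction of runs whose plan equals the reference plan."""
    return sum(r["plan"] == r["expected"] for r in run_log) / len(run_log)


def task_completion_rate(run_log):
    """Assumed definition: fraction of runs that ran to completion."""
    return sum(r["completed"] for r in run_log) / len(run_log)


print(planning_accuracy(runs), task_completion_rate(runs))
```

Tracking the two numbers separately is useful because, as the fourth run shows, a correct plan does not guarantee that the task completes.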
Conclusion
This research demonstrates the significant potential of autonomous FinOps agents in simulating real-life use cases for IT infrastructure and cost optimization. By combining advanced AI reasoning with a unified data access layer, these agents can understand complex requests, plan detailed steps, retrieve and consolidate data from various sources, and generate actionable recommendations, performing comparably to human FinOps practitioners. This work paves the way for more intelligent and efficient cloud financial management in the future.


