AI-Powered Design for Cloud Systems: LLMs and Simulators Optimize Distributed Architectures

TLDR: This research introduces an AI-driven method for designing policies in distributed cloud systems. It uses large language models (LLMs) to generate Python code for system policies, which are then evaluated by a domain-specific simulator. The simulator’s feedback helps the LLM refine its next policy generation in an iterative “generate-and-verify” loop. Using a Function-as-a-Service runtime (Bauplan) and its simulator (Eudoxia) as a case study, preliminary experiments show significant throughput improvements over a baseline FIFO scheduling policy, highlighting a new approach to scalable cloud optimization.

Optimizing large-scale distributed cloud systems, like those powering our favorite online services, has long been a complex challenge. Traditionally, experts manually craft intricate rules and policies to manage resources, schedule tasks, and ensure efficiency. However, these hand-coded solutions are often difficult to scale, adapt to new scenarios, and generalize across different customer needs.

A new research paper, titled “AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators,” explores a groundbreaking approach to this problem. The paper, authored by Jacopo Tagliabue, proposes leveraging the rapidly advancing capabilities of Artificial Intelligence, specifically Large Language Models (LLMs), to automatically generate and evolve these critical system policies. Instead of humans painstakingly writing every rule, AI can now propose and refine them, opening up a vast new design space for optimization.

The core of this innovative methodology is an iterative “generate-and-verify” loop. Imagine an AI that acts like a highly skilled, tireless engineer. First, an LLM, which is excellent at understanding and generating code, proposes a Python-based policy for a specific system challenge, such as how to schedule tasks in a Function-as-a-Service (FaaS) environment. This generated code is then fed into a deterministic simulator, a digital twin of the real system. The simulator evaluates the AI’s proposed policy against standardized scenarios and workloads, measuring key performance indicators like system throughput and latency.
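To make the loop concrete, here is a minimal sketch in Python. It is not the paper's implementation: the "LLM" is replaced by a hypothetical `propose_policy` function that samples from three hand-written candidates, and `toy_simulator` is an invented single-worker verifier, not Eudoxia. Only the shape of the loop (propose, simulate, keep the best, carry feedback forward) is taken from the description above.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical stand-ins: none of these names come from Bauplan or Eudoxia.
# A policy takes a list of job durations and returns the order to run them in.
Policy = Callable[[List[float]], List[float]]


def toy_simulator(policy: Policy, jobs: List[float], horizon: float) -> int:
    """Deterministic toy verifier: jobs completed on one worker before `horizon`."""
    clock, done = 0.0, 0
    for duration in policy(list(jobs)):
        clock += duration
        if clock > horizon:
            break
        done += 1
    return done


def propose_policy(rng: random.Random, feedback: str) -> Tuple[str, Policy]:
    """Stand-in for an LLM call. A real prompt would include `feedback`;
    this stub ignores it and samples one of three fixed candidates."""
    candidates = [
        ("fifo", lambda jobs: jobs),
        ("longest_first", lambda jobs: sorted(jobs, reverse=True)),
        ("shortest_first", lambda jobs: sorted(jobs)),
    ]
    return rng.choice(candidates)


def discovery_loop(jobs: List[float], horizon: float,
                   iterations: int = 50, seed: int = 0) -> Tuple[str, int]:
    """Generate-and-verify: propose, simulate, keep the best, pass feedback on."""
    rng = random.Random(seed)
    best_name, best_score, feedback = "none", -1, ""
    for _ in range(iterations):
        name, policy = propose_policy(rng, feedback)
        score = toy_simulator(policy, jobs, horizon)
        if score > best_score:
            best_name, best_score = name, score
        feedback = f"last={name} score={score}; best={best_name} score={best_score}"
    return best_name, best_score


if __name__ == "__main__":
    print(discovery_loop([8.0, 1.0, 2.0, 7.0, 1.0, 3.0], horizon=10.0))
```

Because the verifier is deterministic, the same candidate always gets the same score, so differences between runs come only from what the proposer samples.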

Crucially, the simulator doesn’t just run the policy; it provides structured feedback. If the generated code fails to run, for example because of a syntax error or because it violates the interface the simulator expects, that failure is reported back so the next attempt can respect those constraints. If the policy runs but performs poorly, the measured results are returned as well, giving the LLM a signal about what to change. This feedback loop allows the LLM to continuously refine its policy generations, learning from both successes and failures, much like a human engineer would iterate on a design. The generated policies also remain human-readable Python code, so the approach keeps interpretability while enabling AI-driven exploration of complex design spaces.
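One way to picture "structured feedback" is a small evaluator that turns each outcome into a dictionary. The sketch below is an assumption-laden illustration: the required `schedule(jobs)` entry point, the status strings, and the mean-completion-time metric are all hypothetical and are not taken from Eudoxia.

```python
from typing import Any, Dict, List


def evaluate_candidate(source: str, jobs: List[float]) -> Dict[str, Any]:
    """Run candidate policy source and return structured feedback.

    Hypothetical interface: `source` must define `schedule(jobs)` that
    returns a reordering of `jobs` (a list of job durations).
    """
    namespace: Dict[str, Any] = {}
    try:
        exec(compile(source, "<candidate>", "exec"), namespace)
    except SyntaxError as err:
        return {"status": "syntax_error", "detail": f"line {err.lineno}: {err.msg}"}
    except Exception as err:  # errors raised while defining the policy
        return {"status": "runtime_error", "detail": repr(err)}

    schedule = namespace.get("schedule")
    if not callable(schedule):
        return {"status": "interface_error", "detail": "no callable named 'schedule'"}

    try:
        order = list(schedule(list(jobs)))
    except Exception as err:  # errors raised while running the policy
        return {"status": "runtime_error", "detail": repr(err)}

    if sorted(order) != sorted(jobs):
        return {"status": "interface_error",
                "detail": "output is not a permutation of the input jobs"}

    # Metric: mean completion time when jobs run back to back on one worker.
    clock, total = 0.0, 0.0
    for duration in order:
        clock += duration
        total += clock
    mean = total / len(order) if order else 0.0
    return {"status": "ok", "mean_completion_time": mean}


if __name__ == "__main__":
    good = "def schedule(jobs):\n    return sorted(jobs)\n"
    print(evaluate_candidate(good, [3.0, 1.0, 2.0]))
    print(evaluate_candidate("def schedule(jobs) return jobs", [3.0, 1.0, 2.0]))
```

Calling `exec` on model-generated code is only reasonable inside a sandbox; a production setup would isolate the candidate in a separate process or container.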

The researchers used Bauplan, a Function-as-a-Service runtime, and its open-source simulator, Eudoxia, as a practical case study. Bauplan’s architecture, which handles diverse data workloads from interactive queries to long-running batch pipelines, presents a perfect testbed for AI-driven optimization due to its inherent complexity and varied demands. Eudoxia, the simulator, provides a controlled environment to model function arrival, resource allocation, and execution, making it an ideal verifier for machine-generated policies.
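The three ingredients named above (function arrival, resource allocation, execution) can be illustrated with a few lines of discrete-event simulation. This is a toy model written for this article, not Eudoxia's API: the `Invocation` type, the fixed worker pool, and the throughput definition (completed invocations divided by makespan) are all assumptions.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Invocation:
    arrival: float   # time at which the function request arrives
    duration: float  # execution time once it holds a worker


def simulate_fifo(invocations: List[Invocation], workers: int) -> float:
    """Throughput (completed invocations per time unit) under FIFO scheduling."""
    if not invocations or workers < 1:
        return 0.0
    # Min-heap holding the time at which each worker next becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    makespan = 0.0
    for inv in sorted(invocations, key=lambda i: i.arrival):  # arrival order
        worker_free = heapq.heappop(free_at)    # allocation: earliest free worker
        start = max(inv.arrival, worker_free)   # queue if no worker is free yet
        finish = start + inv.duration           # execution
        heapq.heappush(free_at, finish)
        makespan = max(makespan, finish)
    return len(invocations) / makespan if makespan > 0 else 0.0


if __name__ == "__main__":
    workload = [Invocation(0, 4), Invocation(1, 2), Invocation(2, 2), Invocation(3, 1)]
    print(simulate_fifo(workload, workers=2))  # 4 invocations finish by t=5
    print(simulate_fifo(workload, workers=1))  # 4 invocations finish by t=9
```

Since nothing in the model is random, the same workload and policy always produce the same number, which is the property that makes a simulator usable as a verifier.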

Preliminary experiments demonstrated promising results. By running the discovery loop for 50 iterations with various frontier LLMs (including Sonnet, Opus, GPT5, and GPT5-mini), the researchers observed significant improvements in throughput over a baseline FIFO (First-In, First-Out) scheduling policy. GPT5, for instance, achieved a 371.1% improvement over that baseline. The study also found that the LLMs differed widely in their ability to produce effective policies, indicating an active area for further research and development.
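For readers unsure how to interpret a figure above 100%, the arithmetic can be sketched as follows. The article does not state the exact formula, so this assumes the usual relative-gain definition, (candidate - baseline) / baseline; the function names are illustrative.

```python
def improvement_pct(baseline: float, candidate: float) -> float:
    """Relative gain of `candidate` over `baseline`, in percent (assumed definition)."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate - baseline) / baseline * 100.0


def ratio_from_improvement(pct: float) -> float:
    """Convert a percent improvement back into a candidate/baseline ratio."""
    return 1.0 + pct / 100.0


if __name__ == "__main__":
    # Under the assumed definition, a 371.1% improvement corresponds to
    # roughly 4.7 times the baseline throughput.
    print(round(ratio_from_improvement(371.1), 3))
    # A policy that doubles throughput is a 100% improvement.
    print(improvement_pct(baseline=10.0, candidate=20.0))
```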

Looking ahead, the research points to several exciting directions. Future work will focus on enhancing the robustness of the simulator to ensure its representativeness of real-world outcomes, improving the accuracy of policies through advanced prompt engineering and evolutionary computation, and extending the methodology to more general serverless systems. The paper also raises an intriguing question: could LLMs eventually help bootstrap new simulators themselves, further accelerating the scalability of this AI-driven design methodology?

This work represents a significant step towards a future where AI plays a fundamental role in co-designing and optimizing complex engineering systems. By combining the creative code-generation abilities of LLMs with the rigorous verification of simulators, we are entering a new era of scalable cloud optimization. You can read the full research paper here: AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators.
