TLDR: LEGOMem is a modular procedural memory framework for multi-agent Large Language Model (LLM) systems, aimed at workflow automation. It addresses the statelessness of LLM agents by decomposing past successful task executions into reusable memory units, which are flexibly allocated to both central orchestrators (for high-level planning) and individual task agents (for execution guidance). Experiments on the OfficeBench benchmark show that LEGOMem substantially boosts task success rates across LLM team configurations. Even smaller language models benefit, leveraging prior experience for better planning and tool use and, ultimately, more efficient and reliable task execution.
Large Language Models (LLMs) are becoming increasingly vital for automating complex, multi-step workflows, especially in productivity settings such as document editing, email management, and calendar scheduling. To handle the diversity and intricacy of these tasks, many systems now adopt multi-agent designs in which several LLM-based agents collaborate, specialize, or delegate responsibilities. This mirrors real-world work, which is inherently multi-agent and depends on coordinated decision-making across varied roles.
However, a significant limitation of current multi-agent systems is their stateless nature. Each task is typically solved from scratch, without learning from previous experiences. This absence of memory, particularly procedural memory—the knowledge of how to perform tasks—hinders their ability to improve over time. While some memory modules exist for single-agent LLMs, they don’t address the unique coordination and specialization challenges of multi-agent setups.
Introducing LEGOMem: A Memory Framework for LLM Teams
To bridge this gap, researchers have introduced LEGOMem, a modular procedural memory framework specifically designed for multi-agent LLM systems. LEGOMem focuses on a common architecture where a central orchestrator plans tasks and delegates subtasks to specialized tool-using agents. The goal is to equip both the orchestrator and individual task agents with memory derived from past successful task executions, leading to better planning, coordination, and task execution.
LEGOMem works by distilling successful task executions into structured memory units. These include ‘full-task memories’ (covering high-level plans and reasoning) and ‘subtask memories’ (detailing agent behavior and tool interactions). These modular memories are stored in a memory bank, indexed by semantic embeddings, and then reused when new tasks arise to enhance planning and execution.
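The paper doesn't ship a reference implementation, but the two memory unit types are easy to picture as simple records. The sketch below is illustrative; the field names are assumptions, not the authors' schema:

```python
from dataclasses import dataclass, field

@dataclass
class FullTaskMemory:
    """Orchestrator-level unit: what the task was and how it was planned.
    (Illustrative schema; field names are assumptions.)"""
    task_description: str   # natural-language task statement
    plan: list[str]         # ordered subtask descriptions
    reasoning: str          # high-level rationale behind the plan

@dataclass
class SubtaskMemory:
    """Agent-level unit: how one delegated subtask was executed."""
    subtask_description: str  # what the agent was asked to do
    agent_role: str           # e.g. "email_agent", "calendar_agent"
    tool_calls: list[dict] = field(default_factory=list)  # tool, args, observation
```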
How LEGOMem Operates
The framework operates in two main phases:
- Offline Memory Construction: Successful task trajectories are analyzed and converted into reusable memory units. Full-task memories capture the overall task description and plan, while subtask memories encapsulate specific agent actions, tool use, and observations. These are stored in a vector database (a sketch of both phases follows this list).
- Online Memory-Augmented Inference: When a new task is presented, LEGOMem retrieves relevant memories. The orchestrator receives full-task memories to guide its planning and agent selection, while each task agent is given subtask memories relevant to its delegated responsibilities. This allows orchestrators to leverage past solutions for informed planning and error recovery, and task agents to improve their accuracy and efficiency in using tools.
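To make the two phases concrete, here is a minimal sketch of a memory bank, assuming a sentence-embedding model and brute-force cosine similarity in place of the paper's vector database (both are stand-ins, as are the class and method names):

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # stand-in embedder

class MemoryBank:
    """Toy vector store: embeds memory keys, retrieves by cosine similarity."""

    def __init__(self):
        self.encoder = SentenceTransformer("all-MiniLM-L6-v2")
        self.keys = []         # indexing text (task or subtask description)
        self.values = []       # FullTaskMemory / SubtaskMemory units
        self.embeddings = None

    def add(self, key: str, memory) -> None:
        """Offline phase: index one memory unit by its description."""
        self.keys.append(key)
        self.values.append(memory)
        vec = self.encoder.encode([key], normalize_embeddings=True)
        self.embeddings = vec if self.embeddings is None else np.vstack([self.embeddings, vec])

    def retrieve(self, query: str, k: int = 3):
        """Online phase: return the k most similar memory units."""
        q = self.encoder.encode([query], normalize_embeddings=True)[0]
        scores = self.embeddings @ q  # cosine similarity: vectors are unit-norm
        return [self.values[i] for i in np.argsort(-scores)[:k]]
```

At inference time, the orchestrator would query a bank of full-task memories with the new task description, while each task agent queries a bank of subtask memories with its delegated subtask.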
LEGOMem also explores different memory retrieval strategies, including a ‘vanilla’ approach, ‘LEGOMem-Dynamic’ (which retrieves subtask memories dynamically during execution), and ‘LEGOMem-QueryRewrite’ (which uses an LLM to rewrite queries for more precise subtask memory retrieval before execution). These variants allow for a systematic study of how memory placement and retrieval affect multi-agent performance.
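The QueryRewrite variant can be sketched as a single pre-retrieval step: an LLM rephrases the orchestrator's subtask assignment into a query better matched to how stored subtask memories are described. The prompt wording, model choice, and use of the OpenAI client below are all illustrative assumptions:

```python
from openai import OpenAI  # any chat-capable LLM client would do here

client = OpenAI()

def rewrite_query(subtask: str, agent_role: str) -> str:
    """LEGOMem-QueryRewrite (sketch): rephrase a subtask into a retrieval
    query before searching the subtask memory bank."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[{
            "role": "user",
            "content": (
                f"Rewrite this subtask for a {agent_role} agent as a short "
                f"search query over past tool-use traces:\n{subtask}"
            ),
        }],
    )
    return resp.choices[0].message.content.strip()

# e.g.: subtask_bank.retrieve(rewrite_query("Forward the budget email", "email_agent"))
```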
Key Findings and Impact
Evaluations on the OfficeBench benchmark, which comprises multi-step office automation tasks, showed that LEGOMem variants consistently and significantly improved task success rates over memory-free systems and other baseline methods. The framework was tested with various team configurations: teams composed entirely of large LLMs, hybrid teams (a large-model orchestrator with smaller LLM agents), and teams of only smaller LLMs.
A crucial finding was that orchestrator memory is vital for effective high-level planning and task decomposition. Fine-grained agent memory, on the other hand, significantly improves execution accuracy, especially for smaller language models. This means that even teams made up of less powerful language models can substantially benefit from procedural memory, narrowing the performance gap with stronger agents by using prior execution traces for more accurate planning and tool use. Furthermore, LEGOMem led to a reduction in the number of execution steps required and a lower rate of failed steps, indicating more efficient and reliable task completion.
In essence, LEGOMem serves as both a practical framework for building memory-augmented agent systems and a valuable research tool for understanding memory design in multi-agent workflow automation. For more in-depth information, you can refer to the original research paper.