TLDR: Google AI Research has introduced ReasoningBank, an agent memory framework that lets Large Language Model (LLM) agents learn and self-evolve at test time. It converts an agent’s past interactions, both successes and failures, into reusable, high-level reasoning strategies stored as compact, human-readable memory items. Across several benchmarks this improves agent effectiveness and reduces interaction steps, addressing a common weakness of LLM agents: they fail to accumulate and reuse experience.
Google AI Research has announced ReasoningBank, a strategy-level agent memory framework for LLM agents. It lets agents learn from their own operational experience, including both successful and failed attempts, and self-evolve at test time without retraining.
The core innovation of ReasoningBank is that it converts an agent’s interaction traces into high-level, reusable reasoning strategies. Conventional memory systems often hoard raw logs or rigid workflows, which can be brittle and overlook valuable insights from failures. ReasoningBank instead reframes memory as compact, human-readable strategy items that are designed to transfer more easily across tasks and domains.
ReasoningBank operates as a simple loop: retrieve → inject → judge → distill → append. Each memory item has a title, a concise one-line description, and content detailing actionable principles such as heuristics, checks, and constraints. When the agent faces a new task, the system uses embedding-based retrieval to find the most relevant ‘top-k’ memory items and injects them as system guidance. After execution, the outcome is judged, new insights are distilled from the experience, and the resulting items are appended to the ReasoningBank, continuing the learning cycle.
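The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `MemoryItem`, `ReasoningBankSketch`, and `run_task` are hypothetical, the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the `agent`, `judge`, and `distill` callables are assumed to be supplied by the caller (in practice they would likely be LLM calls).

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class MemoryItem:
    title: str
    description: str  # concise one-line summary
    content: str      # actionable principles: heuristics, checks, constraints


def embed(text: str) -> Counter:
    # Assumption: toy bag-of-words vector standing in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ReasoningBankSketch:
    """Hypothetical in-memory store of strategy items."""

    def __init__(self):
        self.items = []

    def retrieve(self, task: str, k: int = 2):
        # Rank stored items by similarity to the task and keep the top k.
        query = embed(task)
        ranked = sorted(
            self.items,
            key=lambda m: cosine(query, embed(m.title + " " + m.description)),
            reverse=True,
        )
        return ranked[:k]

    def append(self, item: MemoryItem) -> None:
        self.items.append(item)


def run_task(bank, task, agent, judge, distill, k=2):
    guidance = bank.retrieve(task, k)               # retrieve
    trajectory = agent(task, guidance)              # inject guidance and act
    success = judge(task, trajectory)               # judge the outcome
    for item in distill(task, trajectory, success): # distill lessons
        bank.append(item)                           # append to memory
    return trajectory, success
```

Note that `distill` receives the success flag, so it can produce lessons from failed attempts as well as successful ones, which is the point the article makes about learning from failures.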
When coupled with memory-aware test-time scaling (MaTTS), ReasoningBank shows significant performance improvements. Reported results show up to a 34.2% relative gain in effectiveness and a 16% reduction in interaction steps across challenging web and software-engineering benchmarks. These figures compare favorably with previous memory designs that stored raw trajectories or only successful workflows.
ReasoningBank is designed as a plug-in memory layer, making it compatible with interactive agents that already use ReAct-style decision loops or best-of-N test-time scaling. It amplifies existing verifiers and planners by injecting distilled lessons at the prompt or system level. For web-based tasks, it complements tools like BrowserGym, WebArena, and Mind2Web, while for software-engineering tasks, it layers atop SWE-Bench-Verified setups. The framework is a step toward LLM agents that are more adaptive, more efficient, and capable of continuous learning in complex, multi-step environments.
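To make the plug-in idea concrete, here is a rough sketch of how distilled lessons might be injected into a system prompt and combined with best-of-N sampling. Everything in it is an assumption for illustration: the function names `build_system_prompt` and `best_of_n`, the prompt layout, and the `generate` and `score` callables (standing in for an agent rollout and a verifier) are hypothetical and not taken from the ReasoningBank release.

```python
from typing import Callable, Sequence, Tuple


def build_system_prompt(base_prompt: str, lessons: Sequence[Tuple[str, str]]) -> str:
    """Append (title, content) lessons to an existing system prompt."""
    if not lessons:
        return base_prompt
    lines = [base_prompt, "", "Lessons from past tasks:"]
    for i, (title, content) in enumerate(lessons, start=1):
        lines.append(f"{i}. {title}: {content}")
    return "\n".join(lines)


def best_of_n(
    task: str,
    system_prompt: str,
    generate: Callable[[str, str, int], str],  # hypothetical agent rollout
    score: Callable[[str], float],             # hypothetical verifier
    n: int = 3,
) -> str:
    # Sample n candidates under the same memory-augmented prompt,
    # then keep the one the verifier scores highest.
    candidates = [generate(system_prompt, task, i) for i in range(n)]
    return max(candidates, key=score)
```

Because the lessons only change the prompt text, the agent's existing decision loop and verifier can stay as they are, which is what makes a memory layer of this kind easy to bolt on.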


