TLDR: ACON (Agent Context Optimization) is a novel framework that tackles the challenge of ever-growing context in long-horizon LLM agents. It compresses interaction histories and environment observations into concise, informative summaries using a gradient-free, natural-language guideline optimization process. This cuts peak token usage by 26-54% and lowers computational cost while maintaining or improving task performance, and it enables smaller LLMs to act more effectively as agents; the compressor itself can also be distilled into a smaller model with little loss of accuracy.
Large Language Models (LLMs) are increasingly becoming the brains behind AI agents, enabling them to tackle complex, real-world tasks. However, as these agents interact with dynamic environments, they accumulate vast amounts of information – a long history of actions and observations. This ever-growing ‘context’ presents a significant challenge: it drives up computational costs, slows down operations, and can even distract the LLM with irrelevant details, hindering its performance on long-running tasks.
Traditional context compression methods, often designed for simpler, single-step tasks or narrow applications, fall short when dealing with the intricate, multi-step nature of agentic workflows. Recognizing this critical bottleneck, researchers have introduced a novel framework called Agent Context Optimization (ACON).
What is ACON?
ACON is a unified framework designed to optimally compress both the environment observations an agent receives and its entire interaction history. The goal is to distill this information into concise yet highly informative summaries, ensuring that critical details for task success are retained while extraneous data is discarded.
Instead of relying on rigid rules or hand-crafted prompts, ACON employs a clever ‘compression guideline optimization’ process. Imagine an LLM learning to summarize more effectively by analyzing its own mistakes. ACON does precisely this: it compares scenarios where an agent succeeds with a full, uncompressed context against instances where it fails with a compressed one. By analyzing these ‘failure trajectories,’ a capable LLM identifies what crucial information was lost or distorted during compression. This feedback is then used to refine and update the compression guidelines in natural language, making the compressor smarter and more adaptive.
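To make this loop concrete, here is a minimal Python sketch of what such guideline optimization might look like. Everything in it is an illustrative assumption rather than ACON's actual code: the `llm` callable, the `compress` and `optimize_guideline` functions, and the prompt wording are hypothetical stand-ins for the paper's prompts.

```python
# Minimal sketch of ACON-style guideline optimization. All names and prompts
# are illustrative assumptions, not the paper's actual implementation.
# `llm` is any text-in/text-out model call (open-source or API-based).

def compress(llm, guideline: str, context: str) -> str:
    """Summarize an agent context according to the current natural-language guideline."""
    return llm(
        f"Compression guideline:\n{guideline}\n\n"
        f"Compress the following agent context, preserving everything the "
        f"guideline marks as essential:\n{context}"
    )

def optimize_guideline(llm, guideline: str, paired_trajectories) -> str:
    """Refine the guideline from paired trajectories: one where the agent
    succeeded with full context, one where it failed with compressed context.
    Only the guideline text changes -- no gradients, no weight updates."""
    for full_traj, compressed_traj in paired_trajectories:
        # Ask a capable LLM what the compression lost or distorted.
        analysis = llm(
            "The agent succeeded with the full context but failed with the "
            "compressed one. Identify information that was lost or distorted "
            f"during compression.\n\nFull trajectory:\n{full_traj}\n\n"
            f"Compressed trajectory:\n{compressed_traj}"
        )
        # Fold that feedback back into the natural-language guideline.
        guideline = llm(
            "Revise this compression guideline so future summaries preserve "
            f"the information below.\n\nGuideline:\n{guideline}\n\n"
            f"Missing information:\n{analysis}"
        )
    return guideline
```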
A significant advantage of ACON is that it is ‘gradient-free’: refining the compressor means rewriting its natural-language guidelines, not updating any model weights. This makes it readily applicable to both open-source and proprietary, API-based LLMs.
Boosting Efficiency and Performance
The impact of ACON is substantial, as demonstrated across challenging multi-step agent benchmarks like AppWorld, OfficeBench, and Multi-objective QA. These benchmarks typically involve 15 or more interaction steps, pushing the limits of context management.
- Memory Reduction: ACON cuts peak token counts by 26-54% while largely preserving the agent’s task performance, so agents operate more efficiently with little loss of accuracy.
- Enhanced Smaller Models: Beyond just cost savings, optimized contexts actually improve decision quality. ACON has been shown to enhance smaller LLMs, allowing them to function more effectively as long-horizon agents, with performance improvements of up to 46%. This effectively acts as an ‘equalizer,’ enabling more compact models to approach the capabilities of their larger counterparts.
- Cost-Efficient Deployment: To reduce the overhead of the compression module itself, ACON allows the optimized LLM compressor to be ‘distilled’ into smaller models. This process preserves over 95% of the teacher model’s accuracy, making the compression module itself more lightweight and deployable (a rough sketch of this step follows this list).
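As a rough illustration of that distillation step, the sketch below turns the optimized teacher compressor's outputs into supervised fine-tuning data for a smaller student model. `teacher_llm`, `finetune`, and the dataset format are hypothetical assumptions, and `compress` is the helper from the earlier sketch.

```python
# Illustrative sketch of compressor distillation: the teacher's
# (context -> summary) pairs become supervised training data for a smaller
# student. `finetune` is a hypothetical stand-in for whatever SFT tooling
# you use; it is not part of ACON's released code.

def build_distillation_set(teacher_llm, guideline, raw_contexts):
    """Label each raw context with the teacher's guideline-conditioned summary."""
    return [
        {"input": ctx, "target": compress(teacher_llm, guideline, ctx)}
        for ctx in raw_contexts
    ]

# A student fine-tuned on these pairs imitates the optimized compressor;
# the paper reports students retain over 95% of the teacher's accuracy.
# student_compressor = finetune(small_base_model,
#                               build_distillation_set(teacher_llm, guideline, contexts))
```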
The framework supports two primary types of compression: history compression, which condenses the cumulative record of past actions and observations, and observation compression, which streamlines the latest information received from the environment.
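The sketch below shows where these two compressors might sit inside an agent loop, again under assumed interfaces: `env`, `agent_llm`, the length trigger, and the prompt format are all illustrative, and `compress` is the helper from the first sketch.

```python
# Assumed agent loop with both ACON compression points. `env`, `agent_llm`,
# and the length trigger are illustrative, not the benchmarks' real APIs.

def run_episode(env, agent_llm, compressor_llm, obs_guideline, hist_guideline,
                max_history_chars: int = 20_000):
    history = ""
    obs = env.reset()
    while not env.done():
        # Observation compression: streamline the latest environment output
        # before the agent conditions on it.
        obs = compress(compressor_llm, obs_guideline, obs)
        action = agent_llm(
            f"History:\n{history}\n\nObservation:\n{obs}\n\nNext action:"
        )
        history += f"\nAction: {action}\nObservation: {obs}"
        obs = env.step(action)
        # History compression: condense the cumulative record once it grows
        # past an (illustrative) length threshold.
        if len(history) > max_history_chars:
            history = compress(compressor_llm, hist_guideline, history)
```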
The Path Forward
While ACON marks a significant step towards more general, cost-effective, and deployable long-horizon LLM agents, the research acknowledges areas for future exploration. One limitation is the computational overhead introduced by the compressor module itself, which can sometimes increase total cost; moreover, because compression rewrites the context, it invalidates cached prefixes and undermines KV-cache reuse in transformers. Future work could explore KV-cache-level compression strategies to address this.
Nevertheless, ACON’s innovative approach to context management, leveraging natural language optimization and model distillation, lays a strong foundation for the next generation of intelligent, adaptive, and efficient AI agents. For more technical details, you can read the full research paper here.


