TLDR: The OpenHands Software Agent SDK is a new toolkit designed to simplify the development and deployment of production-ready AI software engineering agents. It features a modular architecture with optional sandboxing, stateless design, strict separation of concerns, and two-layer composability. Unique capabilities include native sandboxed execution, model-agnostic multi-LLM routing, built-in security analysis with action confirmation, and interactive workspace interfaces. The SDK demonstrates strong performance on SWE-Bench Verified and GAIA benchmarks, providing a reliable foundation for both research and industrial-scale AI agent deployments.
Building advanced AI agents for software development has long been a complex endeavor, often requiring intricate implementations, robust security, and flexible user interfaces. A new research paper introduces the OpenHands Software Agent SDK, a comprehensive toolkit designed to simplify the creation and deployment of production-ready software engineering agents.
The OpenHands Software Agent SDK emerges from the lessons learned from the popular OpenHands framework, which boasts over 64,000 GitHub stars. The original monolithic architecture, referred to as OpenHands V0, faced challenges with rigid sandboxing, complex configurations, and tight coupling between research and production components. This led to a complete architectural overhaul, resulting in OpenHands V1 and its foundational SDK.
A New Foundation for AI Agents
The SDK is built on four core design principles aimed at addressing the limitations of its predecessor:
- Optional Isolation: Agents run locally by default, offering flexibility, but can easily switch to a secure, sandboxed environment when needed for safety or resource control.
- Stateless by Default, One Source of Truth for State: All components like agents, tools, and LLMs are immutable. The conversation state is the single mutable entity, ensuring reliable recovery and deterministic replay of agent sessions.
- Strict Separation of Concerns: The agent core is decoupled from applications (like CLI, Web UI, GitHub App), allowing them to integrate via SDK APIs as a shared library, promoting independent evolution.
- Two-Layer Composability: Developers can combine independent deployment packages (SDK, Tools, Workspace, Server) and safely extend the SDK by adding or replacing typed components.
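The "stateless by default" principle can be illustrated with a small Python sketch. This is not the SDK's actual API: `Event`, `AgentConfig`, `ConversationState`, `replay`, and `summarize` are hypothetical names invented for this example. It assumes only what the principle states, that components are immutable and that an append-only conversation log is the single mutable piece of state.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical immutable record of one interaction."""
    kind: str      # e.g. "message", "action", "observation"
    payload: str


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical immutable component (stands in for agent/LLM/tool config)."""
    model: str
    tools: Tuple[str, ...]


@dataclass
class ConversationState:
    """The only mutable object in this sketch: an append-only event log."""
    events: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        self.events.append(event)


def replay(events: Iterable[Event]) -> ConversationState:
    """Rebuild a state from a stored log by re-appending each event in order."""
    state = ConversationState()
    for event in events:
        state.append(event)
    return state


def summarize(state: ConversationState) -> Dict[str, int]:
    """Derive a view (event counts by kind) purely from the log."""
    counts: Dict[str, int] = {}
    for event in state.events:
        counts[event.kind] = counts.get(event.kind, 0) + 1
    return counts


# Illustrative values only; the model name is a placeholder.
config = AgentConfig(model="example-model", tools=("bash", "editor"))

live = ConversationState()
live.append(Event("message", "fix the failing test"))
live.append(Event("action", "run pytest"))
live.append(Event("observation", "1 failed"))

# Simulate recovery: a fresh state rebuilt from the persisted log.
restored = replay(tuple(live.events))
```

Because every derived view is computed from the log alone, a state rebuilt by `replay` is indistinguishable from the original, which is the property that makes recovery and deterministic replay possible in a design like this.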
Modular Architecture for Flexibility
The OpenHands Software Agent SDK is organized into four distinct Python packages, allowing for flexible composition based on deployment needs:
- openhands.sdk: Contains core abstractions such as Agent, Conversation, LLM, Tool, and the event system.
- openhands.tools: Provides concrete implementations of tools based on the SDK’s abstractions.
- openhands.workspace: Offers execution environments like Docker or hosted APIs, extending the SDK’s base classes.
- openhands.agent_server: A web server exposing REST and WebSocket APIs for remote execution.
This modularity enables independent testing, selective dependency management, and faster release cycles, which are crucial for production deployments.
Unique Features and Capabilities
Compared with the existing agent SDKs from OpenAI, Anthropic (Claude), and Google, the OpenHands SDK offers several distinguishing features:
- Native Sandboxed Execution: Seamless local-to-remote execution portability with integrated REST/WebSocket services and secure containerized environments.
- Model-Agnostic Multi-LLM Routing: Supports over 100 LLM providers and allows agents to use different models for various requests, optimizing for cost or specific capabilities. It also supports non-function-calling models by converting tool schemas to text-based prompts.
- Built-in Security Analysis and Confirmation Policies: The SDK includes a SecurityAnalyzer that rates tool calls by risk (low, medium, high) and a ConfirmationPolicy that determines if user approval is required before execution. This adds a crucial layer of safety for agents performing potentially risky actions.
- Interactive Workspace Interfaces: Offers a browser-based VSCode IDE, VNC desktop, and a persistent Chromium browser for human inspection and control, transforming agents from black boxes into observable, interactive systems.
- Event-Sourced State Management: All interactions are treated as immutable events appended to a log, enabling deterministic replay and reliable recovery of agent sessions.
- Context Window Management: A Condenser system automatically summarizes conversation history to fit within LLM context limits, reducing API costs without degrading performance.
- Secrets Management with Auto-Masking: Automatically detects and masks sensitive credentials in logs and LLM context, preventing accidental exposure.
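The interplay between risk rating and confirmation described in the list above can be sketched in a few lines of Python. This is a toy model, not the SDK's SecurityAnalyzer or ConfirmationPolicy: `Risk`, `ToolCall`, `rate_risk`, `ConfirmAtOrAbove`, and `gate` are hypothetical names, and the keyword heuristic is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    """The three risk levels named in the article, ordered for comparison."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical description of an action the agent wants to take."""
    tool: str
    command: str


def rate_risk(call: ToolCall) -> Risk:
    """Toy analyzer: a keyword heuristic, assumed here for illustration only."""
    text = call.command
    if any(marker in text for marker in ("rm -rf", "sudo ", "mkfs")):
        return Risk.HIGH
    if any(marker in text for marker in ("pip install", "git push", "curl ")):
        return Risk.MEDIUM
    return Risk.LOW


@dataclass(frozen=True)
class ConfirmAtOrAbove:
    """Toy policy: ask the user when the risk meets or exceeds a threshold."""
    threshold: Risk

    def needs_confirmation(self, risk: Risk) -> bool:
        return risk >= self.threshold


def gate(call: ToolCall, policy: ConfirmAtOrAbove) -> str:
    """Decide whether to run the call directly or pause for user approval."""
    risk = rate_risk(call)
    return "ask_user" if policy.needs_confirmation(risk) else "execute"


cautious = ConfirmAtOrAbove(Risk.MEDIUM)
permissive = ConfirmAtOrAbove(Risk.HIGH)

listing = gate(ToolCall("bash", "ls -la"), cautious)
deletion = gate(ToolCall("bash", "rm -rf build/"), cautious)
push = gate(ToolCall("bash", "git push origin main"), permissive)
```

Keeping the analyzer and the policy separate means the threshold can change per deployment without touching the risk logic: the same `git push` call that runs unattended under the permissive policy would pause for approval under the cautious one.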
Performance and Reliability
The OpenHands Software Agent SDK demonstrates strong performance on key benchmarks. Using Claude Sonnet 4.5, it achieves a 72.8% resolution rate on SWE-Bench Verified, which measures an agent’s ability to resolve software engineering tasks, and 67.9% accuracy on GAIA, a benchmark for general AI assistant capabilities. According to the paper, these results validate the SDK’s architecture and its ability to stay competitive across diverse model backends.
The SDK also employs a robust three-tier testing strategy, including programmatic tests, frequent LLM-based integration tests, and built-in academic benchmark evaluations, ensuring continuous quality assurance and reliability.
The OpenHands Software Agent SDK represents a significant step toward making AI software engineering agents more practical and deployable, bridging the gap between rapid prototyping and production-scale deployments. For more detail, see the full research paper.


