TLDR: ResearStudio is an open-source framework that introduces real-time human intervention into deep-research agents, moving beyond the traditional “fire-and-forget” model. It features a “Collaborative Workshop” design with a transparent Planner–Executor architecture, allowing users to pause, edit plans or code, and resume execution at any point. This enables seamless switching between AI-led and human-led workflows. The framework achieves state-of-the-art performance on the GAIA benchmark in autonomous mode, demonstrating that human control can coexist with high capability. ResearStudio also incorporates robust safety measures through sandboxing and interactive oversight, aiming to make AI research agents more controllable, reliable, and trustworthy.
A new open-source framework called ResearStudio has been introduced, aiming to transform how deep-research agents operate by integrating real-time human control. Traditionally, these advanced AI systems, once initiated, run in a “fire-and-forget” mode, offering no avenues for users to correct errors or inject expert knowledge during their execution. ResearStudio addresses this critical gap by placing human intervention at its core, allowing for a more collaborative and trustworthy AI research process.
The framework is built on a “Collaborative Workshop” design, featuring a hierarchical Planner–Executor architecture. This system meticulously records every step into a live “plan-as-document” and uses a rapid communication layer to stream every action, file change, and tool call to a web interface. This transparency means users can monitor the agent’s progress in real-time.
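The "plan-as-document" idea can be pictured with a small sketch. This is not ResearStudio's actual data model: the class names, the status values, and the event fields below are all hypothetical, chosen only to show how a plan can be both readable text and a source of streamed updates.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlanStep:
    description: str
    status: str = "pending"  # assumed states: pending | running | done


@dataclass
class PlanDocument:
    """Hypothetical plan-as-document: structured state that renders as text."""
    goal: str
    steps: List[PlanStep] = field(default_factory=list)
    # Listeners stand in for the communication layer that feeds the web UI.
    listeners: List[Callable[[dict], None]] = field(default_factory=list)

    def emit(self, kind: str, **payload) -> None:
        """Send one event to every registered listener."""
        event = {"type": kind, **payload}
        for listener in self.listeners:
            listener(event)

    def set_status(self, index: int, status: str) -> None:
        """Update a step and stream the change."""
        self.steps[index].status = status
        self.emit("step_status", index=index, status=status)

    def render(self) -> str:
        """Render the plan as a human-readable checklist."""
        marks = {"pending": " ", "running": "~", "done": "x"}
        lines = [f"Goal: {self.goal}"]
        for number, step in enumerate(self.steps, start=1):
            lines.append(f"{number}. [{marks[step.status]}] {step.description}")
        return "\n".join(lines)


if __name__ == "__main__":
    stream: List[dict] = []
    plan = PlanDocument("Summarize a paper",
                        [PlanStep("Search for the paper"), PlanStep("Write summary")])
    plan.listeners.append(stream.append)  # a real UI would subscribe here
    plan.set_status(0, "done")
    print(plan.render())
    print(stream)
```

Because every status change goes through `set_status`, the rendered document and the event stream cannot drift apart, which is the property the transparency claim depends on.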
What sets ResearStudio apart is its dynamic control capabilities. At any point, a user can pause the agent’s operation, modify the plan or code, execute custom commands, and then resume the process. This flexibility enables seamless transitions between two modes: AI-led with human assistance, and human-led with AI assistance. In the first, the AI drives the workflow while humans audit, refine, and contribute their domain expertise. In the second, users orchestrate the high-level strategy and delegate specific subtasks to the AI.
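The pause, edit, and resume cycle described above can be sketched as a step-wise executor that checks a pause flag between steps. This is a minimal illustration under stated assumptions, not the framework's implementation: `PausableExecutor`, `edit_step`, and the string-based plan are hypothetical names invented for this example.

```python
from typing import Callable, List


class PausableExecutor:
    """Hypothetical executor that can be paused between plan steps."""

    def __init__(self, plan: List[str], run_step: Callable[[str], str]):
        self.plan = list(plan)
        self.run_step = run_step
        self.cursor = 0          # index of the next step to execute
        self.paused = False
        self.log: List[str] = []

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def edit_step(self, index: int, new_text: str) -> None:
        """Let the human rewrite a step that has not run yet."""
        if not self.paused:
            raise RuntimeError("pause the executor before editing the plan")
        if index < self.cursor:
            raise ValueError("that step has already been executed")
        self.plan[index] = new_text

    def step(self) -> bool:
        """Run one step; return False if paused or finished."""
        if self.paused or self.cursor >= len(self.plan):
            return False
        self.log.append(self.run_step(self.plan[self.cursor]))
        self.cursor += 1
        return True

    def run(self) -> None:
        """Run until the plan is finished or a pause is requested."""
        while self.step():
            pass


if __name__ == "__main__":
    executor = PausableExecutor(
        ["search the web", "plot with wrong axis", "write report"],
        run_step=lambda text: f"ran: {text}",
    )
    executor.step()                                # AI-led: first step runs
    executor.pause()                               # human spots a flawed step
    executor.edit_step(1, "plot with log axis")    # human corrects the plan
    executor.resume()
    executor.run()                                 # AI continues with the fix
    print(executor.log)
```

Restricting edits to steps at or after the cursor keeps the log an honest record of what actually ran.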
ResearStudio’s architecture is composed of three layers. The L-2 Agent Core, comprising a Planner and an Executor, processes user requests. The Executor utilizes tools from the L-1 MCP Toolbox, which includes functionalities for document processing, searching, and code execution. The entire process is made accessible to the user through the L-3 WebPage, connected by a central Communication Protocol. The system deliberately omits full browser automation by default to maintain speed and data integrity, though it can be enabled if needed.
The framework’s collaborative features are powered by a dual-layered communication system. The Model Context Protocol (MCP) standardizes the Executor’s tool calls into reliable, JSON-based functions. More importantly, an event-driven protocol supports the human-agent partnership by translating user actions in the interface into specific API calls. This enables workflows like “Change Files” and “Pause/Resume,” turning the system into an interactive and auditable workshop.
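To make the two layers concrete, here is a hedged sketch of each. It does not reproduce the MCP wire format or ResearStudio's event schema: the message fields (`tool`, `arguments`, `type`, `path`, `content`), the `word_count` tool, and the `Session` class are assumptions made purely for illustration.

```python
import json
from typing import Any, Callable, Dict

# Layer 1 (illustrative): tools exposed as named, JSON-callable functions.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Register a function under a tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("word_count")
def word_count(text: str) -> int:
    return len(text.split())


def call_tool(message: str) -> str:
    """Decode a JSON tool call, run the tool, and return a JSON result."""
    request = json.loads(message)
    fn = TOOLS.get(request["tool"])
    if fn is None:
        return json.dumps({"ok": False, "error": f"unknown tool: {request['tool']}"})
    result = fn(**request.get("arguments", {}))
    return json.dumps({"ok": True, "result": result})


# Layer 2 (illustrative): user actions arrive as events and update the session.
class Session:
    def __init__(self) -> None:
        self.paused = False
        self.files: Dict[str, str] = {}

    def handle_event(self, event: dict) -> None:
        kind = event["type"]
        if kind == "pause":
            self.paused = True
        elif kind == "resume":
            self.paused = False
        elif kind == "change_file":
            self.files[event["path"]] = event["content"]
        else:
            raise ValueError(f"unsupported event type: {kind}")


if __name__ == "__main__":
    print(call_tool(json.dumps({"tool": "word_count",
                                "arguments": {"text": "pause edit resume"}})))
    session = Session()
    session.handle_event({"type": "pause"})
    session.handle_event({"type": "change_file", "path": "plan.md",
                          "content": "1. revised step"})
    session.handle_event({"type": "resume"})
    print(session.paused, session.files)
```

Keeping tool calls and user events as plain JSON-shaped messages means both can be logged in the same stream, which is one plausible way to get the auditability the article describes.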
Despite its focus on human collaboration, ResearStudio also performs strongly when run autonomously. On GAIA, a standard benchmark for general-purpose agentic systems, it achieved state-of-the-art results in fully autonomous mode, surpassing other leading systems including OpenAI’s DeepResearch. This suggests that fine-grained human control need not come at the cost of automated capability. On the GAIA validation set, it achieved an average score of 70.91%, outperforming A-World and OpenAI-DeepResearch; on the GAIA test set, its overall average score was 74.09%.
The user interface of ResearStudio is designed for complete situational awareness, featuring multi-panel displays for conversation logs, file explorers, and editors. Users can monitor execution steps, inspect generated files, and intervene directly if they observe flawed code or incorrect strategies. This turns potentially catastrophic failures into manageable, correctable mistakes, enhancing the trustworthiness and reliability of AI agents.
Safety is a paramount concern, addressed through architectural design and interactive oversight. Each task operates within a unique, sandboxed workspace, isolating it from the host system and other tasks to prevent data exfiltration or unauthorized access. Tool operations are mediated through an abstraction layer, preventing direct system calls and limiting command-line capabilities. Additionally, the “Plan-as-Document” principle provides a critical checkpoint for users to review proposed actions and intervene against unsafe strategies or prompt injections. User prompts are also evaluated against safety policies to filter disallowed content.
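One small piece of the workspace isolation described above, confining file access to a per-task directory, can be sketched as follows. This shows path confinement only and is an assumption about how such a check might look; the `Workspace` class is hypothetical, and real sandboxing would also need process-level and network-level isolation that this example does not attempt.

```python
import tempfile
from pathlib import Path


class Workspace:
    """Hypothetical per-task workspace that rejects paths outside its root."""

    def __init__(self, root: Path):
        self.root = root.resolve()

    def resolve(self, relative: str) -> Path:
        """Map a user- or agent-supplied path to a location inside the root."""
        candidate = (self.root / relative).resolve()
        # Reject anything that is neither the root itself nor beneath it,
        # which covers "../" traversal and absolute paths.
        if candidate != self.root and self.root not in candidate.parents:
            raise PermissionError(f"path escapes workspace: {relative}")
        return candidate

    def write_text(self, relative: str, content: str) -> Path:
        """Write a file, creating parent directories inside the workspace."""
        target = self.resolve(relative)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as task_dir:
        workspace = Workspace(Path(task_dir))
        workspace.write_text("notes/summary.txt", "draft")
        try:
            workspace.resolve("../other_task/secrets.txt")
        except PermissionError as error:
            print("blocked:", error)
```

Routing every file tool through a single `resolve` check is one way an abstraction layer can mediate tool operations instead of letting them touch the host file system directly.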
While ResearStudio marks a significant step forward, the authors acknowledge limitations, including the reliance on user domain expertise for effective collaboration and the cognitive demand of continuous monitoring. Future work aims to develop semi-autonomous intervention mechanisms, conduct formal Human-Computer Interaction studies, and rigorously stress-test safety measures against adversarial attacks. The full code, protocol, and evaluation scripts are available at https://github.com/ResearAI/ResearStudio, encouraging further development in safe and controllable research agents.


