TLDR: The research paper introduces the “Sandbox Configurator,” a modular, open-source framework designed to enhance the technical assessment of AI systems within the EU’s AI Regulatory Sandboxes (AIRS). It addresses the fragmentation and lack of standardization in current AI assessment methods by enabling users to create customized, auditable sandbox environments. The framework supports both regulatory guidance (Core AIRS) and in-depth technical testing (Extended AIRS), providing features like visual pipelines, tailored dashboards, and automated reporting. Its plug-in architecture and deployment flexibility aim to foster a harmonized, interoperable ecosystem for trustworthy AI governance across Europe, integrating with existing EU digital and AI infrastructures.
The rapid integration of Artificial Intelligence (AI) systems into critical sectors like healthcare, finance, and public administration has brought about an urgent need for robust and systematic assessment. Recognizing this, the European Union’s Artificial Intelligence Act (AI Act) has introduced AI Regulatory Sandboxes (AIRS). These supervised environments are designed to allow AI systems to be tested under the watchful eye of Competent Authorities (CAs), aiming to strike a balance between fostering innovation, especially for startups and SMEs, and ensuring compliance with regulations.
However, the journey to effective AI assessment is not without its hurdles. Current assessment methods are often fragmented, testing procedures lack standardization, and the feedback loops between AI developers and regulators are frequently weak. To address these significant challenges, a new framework called the Sandbox Configurator has been proposed. This modular, open-source framework is designed to empower users to select relevant tests from a shared library and generate customized sandbox environments, complete with integrated dashboards for tracking progress.
What is the Sandbox Configurator?
At its core, the Sandbox Configurator is a flexible, open-source framework built with a plug-in architecture. This design allows for the integration of both open and proprietary modules, fostering a collaborative ecosystem of interoperable AI assessment services. The framework aims to serve multiple stakeholders:
- Competent Authorities (CAs): It provides structured workflows to help CAs apply legal obligations effectively.
- Technical Experts: It offers a platform to integrate robust evaluation methods into the assessment process.
- AI Providers (especially SMEs): It gives them a transparent pathway to compliance, facilitating experimentation and regulator-guided feedback.
- Civil Society: It promotes transparency, explainability, and auditability in AI systems.
The EU AI Act mandates that all Member States implement operational sandboxes by August 2026. These sandboxes are voluntary and must be offered free of charge for core activities, particularly to startups and SMEs, to democratize AI development and reduce compliance costs.
Core AIRS vs. Extended AIRS
The paper distinguishes between two modalities of engagement within the single AIRS framework:
- Core AIRS: This pathway focuses on legal and procedural oversight, primarily concerning risk classification, conformity pathways, and regulatory guidance. Here, the CA verifies the internal controls that AI providers have established to comply with the AI Act.
- Extended AIRS: Building upon the Core AIRS, this modality embeds structured technical testing and hands-on evaluation of aspects like robustness, accuracy, bias, and cybersecurity. This is where the Sandbox Configurator truly shines, enabling the creation of an AI Technical Sandbox (AITS) for in-depth technical assessments.
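To make the Extended AIRS idea concrete, here is a minimal sketch of the kind of metric an AITS testing module might compute. The paper does not prescribe specific metrics or code; the function names, data, and the choice of demographic parity as the bias measure are illustrative assumptions only.

```python
# Illustrative sketch of metrics an Extended AIRS technical module might
# compute: plain accuracy, plus a simple bias measure (demographic parity
# gap). All names and data here are hypothetical, not from the paper.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def demographic_parity_gap(y_pred, groups):
    """Largest absolute difference in positive-prediction rates across groups."""
    rates = {}
    for g in set(groups):
        preds = [p for p, grp in zip(y_pred, groups) if grp == g]
        rates[g] = sum(preds) / len(preds)
    vals = sorted(rates.values())
    return vals[-1] - vals[0]

# Toy evaluation data: binary labels, predictions, and a group attribute.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")                      # 0.75
print(f"parity gap: {demographic_parity_gap(y_pred, groups):.2f}")      # 0.50
```

In a real Extended AIRS engagement, such metrics would be contributed as plug-in modules and selected per assessment, rather than hard-coded like this.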
How the Sandbox Configurator Works
The Sandbox Configurator acts as a meta-orchestration layer. It formalizes sandbox configurations using a Domain-Specific Language (DSL), which serves as a contract between stakeholders. Regulators can encode compliance requirements, providers can express testing objectives, and experts can contribute domain-specific modules. This DSL allows for the dynamic assembly of modular building blocks into operational pipelines, ensuring alignment with both technical requirements and regulatory constraints.
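The paper does not publish the DSL's syntax, so the following is a hypothetical sketch of the underlying idea: a declarative configuration (the stakeholder "contract") is resolved against a shared registry of test modules and assembled into an executable pipeline. The registry contents, config keys, and module names below are all invented for illustration.

```python
# Hypothetical sketch of DSL-style meta-orchestration: a declarative
# configuration is resolved into an operational pipeline. The paper's
# actual DSL is not specified here; every name below is illustrative.

# Shared library of modular building blocks (open or proprietary plug-ins).
MODULE_REGISTRY = {
    "robustness_check": lambda model: {"test": "robustness_check", "passed": True},
    "bias_audit": lambda model: {"test": "bias_audit", "passed": True},
}

# The "contract": regulators encode required tests, providers state the
# engagement modality and objectives.
sandbox_config = {
    "sandbox_id": "airs-demo-001",
    "modality": "extended",  # Core AIRS vs. Extended AIRS
    "required_tests": ["robustness_check", "bias_audit"],
}

def build_pipeline(config):
    """Resolve declared test names into executable pipeline stages,
    rejecting anything not present in the shared module library."""
    unknown = [t for t in config["required_tests"] if t not in MODULE_REGISTRY]
    if unknown:
        raise ValueError(f"unknown modules: {unknown}")
    return [MODULE_REGISTRY[t] for t in config["required_tests"]]

def run_pipeline(pipeline, model=None):
    """Execute each stage and collect machine-readable results."""
    return [stage(model) for stage in pipeline]

for result in run_pipeline(build_pipeline(sandbox_config)):
    print(result["test"], "passed" if result["passed"] else "failed")
```

The design point this illustrates is separation of concerns: stakeholders edit only the declarative configuration, while the orchestration layer validates it against the module library before anything runs.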
Key features and capabilities of the Configurator include:
- Customizability and Modularity: It allows for ‘à la carte’ configuration of tests, metrics, and pipelines, tailoring each sandbox engagement precisely to the AI system’s needs.
- Compatibility with External Solutions: It can integrate with external catalogues of assessment tools, benchmarks, and datasets.
- Visual Pipelines and Tailored Dashboards: It offers low-code, drag-and-drop interfaces for composing and monitoring test pipelines, along with role-specific, real-time dashboards for different stakeholders.
- Automated Report Generation and Immutable Audit Trail: It can automatically generate machine- and human-readable technical reports and maintains a tamper-evident log of all data, code versions, parameters, and results, crucial for regulatory compliance and auditability.
- Deployment Portability: Sandbox instances can run seamlessly on-premises, in sovereign clouds, or across federated High Performance Computing (HPC) nodes, ensuring consistent results regardless of the underlying infrastructure.
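The paper does not describe how the tamper-evident audit trail is implemented; one standard technique for such logs is hash chaining, where each entry commits to the hash of the previous one, so altering any earlier record invalidates every subsequent hash. The sketch below illustrates that technique with invented record fields.

```python
# Illustrative hash-chained log: each entry stores the previous entry's
# SHA-256 hash, making retroactive edits detectable. This is one common
# way to build a tamper-evident trail, not the Configurator's documented
# mechanism; record fields here are hypothetical.
import hashlib
import json

GENESIS = "0" * 64  # sentinel "previous hash" for the first entry

def append_entry(log, record):
    """Append a record, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    log.append({"record": record, "prev": prev_hash,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})
    return log

def verify(log):
    """Recompute every hash in order; any tampering breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        payload = json.dumps({"record": entry["record"], "prev": prev_hash},
                             sort_keys=True)
        if entry["prev"] != prev_hash or \
           hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"step": "bias_audit", "code_version": "abc123", "result": "pass"})
append_entry(log, {"step": "report", "artifact": "report-v1.pdf"})
print("chain valid:", verify(log))    # True
log[0]["record"]["result"] = "fail"   # tamper with an earlier entry
print("chain valid:", verify(log))    # False
```

Because verification only needs the log itself, a regulator can independently re-check the full trail of data, code versions, parameters, and results without trusting the party that produced it.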
Towards Harmonized European Sandboxing
Beyond supporting individual Member States, the Sandbox Configurator holds significant potential for fostering long-term convergence and harmonization of AI sandboxing practices across Europe. By providing a shared technical backbone, it allows Member States to tailor sandbox deployments to their unique contexts while adhering to a common technical grammar. This facilitates the replication, comparison, and iterative improvement of practices across jurisdictions.
The framework is designed to integrate with existing European initiatives such as European Digital Innovation Hubs (EDIHs), Testing and Experimentation Facilities (TEFs), and AI Factories. These initiatives can provide local support, specialized technical expertise, and access to high-performance computing infrastructure for large-scale testing, further strengthening a federated network of interoperable sandboxes.
In conclusion, the Sandbox Configurator represents a crucial step towards operationalizing the EU AI Act. It provides a practical, modular, and open-source framework to bridge the gap between high-level regulatory obligations and executable assessments, ensuring that AI innovation proceeds hand-in-hand with trustworthiness and accountability across Europe. For more detailed information, you can refer to the full research paper.


