TLDR: The CERTAIN project introduces a comprehensive framework to ensure AI systems in Europe are ethical, transparent, and compliant with regulations. It integrates semantic MLOps for structured AI lifecycle management, ontology-driven data lineage for traceability, and RegOps workflows to operationalize compliance. The framework aims to automate validation against EU laws like the AI Act and GDPR, fostering responsible AI innovation through pilot implementations in various sectors.
The rapid growth of Artificial Intelligence (AI) has made it a foundational technology, profoundly impacting Europe’s society and economy. However, this widespread adoption also brings significant ethical, legal, and regulatory challenges. The CERTAIN (Certification for Ethical and Regulatory Transparency in Artificial Intelligence) project is specifically designed to tackle these issues by developing a comprehensive framework that integrates regulatory compliance, ethical standards, and transparency into AI systems.
This initiative outlines a methodological approach for building the core components of this framework. It focuses on three key areas: semantic Machine Learning Operations (MLOps) for structured AI lifecycle management, ontology-driven data lineage tracking to ensure traceability and accountability, and regulatory operations (RegOps) workflows to operationalize compliance requirements. By implementing and validating its solutions across diverse pilot projects, CERTAIN aims to advance regulatory compliance and promote responsible AI innovation aligned with European standards.
Project Objectives and Components
The CERTAIN project seeks to create a cohesive and compliant ecosystem for AI stakeholders, involving 19 partners from both academia and industry across multiple European countries. Its main objectives include:
- Ensuring traceability and transparency of AI systems using advanced semantic technologies.
- Producing multidisciplinary legal, ethical, and social guidelines to support compliance.
- Designing and demonstrating tools for data space providers to ensure regulatory compliance and minimize energy consumption.
- Developing realistic test methods and synthetic data generation techniques to assess and improve AI system compliance.
- Creating templates for AI certification processes tailored to various application domains.
Two critical components underpin CERTAIN’s technical and regulatory compliance infrastructure: Semantic MLOps and Ontology Development, and Infrastructure and Compliance Mechanisms.
Semantic MLOps and Ontology Development
A Semantic MLOps Engine is designed to systematically capture detailed metadata across all stages of the ML lifecycle, including data preprocessing, feature engineering, model training, evaluation, and deployment. The engine also tracks data governance, lineage, and explainability artifacts necessary for regulatory compliance, reproducibility, and continuous verification against certification standards. In addition, it collects metadata on resource utilization, such as energy consumption, to monitor environmental impacts and promote digital sustainability.
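The paper describes what the engine captures, not its schema. As a rough illustration (all field and artifact names invented), a per-stage metadata record covering lineage inputs/outputs, parameters, and energy use might look like:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical sketch: the actual CERTAIN metadata schema is not public.
@dataclass
class StageRecord:
    stage: str                       # e.g. "training", "evaluation", "deployment"
    inputs: List[str]                # identifiers of consumed artifacts
    outputs: List[str]               # identifiers of produced artifacts
    params: Dict[str, str] = field(default_factory=dict)
    energy_kwh: float = 0.0          # resource use, for sustainability reporting

records = [
    StageRecord("preprocessing", ["dataset:raw-v1"], ["dataset:clean-v1"]),
    StageRecord("training", ["dataset:clean-v1"], ["model:v1"],
                params={"lr": "0.001"}, energy_kwh=12.4),
]

# Total energy across the lifecycle, one input to an environmental-impact report.
total_energy = sum(r.energy_kwh for r in records)
print(total_energy)
```

Records of this shape are what a downstream RegOps component could query when verifying a system against certification criteria.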
Functioning as the “brain” of the system, the Semantic MLOps Engine incorporates legal, ethical, and societal guidelines. It orchestrates end-to-end workflows and integrates with various components to form a certifiable ecosystem. Artifacts are continuously tracked and transformed to be queryable by the RegOps Engine, which validates them against certification criteria and enables the generation of certification reports.
To systematically capture, trace, and semantically describe the stages of the AI lifecycle, the project adopts a principled ontology engineering methodology. These ontologies are driven by competency questions derived from legal obligations, such as the EU AI Act, technical standards, and real-world AI engineering practices. They build upon and extend established vocabularies to enable interoperable descriptions of lifecycle elements like data sourcing, model development, and evaluation metrics.
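The ontologies themselves are still in draft, so the following is only an illustrative sketch (predicate and artifact names invented, loosely in the spirit of PROV and ML vocabularies). A competency question such as "Which datasets was model X trained on?" reduces to a pattern match over lifecycle triples:

```python
# Hypothetical triples; these do NOT reflect the actual CERTAIN ontology.
triples = [
    ("model:v1", "trainedOn", "dataset:clean-v1"),
    ("dataset:clean-v1", "derivedFrom", "dataset:raw-v1"),
    ("model:v1", "evaluatedWith", "metric:accuracy"),
]

def answer_cq(subject, predicate):
    """Competency question as a triple pattern: ?o where (subject, predicate, ?o)."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# CQ: "Which datasets was model:v1 trained on?"
print(answer_cq("model:v1", "trainedOn"))  # ['dataset:clean-v1']
```

In practice such questions would be posed in SPARQL over a Knowledge Graph, but the principle is the same: each legal or technical competency question must be answerable from the captured metadata, which is what drives the ontology's design.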
Infrastructure and Compliance Mechanisms
Data spaces are emerging as crucial infrastructures within the EU Data Strategy. Within CERTAIN, these data spaces function as secure, interoperable ecosystems that embed regulatory compliance directly into their architecture. A Data Lineage Connector links the Semantic MLOps infrastructure to these data spaces, leveraging semantic web technologies and Knowledge Graphs. This enables detailed provenance tracking and traceability of data throughout the AI lifecycle, supporting auditability, accountability, and transparency for all stakeholders.
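The Connector's interface is not described in the paper, but the core of provenance tracking over derivation edges (in the spirit of PROV-O's `prov:wasDerivedFrom`) is a transitive traversal. A minimal sketch, with invented artifact names:

```python
# Hypothetical lineage edges (child -> parents); the real connector
# operates over semantic-web Knowledge Graphs, not a Python dict.
derived_from = {
    "model:v1": ["dataset:clean-v1"],
    "dataset:clean-v1": ["dataset:raw-v1", "dataset:ref-v2"],
}

def upstream_sources(artifact):
    """Return every artifact the given one transitively derives from."""
    seen, stack = set(), [artifact]
    while stack:
        for parent in derived_from.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)

print(upstream_sources("model:v1"))
# ['dataset:clean-v1', 'dataset:raw-v1', 'dataset:ref-v2']
```

An auditor asking "where did this model's data come from?" is, in effect, running exactly this traversal over the data space's provenance graph.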
To ensure AI systems meet regulatory requirements, CERTAIN is developing compliance assessment models that automate validation against key criteria such as fairness, robustness, and transparency. These tools generate synthetic test data to identify compliance risks, producing detailed reports, alerts, and remediation guidance. This automation reduces manual effort, increases consistency, and accelerates certification readiness, contributing to a trustworthy ecosystem aligned with regulations.
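The paper does not detail the assessment models, so the following is only a sketch of the pattern: generate synthetic test cases, score the system against a fairness criterion (here, an illustrative demographic parity gap), and emit a report with an alert flag. The model, threshold, and criterion are all invented for illustration:

```python
import random

random.seed(0)  # deterministic synthetic data for reproducible assessment

# Hypothetical stand-in for a model under assessment: decides purely on age.
def model(age, group):
    return 1 if age > 40 else 0

# Synthetic test cases for two demographic groups, A and B.
cases = [(random.randint(18, 80), g) for g in ("A", "B") for _ in range(500)]

def parity_gap(predict, cases):
    """Absolute difference in positive-outcome rates between groups A and B."""
    rates = {}
    for grp in ("A", "B"):
        outcomes = [predict(age, g) for age, g in cases if g == grp]
        rates[grp] = sum(outcomes) / len(outcomes)
    return abs(rates["A"] - rates["B"])

gap = parity_gap(model, cases)
report = {"criterion": "demographic parity", "gap": gap, "alert": gap > 0.1}
print(report)
```

The same loop generalizes to robustness and transparency criteria: each check consumes synthetic or recorded test data, produces a metric, and the report plus remediation guidance is assembled from the checks that trip their thresholds.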
The project addresses horizontal EU legislation like the General Data Protection Regulation (GDPR) and the EU AI Act, as well as vertical, sector-specific regulations (e.g., PSD2 and MiFID II). It systematically assesses legal requirements and maps them to technical specifications, enabling the development of tailored software components such as data anonymization tools for GDPR compliance, modules for bias detection and Explainable AI for the AI Act, and sector-specific compliance tools.
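CERTAIN's actual requirement-to-specification mapping is project-internal; a simplified registry pattern conveys the idea. The regulation references below are real articles, but the check names are invented placeholders:

```python
# Illustrative mapping from legal obligations to technical checks.
REQUIREMENT_MAP = {
    "GDPR Art. 5(1)(c) data minimisation": ["anonymization_check"],
    "AI Act Art. 10 data governance":      ["bias_detection"],
    "AI Act Art. 13 transparency":         ["explainability_report"],
}

def checks_for(regulations):
    """Collect the technical checks implied by a set of legal requirements."""
    return sorted({c for r in regulations for c in REQUIREMENT_MAP.get(r, [])})

print(checks_for(["AI Act Art. 10 data governance",
                  "AI Act Art. 13 transparency"]))
# ['bias_detection', 'explainability_report']
```

Sector-specific regimes such as PSD2 or MiFID II would contribute additional entries to the same registry, which is what makes the resulting compliance pipelines composable per application domain.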
Current Progress and Future Steps
The CERTAIN project has made significant progress in laying the groundwork for regulatory-compliant AI development. Initial ontology drafts aligned with EU regulations have been created, focusing on interoperability and reuse. Prototype components of the Semantic MLOps Engine and early RegOps workflows are also under development, designed with pilot integration in mind.
Key challenges include ensuring semantic interoperability across heterogeneous systems and balancing formal rigor with practical usability in ontology design. The next steps involve integrating these components into seven pilot domains, including healthcare, biometrics, energy, finance, and IT, to test compliance tools and validate certification procedures. Efforts will also focus on scaling RegOps services across domains, refining orchestration mechanisms, and delivering modular, reusable compliance pipelines to support a certifiable AI ecosystem aligned with EU regulations.
For more in-depth information, you can refer to the original research paper: Towards a Framework for Supporting the Ethical and Regulatory Certification of AI Systems.