TLDR: ExeKGLib is a Python library with a graphical interface that simplifies machine learning (ML) pipeline creation for users with minimal ML expertise. It achieves this by using knowledge graphs to encode ML knowledge, enhancing transparency, reusability, and executability of workflows. Successfully evaluated at Bosch for industrial applications like welding quality monitoring, ExeKGLib bridges the knowledge gap between ML specialists and domain experts, making ML more accessible and fostering better communication.
Machine learning (ML) has become a standard tool across many industries. However, developing high-quality ML pipelines often demands extensive training, specialized ML expertise, and meticulous development at each step. This creates a significant barrier for domain experts in fields like science and engineering who could greatly benefit from ML-based analytics but lack the necessary background.
Addressing this challenge, researchers have introduced ExeKGLib, a Python library designed to democratize ML by enabling users with minimal ML knowledge to construct complex ML pipelines. Enhanced with a graphical interface, ExeKGLib simplifies the entire process by leveraging knowledge graphs (KGs) that encode ML knowledge in easily understandable terms for non-ML experts.
What is ExeKGLib?
ExeKGLib stands for Executable Knowledge Graphs Library. It’s a Python-based tool that simplifies the creation, validation, and execution of machine learning workflows. Unlike many existing tools, ExeKGLib uses Linked Open Data (LOD) formats like RDF and OWL to represent ML pipelines. This approach keeps the workflows transparent, reusable, and executable. The library works in two main steps: first, it builds a KG that describes the ML pipeline in executable form; then, it converts that KG into a Python script and runs it.
The platform offers multiple interfaces for users, including a user-friendly graphical user interface (GUI), a simple coding interface, and a command-line interface (CLI). This flexibility caters to different user preferences and skill levels.
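To make the two-step idea concrete, here is a minimal sketch in plain Python, not ExeKGLib's actual API. It assumes a pipeline is stored as subject-predicate-object triples (the way RDF represents data) and then walked task by task to execute it. The `ex:` terms, the `lookup` helper, and the two toy methods are hypothetical names chosen for illustration.

```python
# Hypothetical vocabulary for illustration; these are NOT ExeKGLib's ontology terms.
HAS_START = "ex:hasStartTask"
HAS_NEXT = "ex:hasNextTask"
HAS_METHOD = "ex:hasMethod"

# Step 1: describe the pipeline as a small graph of triples.
TRIPLES = [
    ("ex:Pipeline", HAS_START, "ex:Normalize"),
    ("ex:Normalize", HAS_METHOD, "normalize"),
    ("ex:Normalize", HAS_NEXT, "ex:Mean"),
    ("ex:Mean", HAS_METHOD, "mean"),
]


def normalize(values):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def mean(values):
    """Average a list of numbers."""
    return sum(values) / len(values)


# Maps method names found in the graph to Python callables.
METHODS = {"normalize": normalize, "mean": mean}


def lookup(triples, subject, predicate):
    """Return the object of the first matching triple, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


def run_pipeline(triples, data):
    """Step 2: follow the task chain in the graph and execute each method."""
    executed = []
    task = lookup(triples, "ex:Pipeline", HAS_START)
    while task is not None:
        method = METHODS[lookup(triples, task, HAS_METHOD)]
        data = method(data)
        executed.append(task)
        task = lookup(triples, task, HAS_NEXT)
    return data, executed


result, order = run_pipeline(TRIPLES, [2.0, 4.0, 6.0])
print(order, result)
```

Because the pipeline lives in the graph rather than in the code, changing the workflow means editing triples, which is what makes such a description inspectable and reusable by people who do not write ML code.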
How ExeKGLib Simplifies ML Pipeline Development
At its core, ExeKGLib relies on a structured system of knowledge graphs. These KGs act as blueprints for ML pipelines, formalizing their description and making them transparent and reusable. The library supports a wide array of functionalities, including data visualization, data preprocessing, feature engineering, and ML modeling. It’s also designed to be easily extendable, allowing users to integrate new libraries or custom scripts.
A key feature is its semantic validation, which uses SHACL constraints to ensure that the constructed KGs are valid and executable. This minimizes user errors and provides immediate feedback during pipeline creation. For instance, it ensures that tasks are connected in a logical order and that data inputs and outputs are compatible.
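The kind of check described above can be illustrated with a short plain-Python stand-in. This is not SHACL and not ExeKGLib's validator; it only assumes that each task declares one input type and one output type, and that links connect tasks in order. The task names, type labels, and the `validate` function are hypothetical.

```python
def validate(tasks, links):
    """Check that every link joins known tasks with compatible data types.

    tasks: dict mapping task name -> {"in": type_label, "out": type_label}
    links: list of (source_task, target_task) pairs
    Returns a list of human-readable error messages (empty if valid).
    """
    errors = []
    for src, dst in links:
        if src not in tasks or dst not in tasks:
            errors.append(f"link {src} -> {dst} refers to an unknown task")
            continue
        produced = tasks[src]["out"]
        expected = tasks[dst]["in"]
        if produced != expected:
            errors.append(f"{src} outputs {produced} but {dst} expects {expected}")
    return errors


# Hypothetical tasks with declared input and output types.
tasks = {
    "LoadData": {"in": "File", "out": "Table"},
    "Normalize": {"in": "Table", "out": "Table"},
    "TrainModel": {"in": "Table", "out": "Model"},
    "PlotResult": {"in": "Table", "out": "Figure"},
}

valid_links = [("LoadData", "Normalize"), ("Normalize", "TrainModel")]
invalid_links = [("LoadData", "TrainModel"), ("TrainModel", "PlotResult")]

valid_errors = validate(tasks, valid_links)
invalid_errors = validate(tasks, invalid_links)
print(valid_errors)
print(invalid_errors)
```

In the second set of links, the plotting task expects a table but receives a model, so the check reports a problem before anything is executed. SHACL expresses constraints of this sort declaratively over the graph itself.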
Real-World Impact and Use Cases
ExeKGLib has demonstrated significant practical impact, particularly in industrial settings. It has been successfully evaluated at Bosch, where it serves as the backbone of their semantic machine learning solution, SemML. This solution has been applied to critical manufacturing applications such as resistance spot welding quality monitoring, process optimization for hot-staking, and plastic data analytics.
The library also plays a crucial role in European research projects like OntoCommons and Graph Massivizer. In OntoCommons, ExeKGLib contributes to standardizing ML practices and documentation in KGs, enhancing transparency and usability for non-ML experts like welding engineers. In Graph Massivizer, it facilitates the easy creation and storage of ML pipelines as KGs, promoting reuse and understanding among stakeholders from diverse disciplines.
One compelling use case at Bosch involved engineers creating and modifying ML pipelines to predict the quality of resistance spot welding. A user study with 28 experts, including ML, welding, and sensor engineers, showed that using ExeKGLib significantly increased task completion rates, improved correctness, and fostered better communication between practitioners. It also reduced the time needed to complete tasks and made previously impossible tasks feasible for non-ML experts.
Another use case highlighted how domain experts could leverage semantically annotated data for ML. In this scenario, ExeKGLib was part of a larger system that automated ML workflow creation and execution for welding quality estimation. The system enabled experts to annotate raw data with domain terms, select suitable ML pipelines, and visualize results, leading to improved communication and understanding between data scientists and domain experts.
Looking Ahead
ExeKGLib is an open-source initiative aimed at lowering the barrier to ML adoption and democratizing its use. While its current focus is on classic ML methods and tasks relevant to industrial environments, future plans include releasing a public GUI, expanding support for more feature engineering and classic ML methods, incorporating sophisticated neural networks, and integrating with graph-based databases for enhanced management and visualization of ExeKGs.
For more detail, see the research paper “ExeKGLib: A Platform for Machine Learning Analytics based on Knowledge Graphs.”


