TLDR: The paper introduces a novel process discovery method that uses machine learning to identify rules distinguishing desirable from undesirable process behaviors. By grouping process executions based on these discriminative rules, the approach creates separate, interpretable process models. These models reveal the specific patterns leading to good or bad outcomes, offering deeper, outcome-aware insights into business processes than traditional single-model approaches.
Understanding and improving business processes is crucial for any organization. Event logs, which record every step and action within an information system, offer a rich source of data for this purpose. Traditionally, process discovery aims to create a single model that represents all observed behaviors. However, in the real world, not all process executions are equal: some are desirable (efficient, compliant), while others are undesirable (inefficient, or involving rule violations, delays, or resource waste).
The challenge with a single, overarching process model is that it often fails to capture the critical differences between these good and bad outcomes. It might obscure why certain paths lead to success and others to failure, making it difficult to identify areas for improvement or to ensure compliance.
A New Approach to Process Discovery
A recent research paper, “Discriminative Rule Learning for Outcome-Guided Process Model Discovery,” by Ali Norouzifar and Wil van der Aalst, introduces a novel method to address this limitation. Their approach focuses on learning interpretable rules that can distinguish between desirable and undesirable process executions. By doing so, they can group traces (individual process instances) with similar outcome profiles and then apply process discovery separately within each group.
How It Works
The methodology involves several key steps:
1. Feature Encoding: First, process traces are encoded using declarative constraints. These are simple rules that capture relationships between activities, such as whether one activity always follows another, or if two activities must always occur together. This creates a structured feature space for each trace.
2. Ensemble Tree-Based Feature Extraction: To uncover more complex interactions, the method uses ensemble machine learning models, like random forests. These models help extract a richer set of decision rules from the initial features.
3. Regression Model for Importance: A sparse logistic regression model is then trained on this enhanced feature space. This model assigns importance scores to the extracted rules, identifying those that are most effective in distinguishing between desirable and undesirable outcomes.
4. Hierarchical Clustering and Rule Selection: Rules that are similar in how they represent traces are grouped using hierarchical clustering. From each cluster, the most representative rule (the one with the highest importance) is selected.
5. Event Log Projection and Process Model Discovery: Finally, the original event log is filtered to include only those traces that satisfy a chosen representative rule. A dedicated process model is then discovered for this filtered log. This results in focused and interpretable models that highlight the specific patterns driving either desirable or undesirable behaviors.
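The steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy log, the activity names, and every function name are hypothetical. Steps 2 to 4 (ensemble trees, sparse logistic regression, hierarchical clustering) are replaced here by a much simpler stand-in score, the difference between the share of desirable and undesirable traces that satisfy each constraint. The final process discovery step is left out; the sketch stops at the projected sub-log.

```python
from typing import Callable, Dict, List

Trace = List[str]  # one case: the ordered list of activity names


def response(a: str, b: str) -> Callable[[Trace], bool]:
    """Declarative 'response': every occurrence of a is eventually followed by b."""
    def check(trace: Trace) -> bool:
        return all(b in trace[i + 1:] for i, act in enumerate(trace) if act == a)
    return check


def coexistence(a: str, b: str) -> Callable[[Trace], bool]:
    """Declarative 'co-existence': a and b occur together or not at all."""
    def check(trace: Trace) -> bool:
        return (a in trace) == (b in trace)
    return check


def encode(log: List[Trace], constraints: Dict[str, Callable[[Trace], bool]]) -> List[List[int]]:
    """Step 1: one binary feature per constraint for each trace."""
    return [[int(check(trace)) for check in constraints.values()] for trace in log]


def score_rules(features: List[List[int]], labels: List[int], names: List[str]) -> Dict[str, float]:
    """Stand-in for steps 2-4 (assumption, not the paper's method):
    satisfaction rate among desirable traces minus the rate among undesirable ones."""
    pos = [row for row, y in zip(features, labels) if y == 1]
    neg = [row for row, y in zip(features, labels) if y == 0]
    return {
        name: sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j, name in enumerate(names)
    }


def project(log: List[Trace], check: Callable[[Trace], bool]) -> List[Trace]:
    """Step 5: keep only the traces that satisfy the chosen rule."""
    return [trace for trace in log if check(trace)]


# Hypothetical toy log: label 1 = desirable, 0 = undesirable.
log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "approve", "pay"],
    ["register", "approve", "pay"],
    ["register", "approve"],
]
labels = [1, 1, 0, 0]

constraints = {
    "response(register, check)": response("register", "check"),
    "coexistence(approve, pay)": coexistence("approve", "pay"),
}

features = encode(log, constraints)
scores = score_rules(features, labels, list(constraints))
best_rule = max(scores, key=lambda name: abs(scores[name]))
sublog = project(log, constraints[best_rule])
```

In this toy log the response constraint separates the two groups most cleanly, so the projected sub-log holds only the traces in which registration is followed by a check. A discovery algorithm would then be run on that sub-log alone to obtain the focused model.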
Benefits and Evaluation
This outcome-aware approach yields models that are better suited to conformance checking and performance analysis than a single model of all behavior. Instead of a generalized view, organizations gain focused insights into the drivers of both successful and problematic executions. For instance, one model might clearly show the efficient path for on-time deliveries, while another reveals the common patterns leading to delays or resource waste.
The researchers implemented their approach as a publicly available tool and evaluated it using multiple real-life event logs, including BPIC12, BPIC17, and Hospital Billing data. The evaluation demonstrated the framework’s effectiveness in isolating and visualizing critical process patterns, offering a more nuanced understanding of process behavior than traditional methods. For more technical details, see the full paper.
Conclusion
By leveraging supervised learning to identify key differences between desirable and undesirable event logs, this method provides a powerful new perspective for process mining. It supports interpretable and outcome-aware process discovery, helping businesses not just see what happens, but understand why it happens, paving the way for more targeted improvements.


