
Enhancing Process Predictions with Object-Centric Graph Embeddings

TL;DR: This research introduces a novel end-to-end model combining Graph Attention Networks (GAT) and Long Short-Term Memory (LSTM) networks for predictive process monitoring. The model leverages Object-Centric Event Logs (OCELs) to predict future process behavior, specifically focusing on next activity prediction and next event time. By using GAT to embed activities and their relationships within an Object-Centric Directly-Follows Graph (OCDFG) and LSTM to handle temporal dependencies, the proposed approach outperforms traditional LSTM and ProcessTransformer models, especially in next activity prediction, demonstrating the value of object-centric graph embeddings.

Predictive Process Monitoring (PPM) is a crucial area within process mining that focuses on forecasting the future evolution of business processes, identifying potential deviations, and understanding variations. The goal is to leverage advanced machine learning and deep learning techniques to make accurate predictions based on event data. Traditionally, process mining models simplify event logs by associating each event with a single case, but this approach often falls short in complex real-world scenarios, leading to issues like convergence (one event linked to multiple cases) and divergence (one activity happening multiple times for a single case).

To address these limitations, Object-Centric Event Logs (OCEL) were introduced. OCELs offer a richer perspective by relating events to various types of objects, providing deeper insights into a process. This new data format has spurred advancements in various process mining techniques, including PPM. Recent research has explored how to utilize OCEL’s object-centric information to enhance predictions.
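To make the convergence and divergence problems concrete, here is a minimal sketch of an OCEL as plain Python structures, together with a naive flattening step. The activities, event IDs, and object IDs are invented for illustration and are not taken from the paper's datasets:

```python
# An OCEL event references objects of several types instead of a single case ID.
ocel = [
    {"event": "e1", "activity": "Place Order",     "orders": ["o1"], "items": ["i1", "i2"]},
    {"event": "e2", "activity": "Place Order",     "orders": ["o2"], "items": ["i3"]},
    {"event": "e3", "activity": "Pick Item",       "orders": ["o1"], "items": ["i1"]},
    {"event": "e4", "activity": "Pick Item",       "orders": ["o1"], "items": ["i2"]},
    # One delivery event relates to both orders -> convergence when flattening.
    {"event": "e5", "activity": "Create Delivery", "orders": ["o1", "o2"], "items": ["i1", "i2", "i3"]},
]

def flatten(log, object_type):
    """Flatten an OCEL to a classic single-case log: one row per (event, object)."""
    rows = []
    for ev in log:
        for obj in ev[object_type]:
            rows.append({"case": obj, "event": ev["event"], "activity": ev["activity"]})
    return rows

flat_orders = flatten(ocel, "orders")
# e5 is duplicated under both o1 and o2 (convergence), and "Pick Item"
# occurs twice within case o1 (divergence).
print([r["case"] for r in flat_orders if r["event"] == "e5"])  # ['o1', 'o2']
```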

A new end-to-end model has been proposed that aims to predict future process behavior, specifically focusing on two key tasks: predicting the next activity in a sequence and forecasting the time until that next event occurs. This innovative model combines a Graph Attention Network (GAT) with a Long Short-Term Memory (LSTM) network.

The GAT component is designed to encode activities and their relationships within the process. It excels at capturing the spatial relationships between nodes in a graph, allowing it to model interactions between activities across multiple object types. This is particularly useful because it can dynamically weigh the importance of each neighboring node, focusing on the most informative event relationships. These graph embeddings provide crucial contextual information.
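As a rough illustration of the attention mechanism described above, the following NumPy sketch computes single-head GAT attention coefficients over a toy adjacency matrix. All sizes, weights, and the graph itself are arbitrary; this is a sketch of the general GAT formulation, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, f_in, f_out = 4, 3, 2           # toy sizes, illustrative only
H = rng.normal(size=(n_nodes, f_in))     # node (activity) features
A = np.array([[1, 1, 0, 0],              # adjacency with self-loops:
              [1, 1, 1, 0],              # which activities directly follow which
              [0, 1, 1, 1],
              [0, 0, 1, 1]])

W = rng.normal(size=(f_in, f_out))       # shared linear transform
a = rng.normal(size=(2 * f_out,))        # attention vector

Wh = H @ W
# Raw attention scores e_ij = LeakyReLU(a^T [Wh_i || Wh_j])
scores = np.array([[np.concatenate([Wh[i], Wh[j]]) @ a for j in range(n_nodes)]
                   for i in range(n_nodes)])
scores = np.where(scores > 0, scores, 0.2 * scores)    # LeakyReLU
scores = np.where(A > 0, scores, -np.inf)              # mask non-neighbours
alpha = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row softmax

H_out = alpha @ Wh   # each node aggregates neighbours weighted by attention
print(alpha.round(2))  # rows sum to 1; zeros where no edge exists
```

The learned weights `alpha` are exactly the "dynamic importance of each neighboring node" mentioned above: a node attends more strongly to the directly-following activities that are most informative for it.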

The LSTM network, on the other hand, is adept at handling temporal dependencies. LSTMs are a type of recurrent neural network that can retain important information over long periods, making them ideal for sequential data like event sequences in process monitoring. By combining GAT and LSTM, the model can simultaneously learn both the structural (graph-based) and temporal (sequence-based) characteristics of event logs.
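For intuition about how the LSTM retains information across a sequence, a single LSTM step can be sketched in NumPy as follows. The gate layout and random weights are illustrative only:

```python
import numpy as np

def lstm_step(x, h, c, Wx, Wh, b):
    """One LSTM step: gates computed from input x and previous hidden state h."""
    z = x @ Wx + h @ Wh + b
    i, f, g, o = np.split(z, 4)                                      # gate pre-activations
    i, f, o = 1/(1+np.exp(-i)), 1/(1+np.exp(-f)), 1/(1+np.exp(-o))   # sigmoid gates
    g = np.tanh(g)
    c_new = f * c + i * g          # cell state: forget old info, admit new info
    h_new = o * np.tanh(c_new)     # hidden state exposed to the next layer
    return h_new, c_new

rng = np.random.default_rng(1)
d_in, d_h = 4, 3
Wx = rng.normal(size=(d_in, 4 * d_h))
Wh = rng.normal(size=(d_h, 4 * d_h))
b = np.zeros(4 * d_h)

h = c = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):   # run a 5-step event sequence
    h, c = lstm_step(x, h, c, Wx, Wh, b)
print(h.shape)  # (3,)
```

The forget gate `f` is what lets the cell state carry information over long event prefixes, which is why LSTMs suit the sequential side of this model.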

The approach involves a two-step process: preprocessing and prediction. In preprocessing, the OCEL is ‘flattened’ for each object type, generating single object-type event logs. Temporal features (like time since last event, time since prefix start) and prediction targets (next activity, next event time) are extracted. Simultaneously, an Object-Centric Directly-Follows Graph (OCDFG) is constructed by merging individual Directly-Follows Graphs (DFGs) from each flattened log. This OCDFG captures connections between activities across all object types and serves as input for the GAT.
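The DFG construction and merge described above can be sketched like this; the flattened traces are hypothetical and the merge strategy (tagging each edge with the object types that contributed it) is one plausible reading of the paper's OCDFG construction:

```python
from collections import Counter

# Hypothetical flattened logs: one activity sequence per case, per object type.
flattened = {
    "order": [["Place Order", "Pick Item", "Pick Item", "Create Delivery"]],
    "item":  [["Place Order", "Pick Item", "Create Delivery"],
              ["Place Order", "Pick Item", "Create Delivery"]],
}

def dfg(traces):
    """Directly-Follows Graph: count consecutive activity pairs per trace."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

# Merge the per-object-type DFGs into one OCDFG, remembering which object
# type contributed each directly-follows edge.
ocdfg = {}
for obj_type, traces in flattened.items():
    for edge, n in dfg(traces).items():
        ocdfg.setdefault(edge, {})[obj_type] = n

print(ocdfg[("Pick Item", "Create Delivery")])  # {'order': 1, 'item': 2}
```

The resulting edge set spans all object types at once, which is what the GAT then consumes as the activity graph.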

In the prediction step, the GAT generates embeddings for the activity nodes from the OCDFG. These embeddings are then matched and concatenated with the temporal features of the events. The combined features are fed into the LSTM network, which then makes predictions for the next activity and the time until the next event. This integrated approach allows for a smooth flow of information and updates across both network components.
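The matching and concatenation step might look like the following sketch, where the embedding values and temporal features are invented for illustration. Each event in a prefix is mapped to its activity's GAT embedding, the temporal features are appended, and the stacked result is the per-timestep input the LSTM would consume:

```python
import numpy as np

# Suppose the GAT produced a 2-d embedding per activity node (toy values).
gat_emb = {
    "Place Order":     np.array([0.1, 0.9]),
    "Pick Item":       np.array([0.7, 0.2]),
    "Create Delivery": np.array([0.4, 0.4]),
}

# One event prefix with per-event temporal features:
# (time since last event, time since prefix start), both in hours.
prefix = [
    ("Place Order", (0.0, 0.0)),
    ("Pick Item",   (2.0, 2.0)),
    ("Pick Item",   (0.5, 2.5)),
]

# Match each event's activity to its graph embedding and concatenate the
# temporal features -> the per-timestep input vector for the LSTM.
X = np.stack([np.concatenate([gat_emb[act], np.array(t)]) for act, t in prefix])
print(X.shape)  # (3, 4): 3 timesteps, 2 embedding dims + 2 temporal features
```

Because the model is trained end to end, gradients from the prediction loss flow back through this lookup into the GAT, refining the activity embeddings themselves.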

The model, referred to as LSTM+GAT, was evaluated against existing prediction models, including a basic LSTM model using one-hot encoding and ProcessTransformer. The evaluation was conducted using four OCELs: a real-life event log from the BPIC17 dataset (loan applications) and three synthetic logs simulating order management, logistics, and Purchase-to-Pay processes.

For the ‘Next Activity Prediction’ task, the LSTM+GAT model consistently outperformed the basic LSTM model across all event logs and object types. This improvement highlights the benefit of incorporating object-centric information through the GAT model, which provides deeper insights for more accurate activity predictions. Compared to ProcessTransformer, the LSTM+GAT model and the basic LSTM generally performed better on most event logs, demonstrating the robustness of the LSTM architecture for this task.

For the ‘Next Event Time Prediction’ task, the LSTM+GAT model showed a slight improvement over the basic LSTM model. While the graph embeddings added valuable information about activity interrelations, the primary influence on prediction accuracy still came from the temporal features extracted from the event prefixes. The LSTM+GAT model outperformed other models on at least half of the tested event logs and object types, with the basic LSTM being the second best performer.


In conclusion, the research demonstrates the effectiveness of combining LSTM for sequential predictions with graph embeddings for representing activities via the OCDFG. The end-to-end nature of the model allows the GAT to refine activity embeddings based on prediction loss, leading to better representations of activities and their relationships. The superior performance of LSTM+GAT over a simple LSTM, particularly in next activity prediction, confirms the advantage of using object-centric graph embeddings over traditional one-hot encoding. Future work aims to explore more real-life OCELs, investigate unsupervised embeddings like Node2Vec, and incorporate time-related features directly into the OCDFG to further enhance predictions. For more details, you can refer to the full research paper: Predictive Process Monitoring Using Object-centric Graph Embeddings.

Karthik Mehta
Karthik Mehta is a data journalist known for his data-rich, insightful coverage of AI news and developments. Armed with a degree in Data Science from IIT Bombay and years of newsroom experience, Karthik merges storytelling with metrics to surface deeper narratives in AI-related events. His writing cuts through hype, revealing the real-world impact of Generative AI on industries, policy, and society. You can reach him at: [email protected]
