
Boosting Predictive Process Monitoring with Domain-Adapted LLMs

TLDR: This research adapts Large Language Models (LLMs) directly to process data for Predictive Process Monitoring (PPM) tasks such as next activity and remaining time prediction. Instead of relying on natural language reformulation or prompt engineering, the study uses parameter-efficient fine-tuning (PEFT) techniques. The resulting domain-adapted LLMs can outperform traditional recurrent neural networks and narrative-style LLM approaches, especially in multi-task settings, while converging faster and requiring less hyperparameter optimization, evidence of their ability to interpret sequential process information.

Large Language Models (LLMs) have rapidly become a cornerstone in various research fields, including Process Mining (PM). Traditionally, their application in PM has revolved around prompt engineering or transforming event logs into narrative-style datasets, leveraging the LLMs’ inherent semantic understanding. However, a recent study introduces a novel approach: directly adapting pretrained LLMs to process data without the need for natural language reformulation. This method, driven by the LLMs’ proficiency in generating token sequences—a task akin to objectives in PM—aims to unlock their full potential in this domain.

The research, titled Domain Adaptation of LLMs for Process Data, focuses on parameter-efficient fine-tuning (PEFT) techniques. This strategy is crucial for mitigating the substantial computational overhead typically associated with large models, making their adaptation more practical and accessible. The experimental setup specifically targets Predictive Process Monitoring (PPM), a branch of PM concerned with forecasting future process states and case behaviors. The study investigates both single-task and multi-task predictions, such as predicting the next activity (NA) or the remaining time (RT) in a process.
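In PPM terms, each prefix of a running case yields a pair of training targets: the activity that follows the prefix (NA) and the time until the case completes (RT). A minimal sketch of how such targets are derived, using a purely illustrative trace of (activity, timestamp) pairs:

```python
# Derive next-activity (NA) and remaining-time (RT) targets from the
# prefixes of a single case. The trace below is invented for illustration.
def ppm_targets(trace):
    """trace: list of (activity, timestamp) pairs, timestamps in hours."""
    end_time = trace[-1][1]
    samples = []
    for i in range(1, len(trace)):
        prefix = [act for act, _ in trace[:i]]
        next_activity = trace[i][0]                   # NA target
        remaining_time = end_time - trace[i - 1][1]   # RT target
        samples.append((prefix, next_activity, remaining_time))
    return samples

trace = [("Submit", 0.0), ("Review", 4.0), ("Approve", 10.0), ("Pay", 24.0)]
for prefix, na, rt in ppm_targets(trace):
    print(prefix, na, rt)
```

Each prefix thus doubles as a classification sample (NA) and a regression sample (RT), which is what makes the multi-task setup studied in the paper natural.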

The findings are compelling: the fine-tuned LLMs demonstrate a potential improvement in predictive performance compared to state-of-the-art recurrent neural network (RNN) approaches and even recent narrative-style solutions, particularly excelling in multi-task settings. Beyond performance, these adapted models exhibit faster convergence during training and require significantly less hyperparameter optimization, simplifying their deployment and maintenance.

Why Direct Adaptation?

Current methods often treat event logs as plain text, relying on the LLMs’ general language skills. This overlooks the structured, domain-specific nature of process data. Event logs are not linguistic artifacts; they adhere to a smaller alphabet of activity labels with a distinct syntax governed by behavioral relations. The authors argue that relying solely on semantic meaning, as captured by LLMs trained on natural text, is insufficient for modeling complex process behavior. Direct adaptation, through retraining embedding layers and fine-tuning specific weights, allows the LLM to learn from process data in its native format, bypassing the need for natural language conversion and evaluating its intrinsic capability to interpret sequential process information.
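Concretely, "native format" means swapping the LLM's large natural-language token vocabulary for the event log's much smaller activity alphabet and training a fresh embedding table over it. A minimal sketch, with an invented alphabet and toy dimensions (real backbones use vocabularies of ~50k tokens and embedding widths of 768 or more):

```python
import numpy as np

# Hypothetical activity alphabet from an event log -- far smaller than
# an LLM's natural-language vocabulary.
activities = ["<pad>", "<eos>", "Submit", "Review", "Approve", "Pay"]
vocab = {a: i for i, a in enumerate(activities)}

d_model = 8  # toy embedding width
rng = np.random.default_rng(0)
# Freshly initialised embedding table; under the paper's approach this is
# retrained on process data while most backbone weights stay frozen.
embedding = rng.normal(0, 0.02, size=(len(vocab), d_model))

def encode(trace):
    """Map a trace of activity labels to embedding vectors."""
    ids = [vocab[a] for a in trace]
    return embedding[ids]  # shape: (len(trace), d_model)

x = encode(["Submit", "Review", "Approve"])
print(x.shape)  # (3, 8)
```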

Furthermore, prompt engineering, while powerful, demands expert knowledge and is sensitive to both user error and model idiosyncrasies. Small changes in phrasing can lead to drastically different outcomes, highlighting the fragility of such approaches. The systematic fine-tuning approach presented in this paper offers a more robust and consistent alternative.

Methodology at a Glance

The proposed methodology employs various PEFT strategies across different LLMs, comparing their effectiveness against RNN-based and prompt-based solutions. The framework comprises four main components: input layers, backbone, output layers, and the PEFT of these components. Input layers convert raw event features into a common vector space. The backbone, typically a transformer model like GPT-2, Qwen2, or Llama3.2, transforms this representation. Output layers map the backbone’s outputs to task-specific predictions (e.g., NA or RT). PEFT involves training only a small subset of parameters, such as new input/output layers, while the main backbone is either frozen, partially frozen, or enhanced with adapter layers like Low-Rank Adaptation (LoRA).
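The four components can be sketched schematically. Everything below is an illustrative stand-in, not the paper's code: the backbone is a toy frozen function rather than a real transformer, and the two output heads show the split between NA classification and RT regression.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, n_activities = 8, 6  # toy sizes

# Input layer: maps raw event features into a common vector space.
W_in = rng.normal(0, 0.02, size=(n_activities, d_model))

def backbone(h):
    """Frozen stand-in for a transformer backbone (e.g. GPT-2, Qwen2,
    Llama3.2); a real one applies many self-attention blocks."""
    return np.tanh(h)

# Task-specific output layers, trained during PEFT.
W_na = rng.normal(0, 0.02, size=(d_model, n_activities))  # next activity
W_rt = rng.normal(0, 0.02, size=(d_model, 1))             # remaining time

def predict(event_ids):
    h = backbone(W_in[event_ids])          # (seq_len, d_model)
    last = h[-1]                           # representation of the last event
    na_logits = last @ W_na                # classification over activities
    rt_value = float((last @ W_rt)[0])     # regression: remaining time
    return na_logits, rt_value

na_logits, rt = predict([2, 3, 4])
print(na_logits.shape, rt)
```

In the PEFT setting, only the input/output layers (and any adapters) receive gradient updates; the backbone weights are reused as-is or only partially unfrozen.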

Key Experimental Insights

The experiments utilized five real-world event logs, including BPI12, BPI17, and three versions of BPI20, chosen for their diversity. The results highlighted several critical points:

  • Multi-task RNNs consistently underperformed, and the narrative-style solution (S-NAP) was outperformed by a clear margin across all datasets, suggesting that semantic capabilities alone are insufficient for learning complex process behavior.
  • While recurrent networks use fewer parameters and less runtime, they demand extensive hyperparameter optimization. LLMs, especially with LoRA, required less tuning and generally outperformed RNNs and S-NAP in both single- and multi-task setups.
  • Among the fine-tuned LLMs, Llama and Qwen demonstrated remarkable stability across datasets, while PM-GPT2 showed strong performance on specific datasets but lacked overall consistency.
  • LLMs and single-task RNNs converged faster than multi-task RNNs for NA prediction, with LLMs often needing fewer than five epochs. For RT prediction, LLMs consistently outperformed both single- and multi-task RNNs.
  • LoRA proved particularly effective for RT prediction, outperforming freezing configurations. This suggests that LLMs, whose pretraining objective is essentially next-token classification, benefit significantly from adapter layers when tackling regression tasks. For NA prediction, fine-tuning a few layers or using LoRA generally yielded better results than fully freezing the model.
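The LoRA idea behind that last point is simple: keep a frozen weight matrix W and learn a low-rank update BA, so the adapted layer computes Wx + (alpha/r)·BAx. Because B is initialised to zero, the adapter starts as a no-op and only gradually reshapes the frozen behavior. A minimal numpy sketch with invented shapes:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4  # hidden size, LoRA rank, scaling factor

W = rng.normal(size=(d, d))           # frozen pretrained weight
A = rng.normal(0, 0.02, size=(r, d))  # trainable down-projection
B = np.zeros((d, r))                  # trainable up-projection, zero-init

def lora_forward(x):
    """Frozen path plus scaled low-rank update: Wx + (alpha/r) * B(Ax)."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d)
# With B still all-zero, the adapter contributes nothing yet:
assert np.allclose(lora_forward(x), W @ x)
```

Only A and B (2·d·r values here) are trained, instead of the d·d entries of W, which is why LoRA keeps the tuning cost low even for large backbones.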

Future Directions

While this work marks a significant step forward, limitations remain. PEFT, though cost-effective, still involves more trainable parameters than RNNs. Future research could explore quantization techniques to further reduce model size. Additionally, adapting LLMs for other complex PM tasks like process discovery and anomaly detection, which don’t align with standard training formats, presents ongoing challenges. The study also notes that LoRA was used with default settings, implying potential for further optimization to reduce memory usage.

In conclusion, this study provides a systematic evaluation of fine-tuning methods for adapting LLMs to predictive process monitoring. By moving beyond prompt engineering and narrative reformulations, the research demonstrates that explicitly adapted LLMs can outperform traditional PPM models and narrative-style approaches in both single- and multi-task next activity and remaining time prediction, paving the way for more robust and efficient process analysis.

Meera Iyer
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist at a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She's particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
