TLDR: This research benchmarks classical machine learning models for fault classification (FC) and fault localization (FL) in power system protection using realistic electromagnetic transient (EMT) data. It finds that FC can be solved with high accuracy (F1 up to 0.99) using models like MLP and Gradient Boosting, benefiting from longer data windows. FL is significantly more challenging, with only high-capacity models like MLP and ensemble methods achieving competitive performance (R2 around 0.8), and its accuracy is less sensitive to window length, suggesting limitations of purely raw signal data. The study highlights the distinct nature of these tasks and the trade-offs between accuracy and computational efficiency.
The modern power grid is undergoing a significant transformation with the increasing integration of renewable energy sources (RES) and distributed energy resources (DERs). While this shift promises a greener future, it also introduces complex challenges for traditional power system protection systems. These conventional systems, which rely on fixed thresholds, struggle to reliably identify and locate short circuits in an increasingly dynamic and intricate grid environment.
Machine learning (ML) offers a compelling alternative to these outdated methods. However, a comprehensive and systematic comparison of various ML models across different settings for critical tasks like fault classification (FC) and fault localization (FL) has been largely absent. This gap in research is precisely what a new study, titled “BENCHMARKING MACHINE LEARNING MODELS FOR FAULT CLASSIFICATION AND LOCALIZATION IN POWER SYSTEM PROTECTION,” aims to address.
Authored by Julian Oelhaf, Georg Kordowich, Changhun Kim, Paula Andrea Pérez-Toro, Christian Bergler, Andreas Maier, Johann Jäger, and Siming Bayer, this work presents the first comparative benchmarking study of classical ML models for FC and FL in power system protection. The research utilizes electromagnetic transient (EMT) data, which provides highly accurate modeling of transient events like short circuits, crucial for realistic protection studies.
Methodology: Simulating Realistic Fault Scenarios
To ensure a robust evaluation, the researchers simulated a wide array of fault scenarios using the standard “Double Line” topology, a common benchmark in protection studies. These simulations were conducted in DIgSILENT PowerFactory, employing EMT analysis. Key grid parameters, such as line lengths, load conditions, and fault locations, were systematically varied to reflect typical operating conditions and enhance the models’ ability to generalize.
The dataset comprised 9023 simulation episodes, each lasting one second, with a nominal voltage of 90 kV and a sampling frequency of 6400 Hz. The preprocessing involved cropping each episode around the fault onset and applying a sliding window technique with lengths ranging from 10 ms to 50 ms. This approach allowed the evaluation of models under realistic real-time constraints, considering how different temporal contexts affect performance.
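To make the window lengths concrete, the sketch below converts them to sample counts at the stated 6400 Hz sampling frequency and slices a signal into overlapping windows. It is a minimal illustration, not the authors' preprocessing code: the function names, the `stride` parameter, and the stand-in signal are assumptions, and the article does not state the stride the study used.

```python
SAMPLING_HZ = 6400  # sampling frequency stated in the article


def window_samples(window_ms, sampling_hz=SAMPLING_HZ):
    """Number of samples covered by a window of the given length in ms."""
    return int(round(window_ms * sampling_hz / 1000))


def sliding_windows(signal, window_ms, stride=1, sampling_hz=SAMPLING_HZ):
    """Return every full-length window over a 1-D sequence of samples.

    `stride` (in samples) is a hypothetical parameter for illustration.
    """
    n = window_samples(window_ms, sampling_hz)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, stride)]


if __name__ == "__main__":
    # Shortest and longest window lengths mentioned in the article.
    for ms in (10, 50):
        print(ms, "ms ->", window_samples(ms), "samples")

    # Stand-in for one second of a single measurement channel.
    episode = list(range(SAMPLING_HZ))
    windows = sliding_windows(episode, window_ms=10, stride=32)
    print(len(windows), "windows of", len(windows[0]), "samples each")
```

At 6400 Hz, a 10 ms window holds 64 samples and a 50 ms window holds 320, so the longest window gives a model five times as much temporal context per prediction as the shortest.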
Understanding the Tasks: Fault Classification and Localization
The study focused on two primary tasks:
Fault Classification (FC): This was framed as a multi-class classification problem. Each segment of voltage and current waveforms was labeled as either “No Fault” or one of ten specific short-circuit types (e.g., single-phase to ground, two-phase, three-phase faults). The performance for FC was measured using the F1 score, which balances precision and recall across all classes.
Fault Localization (FL): This was treated as a regression task, where models predicted the fault location as a percentage of the line length. This normalized approach helps in generalizing across different grid topologies. The R2 score was the primary metric for assessing FL performance, indicating how well the model’s predictions match the actual fault locations.
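The two metrics can be sketched in a few lines of plain Python. This is an illustrative implementation rather than the study's evaluation code: the article does not say how F1 is averaged across classes, so macro averaging (an unweighted mean of per-class F1) is assumed here, and the labels and locations in the example are made up.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up fault locations in percent of line length.
    locations = [10, 30, 50, 70, 90]
    print(r2_score(locations, locations))          # perfect predictions
    print(r2_score(locations, [50] * 5))           # always predict the mean
    print(r2_score(locations, locations[::-1]))    # systematically wrong
```

An R2 of 1 means perfect predictions, 0 means the model does no better than always predicting the mean fault location, and negative values mean it does worse than that baseline.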
A diverse set of classical ML models was benchmarked for both tasks, including linear methods (Logistic Regression, Ridge Regression), neighborhood, tree-based, and kernel-based models (K-Nearest Neighbors, Decision Tree, Support Vector Classifier), ensemble methods (Random Forest, Gradient Boosting, Stacking Ensemble, Voting Ensemble), and the Multi-Layer Perceptron (MLP).
Key Findings: Classification Achieves High Accuracy, Localization Remains Challenging
The results for Fault Classification were highly promising. The Multi-Layer Perceptron (MLP) emerged as the top performer, achieving F1 scores up to 0.99. Gradient Boosting (GB) also showed excellent performance, matching the MLP at longer window lengths. These findings suggest that FC can be solved with near-perfect accuracy using raw voltage and current signals, especially with models like MLP and GB, and that longer temporal contexts generally improve accuracy.
However, Fault Localization proved to be a significantly more complex challenge. Among the tested models, only MLP, stacking, and voting ensembles achieved competitive R2 values, approaching 0.8. Their performance was largely insensitive to window length, indicating that simply providing more temporal context does not substantially improve localization. Simpler models struggled, often producing near-zero or negative R2 values, meaning they did no better, or even worse, than always predicting the mean fault location. This highlights that FL requires models with higher capacity and better feature extraction, and may benefit from incorporating additional grid parameters beyond raw voltage and current signals.
Efficiency and Future Directions
Runtime efficiency was also a critical aspect of the evaluation. Linear models and Decision Trees were the fastest but less effective for FL. Tree ensembles like Gradient Boosting offered a good balance of speed and moderate accuracy. The most accurate models (MLP, stacking, voting) were slower but still feasible for near-real-time applications. This underscores the trade-off between accuracy and computational cost, especially for FL where a coarse localization might suffice for immediate protection decisions, with more precise analysis performed offline.
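One way to compare models on runtime is to time a single prediction per window. The sketch below is a hedged illustration of that idea, not the study's measurement procedure: `median_latency_ms` and `dummy_predict` are hypothetical names, and the dummy function merely stands in for a trained model's prediction call.

```python
import statistics
import time


def median_latency_ms(predict, windows, repeats=5):
    """Median wall-clock time of one predict(window) call, in milliseconds."""
    timings = []
    for _ in range(repeats):
        for window in windows:
            start = time.perf_counter()
            predict(window)
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_predict(window):
    """Hypothetical stand-in for a trained model: mean absolute value."""
    return sum(abs(x) for x in window) / len(window)


if __name__ == "__main__":
    # 100 windows of 64 samples each (10 ms at 6400 Hz).
    windows = [[0.0] * 64 for _ in range(100)]
    latency = median_latency_ms(dummy_predict, windows)
    print(f"median latency per window: {latency:.4f} ms")
```

The median is used instead of the mean so that occasional scheduler or garbage-collection pauses do not dominate the figure. Absolute numbers depend heavily on hardware and implementation, so such timings are most useful for ranking models on the same machine.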
In conclusion, this benchmarking study provides valuable insights into the application of machine learning for power system protection. It clearly differentiates the complexities of fault classification and localization, demonstrating that while FC can be solved with high accuracy using models like MLP and GB, FL remains a more challenging task requiring advanced models and potentially the integration of grid knowledge. Future research will explore deep learning architectures, the inclusion of pre-fault information, and physics-informed approaches to further enhance the intelligence and resilience of protection systems.


