TLDR: SSFL-DCSL is a new semi-supervised federated learning framework for intelligent fault diagnosis. It addresses data and label scarcity in distributed industrial settings through confidence-based pseudo-label weighting, dual contrastive learning (local and global), and privacy-preserving prototype aggregation. Experiments show it significantly improves accuracy, especially with limited labeled data, and is communication-efficient and robust to client dropouts.
In the world of industrial machinery, ensuring safe operation and boosting production efficiency rely heavily on intelligent fault diagnosis (IFD). However, traditional deep learning methods for IFD face significant hurdles. They demand vast amounts of labeled training data, which is often scattered across different locations, creating “data islands.” Labeling this data is also expensive and time-consuming, making it difficult to acquire enough labeled samples. Furthermore, variations in data distribution among different machines or factories can hinder a model’s performance.
To tackle these pressing challenges, a new framework called SSFL-DCSL (Semi-Supervised Federated Learning via Dual Contrastive Learning and Soft Labeling) has been proposed. This innovative approach integrates several advanced techniques to address data and label scarcity for distributed industrial clients, all while maintaining user privacy.
How SSFL-DCSL Works
The SSFL-DCSL framework operates in a semi-supervised federated learning environment. This means it can learn effectively from both a small amount of labeled data and a large amount of unlabeled data, distributed across various clients (e.g., different factories or machines). Instead of sending raw, sensitive data to a central server, only abstract representations are shared, ensuring privacy.
One of the core innovations is a sample weighting function based on the Laplace distribution. During training, the model generates “pseudo labels” for unlabeled data. However, some of these pseudo labels might not be very accurate. This function helps to reduce bias by assigning lower weights to pseudo labels that the model is less confident about, allowing the system to focus more on reliable information.
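To make the idea concrete, here is a minimal NumPy sketch of Laplace-based confidence weighting. The function name, the center `mu`, and the scale `b` are illustrative choices, not values from the paper:

```python
import numpy as np

def laplace_weight(probs: np.ndarray, mu: float = 0.95, b: float = 0.1) -> np.ndarray:
    """Weight each pseudo-label by a Laplace kernel centered at confidence mu.

    probs: (N, C) softmax outputs for unlabeled samples.
    Confidence is the max class probability; predictions far below mu
    receive exponentially smaller weights, so unreliable pseudo labels
    contribute little to the training loss.
    """
    conf = probs.max(axis=1)                     # (N,) model confidence
    return np.exp(-np.abs(conf - mu) / b)        # Laplace density, up to a constant

# toy usage: one confident prediction, one uncertain one
probs = np.array([[0.97, 0.02, 0.01],
                  [0.40, 0.35, 0.25]])
weights = laplace_weight(probs)
```

The uncertain second sample ends up with a much smaller weight than the confident first one, which is exactly the de-biasing effect the paper describes.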
Another key element is the dual contrastive loss. This consists of two parts:
- Local Contrastive Loss (LCL): This component helps each client’s model learn robust and meaningful representations from its own unlabeled data. It does this by comparing different augmented versions of the same data sample, ensuring consistency and enhancing the model’s ability to distinguish between different fault types. It also intelligently selects positive and negative data pairs and uses a dynamic temperature to stabilize the learning process.
- Global Contrastive Loss (GCL): This part facilitates knowledge sharing among different clients. It aligns the features learned locally by each client with a set of “global prototypes” maintained on the server. These prototypes are essentially average feature vectors for each fault category, representing collective knowledge. By aligning with these global prototypes, local models can learn from the diverse data distributions across all clients, improving their overall generalization capabilities.
To manage and share this collective knowledge, the framework uses a prototype aggregation method (PTA). Instead of exchanging entire model parameters (which can be large and privacy-sensitive), clients only send their compact local prototypes to a central server. The server then aggregates these local prototypes into global prototypes using weighted averaging and updates them with momentum, ensuring stability. This significantly reduces communication overhead, making the system highly efficient for industrial environments with limited bandwidth.
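A server-side sketch of this aggregation step, assuming per-class sample counts are used as the averaging weights (the exact weighting in the paper may differ):

```python
import numpy as np

def aggregate_prototypes(global_protos, client_protos, client_counts, momentum=0.9):
    """Aggregate local prototypes into global ones with momentum (PTA-style sketch).

    global_protos: (C, D) current global prototypes.
    client_protos: list of (C, D) local prototypes, one per participating client.
    client_counts: list of (C,) per-class sample counts, used as weights.
    """
    protos = np.stack(client_protos)                       # (K, C, D)
    counts = np.stack(client_counts).astype(float)         # (K, C)
    weights = counts / counts.sum(axis=0, keepdims=True).clip(min=1e-12)
    weighted = (weights[..., None] * protos).sum(axis=0)   # (C, D) weighted average
    # Momentum update keeps global prototypes stable across rounds,
    # which also softens the impact of clients dropping out.
    return momentum * global_protos + (1 - momentum) * weighted
```

Because only the `(C, D)` prototype matrix crosses the network, a dropped client simply contributes zero weight that round rather than breaking the update.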
Experimental Validation and Benefits
The SSFL-DCSL framework was rigorously tested on several publicly available datasets, including the Paderborn University (PU) bearing dataset, the Machinery Failure Prevention Technology (MFPT) dataset, the Case Western Reserve University (CWRU) dataset, and a real-world chemical plant (CP) dataset. The results were impressive: SSFL-DCSL consistently outperformed existing state-of-the-art methods, especially in scenarios where only a small percentage of data was labeled (e.g., 10%). In fact, with just 10% labeled data, SSFL-DCSL achieved accuracy comparable to or even better than supervised methods using 20% labeled data.
The framework also demonstrated remarkable computational efficiency. It achieved high diagnostic accuracy with moderate training and testing times per sample. Crucially, it reduced data transmission by over 99% compared to methods that exchange full model parameters, making it highly suitable for real-world industrial deployments where network bandwidth might be a constraint.
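A quick back-of-envelope calculation shows why the savings are so large. The numbers below are illustrative assumptions (a 1M-parameter model versus 10 class prototypes of dimension 256, both sent as 32-bit floats each round), not figures from the paper:

```python
# Per-round upload if the client sends full model parameters (float32):
model_bytes = 1_000_000 * 4          # ~4 MB

# Per-round upload if the client sends only class prototypes:
num_classes, proto_dim = 10, 256     # assumed sizes, for illustration
proto_bytes = num_classes * proto_dim * 4   # ~10 KB

reduction = 1 - proto_bytes / model_bytes
print(f"prototypes: {proto_bytes} B vs model: {model_bytes} B "
      f"-> {reduction:.2%} less traffic")
```

Even with these modest assumptions, prototype exchange cuts per-round upload by more than 99%, consistent with the reduction reported above.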
Furthermore, SSFL-DCSL proved to be highly resilient to client dropouts or temporary communication losses. Even when some clients failed to contribute their prototypes, the system maintained strong performance, thanks to its robust aggregation and pseudo-labeling strategies.
This research marks a significant step forward in making intelligent fault diagnosis more practical and accessible for modern factories, especially where data is scarce, distributed, and privacy is paramount. For more technical details, refer to the full research paper.