TLDR: FedDCL is a novel framework for federated learning that allows a central server model to continuously learn from diverse client models without needing their private data. It addresses key challenges such as model heterogeneity, catastrophic forgetting, and knowledge misalignment by using pre-trained diffusion models to generate lightweight, class-specific prototypes. These prototypes enable data-free synthetic data generation for augmenting training, replaying past knowledge, and dynamically transferring knowledge from heterogeneous clients to the server. Experimental results show FedDCL significantly improves accuracy and reduces forgetting compared to existing methods, enhancing the practical applicability of federated learning in dynamic settings.
Federated Learning (FL) has emerged as a powerful approach for collaborative model training across various entities, all while ensuring the privacy of sensitive data by keeping it localized. However, as data continues to grow and models become increasingly diverse, traditional FL faces significant hurdles. These include inherent issues like data heterogeneity (where data distributions vary among clients), model heterogeneity (clients using different model architectures), and the problem of catastrophic forgetting (where models forget previously learned knowledge when learning new tasks). A new challenge, knowledge misalignment, also arises, particularly when relying on static public datasets for knowledge transfer.
Addressing these complex challenges, a novel framework called FedDCL has been introduced. FedDCL is designed to enable data-free continual learning for the server model within a federated setting where client models are diverse. The core innovation lies in leveraging pre-trained diffusion models to extract lightweight, class-specific prototypes. These prototypes offer a significant advantage by enabling three key data-free capabilities.
Firstly, these prototypes can generate synthetic data for the current task. This synthetic data augments training, helping to counteract non-Independent and Identically Distributed (non-IID) data distributions among clients. Secondly, they facilitate exemplar-free generative replay, which is crucial for retaining knowledge from previous tasks without needing to store any actual past data. This directly combats catastrophic forgetting. Thirdly, FedDCL enables data-free dynamic knowledge transfer from heterogeneous clients to the server, eliminating the reliance on static public datasets that often struggle to align with evolving task domains.
The FedDCL framework operates in three main phases. The first is **Federated Prototype Extraction**, where clients use a frozen pre-trained diffusion model to extract class prototypes. These prototypes are then aggregated by the server to form globally informed prototypes. This dynamic process ensures knowledge alignment with the current task while preserving previously learned concepts.
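The server-side aggregation step can be illustrated with a small sketch. The summary above does not specify the aggregation rule, so the sample-count-weighted average used here is an assumption, and `aggregate_prototypes` and the toy two-dimensional prototype vectors are hypothetical names and values chosen for illustration only.

```python
from collections import defaultdict


def aggregate_prototypes(client_protos):
    """Combine per-class prototypes from several clients.

    client_protos: list of dicts mapping class_id -> (prototype_vector, num_samples).
    Returns a dict mapping class_id -> weighted-average prototype.
    NOTE: weighting by sample count is an assumption, not the paper's stated rule.
    """
    sums = {}
    counts = defaultdict(int)
    for protos in client_protos:
        for cls, (vec, n) in protos.items():
            if cls not in sums:
                sums[cls] = [0.0] * len(vec)
            for i, value in enumerate(vec):
                sums[cls][i] += n * value
            counts[cls] += n
    return {cls: [s / counts[cls] for s in sums[cls]] for cls in sums}


# Two hypothetical clients; only the first one has seen class 1.
clients = [
    {0: ([1.0, 0.0], 10), 1: ([0.0, 2.0], 5)},
    {0: ([3.0, 2.0], 30)},
]
global_protos = aggregate_prototypes(clients)
print(global_protos)
```

A class held by only one client passes through unchanged, which is one simple way to handle non-IID label distributions in a sketch like this.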
The second phase is **Augmented Local Continual Training**. During this stage, each client combines the synthetic data generated from the federated prototypes with its own private real data. The synthetic data serves a dual purpose: replaying knowledge from past tasks to prevent forgetting and augmenting current-task data to mitigate issues arising from data scarcity and non-IID biases. This leads to more robust and forgetting-resistant local model updates. The model’s classification head is also adaptively expanded to accommodate new classes introduced by incoming tasks.
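Expanding the classification head can be sketched as follows. This is a minimal illustration, assuming the head is a plain linear layer stored as a list of weight rows plus a bias list; `expand_head`, the small Gaussian initialization, and the toy dimensions are assumptions made for the example, not details from the paper.

```python
import random


def expand_head(weights, biases, num_new_classes, seed=0):
    """Append output units for new classes while keeping the old ones intact.

    weights: list of rows, one row of feature weights per existing class.
    biases:  list with one bias per existing class.
    New rows get a small random initialization (an assumption for this sketch).
    """
    rng = random.Random(seed)
    feat_dim = len(weights[0])
    new_rows = [
        [rng.gauss(0.0, 0.01) for _ in range(feat_dim)]
        for _ in range(num_new_classes)
    ]
    expanded_weights = [row[:] for row in weights] + new_rows
    expanded_biases = list(biases) + [0.0] * num_new_classes
    return expanded_weights, expanded_biases


# A head that knows 2 classes over 2 features grows to 5 classes.
old_w = [[0.5, -0.2], [0.1, 0.3]]
old_b = [0.0, 0.1]
new_w, new_b = expand_head(old_w, old_b, num_new_classes=3)
print(len(new_w), len(new_b))
```

Copying the old rows unchanged means predictions for previously learned classes start from the same parameters after a new task arrives; retaining them during training is then the job of the replayed synthetic data.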
The final phase is **Collaborative Distillation and Feedback**. Here, the server aggregates knowledge from the diverse client models and its own historical checkpoints. This aggregation is performed using synthetic datasets, which represent both current-task and historical-task knowledge. This data-free knowledge distillation allows the server to continuously accumulate knowledge. Simultaneously, this distilled knowledge is fed back to the clients, guiding their model refinement and ensuring alignment with the global knowledge. For more in-depth technical details, you can refer to the original research paper.
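The distillation objective can be sketched for a single synthetic sample. This is a generic ensemble knowledge-distillation loss, assuming the teachers (client models and historical server checkpoints) are combined by averaging their temperature-softened probabilities and compared to the server model with a KL divergence; the exact loss used by FedDCL is not given in this summary, and all function names and logit values below are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_teacher(teacher_logits, temperature=1.0):
    """Average the softened class probabilities of all teachers."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    num_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(num_classes)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL between the averaged teacher distribution and the student."""
    target = ensemble_teacher(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(target, student)


# Logits from two hypothetical teachers and the server model on one synthetic sample.
teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
server = [0.0, 0.0, 0.0]
loss = distillation_loss(teachers, server)
print(loss)
```

In a full pipeline this loss would be averaged over batches of synthetic samples generated from the prototypes, so no real client data ever reaches the server.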
Experiments on two dataset groups, Grayscale (combining MNIST, EMNIST, and Fashion-MNIST) and RGB (from CIFAR-100), demonstrate the effectiveness of FedDCL. The framework consistently outperforms existing baselines across different settings, which suggests it can improve the generalizability and practical applicability of federated learning in dynamic and heterogeneous environments. For instance, on the Grayscale datasets, FedDCL achieved up to 9.00 percentage points higher cumulative accuracy than the second-best method, while also drastically reducing forgetting. Similar improvements were observed on the more challenging RGB datasets, with accuracy gains of up to 24.37% over the closest baseline.
FedDCL represents a significant step forward in federated continual learning, offering a robust and privacy-preserving solution for training server models in complex, real-world scenarios where data and models are constantly evolving.


