TLDR: Hat-DFed is a decentralized federated learning framework designed for edge computing. It optimizes the communication topology and model aggregation, improving test accuracy by 1.9% and reducing total energy consumption by 36.7% on average compared to existing methods, especially in environments with diverse devices and data. It achieves this by dynamically constructing communication topologies based on a ‘utility’ metric and using an importance-aware model aggregation mechanism.
Federated Learning (FL) has emerged as a key approach in artificial intelligence, allowing numerous edge devices to collaboratively train AI models without sharing their raw data, which helps preserve data privacy. This method is particularly relevant in edge computing (EC) systems, where data is generated and processed closer to its source, reducing transmission costs to centralized clouds. However, traditional FL often relies on a central server, which can become a bottleneck as the number of devices and model complexity grow.
To overcome these limitations, Decentralized Federated Learning (DFL) has gained significant attention. DFL leverages peer-to-peer (P2P) communication, enabling devices to learn collaboratively without a central server. While DFL offers scalability, its iterative nature incurs substantial costs, heavily influenced by dynamic changes in communication networks (topology) and the inherent differences (heterogeneity) in edge device resources and data.
Addressing Key Challenges in DFL
The design of an effective DFL framework for EC systems faces two primary challenges:
- System Heterogeneity: Edge devices possess diverse computational and communication capabilities. This disparity leads to imbalanced energy consumption, with less efficient devices expending more energy for identical tasks.
- Data Heterogeneity: Training data on edge devices often reflects local conditions, resulting in non-uniform (non-IID) distributions. Additionally, unpredictable device connectivity can cause data volume and distribution to vary over time, degrading model performance and affecting energy costs.
Existing solutions have typically focused on either improving model performance or reducing energy consumption in isolation, failing to address both simultaneously. This gap highlights the need for a comprehensive framework that can optimize both aspects.
Introducing Hat-DFed: A Novel Solution
To this end, researchers Yuze Liu, Tiehua Zhang, Zhishu Shen, Libing Wu, Shiping Chen, and Jiong Jin propose Hat-DFed, a heterogeneity-aware and cost-effective decentralized federated learning framework. Hat-DFed aims to maximize model performance while minimizing cumulative energy consumption in complex edge environments. The core innovation lies in formulating topology construction as a dual optimization problem, which has been proven to be NP-hard, meaning no efficient algorithm is known for solving it exactly.
To tackle this complex problem, Hat-DFed employs a two-phase algorithm:
Phase I: Utility-based Topology Construction (UTC)
This phase dynamically builds optimal communication networks. It introduces a novel ‘utility’ metric that quantifies the combined impact of each network connection on both model performance improvement and energy cost. At the beginning of each training round, the system estimates the utility of potential communication links based on historical data. This unbiased estimation guides the selection of links to form the most efficient and effective communication topology for the current round. A tuning parameter balances the exploration of new links with the exploitation of known high-utility links.
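To make the exploration-versus-exploitation idea concrete, here is a minimal Python sketch of per-round link selection. It is not the paper's estimator: it swaps in a generic UCB-style score (historical mean utility plus an exploration bonus), and the names `select_links`, `history`, `budget`, and `beta` are all hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def select_links(history, num_nodes, round_idx, budget, beta=1.0):
    """Pick `budget` undirected links with the highest optimistic score.

    `history` maps a link (i, j), i < j, to the utilities observed for it in
    earlier rounds. The UCB-style score is an assumed stand-in for the
    paper's utility estimator; `beta` plays the role of the tuning parameter.
    """
    scores = {}
    for link in combinations(range(num_nodes), 2):
        observed = history.get(link, [])
        if not observed:
            # Links never tried before get an infinite score: explore them first.
            scores[link] = math.inf
            continue
        mean_utility = sum(observed) / len(observed)
        # The bonus shrinks as a link accumulates observations.
        bonus = beta * math.sqrt(math.log(round_idx + 1) / len(observed))
        scores[link] = mean_utility + bonus
    # Highest score first; ties are broken by link index for determinism.
    ranked = sorted(scores, key=lambda link: (-scores[link], link))
    return ranked[:budget]


# Example: three edge servers, two links allowed this round.
history = {(0, 1): [0.8, 0.7], (0, 2): [0.1], (1, 2): [0.4, 0.5]}
chosen = select_links(history, num_nodes=3, round_idx=5, budget=2, beta=0.0)
```

With `beta=0.0` the selection is purely exploitative and ranks links by their historical mean. Raising `beta` favors links that have been observed less often, which is one common way to trade exploration against exploitation.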
Phase II: Decentralized Collaborative Model Update (DCMU)
Once the communication network is established, edge servers perform local training, exchange model parameters with their neighbors according to the new topology, and then aggregate these models. A crucial component here is the ‘importance-aware model aggregation’ mechanism. This mechanism addresses data heterogeneity by evaluating the ‘importance’ of models received from neighboring servers. Models that are more ‘important’ (e.g., those with higher loss on sampled data, indicating more to learn or better fit to diverse data) and those trained on larger datasets are given greater weight during aggregation. This prevents the biases caused by non-uniform data from degrading the overall model performance.
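The aggregation step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the paper's formula: the weight of each model is taken to be proportional to its loss on sampled data multiplied by its training-set size, and the function names `aggregation_weights` and `aggregate` are hypothetical.

```python
def aggregation_weights(losses, data_sizes):
    """Normalized weights proportional to (sampled loss x dataset size).

    The product form is an assumption made for illustration; the article only
    states that higher loss and larger datasets lead to greater weight.
    """
    raw = [loss * size for loss, size in zip(losses, data_sizes)]
    total = sum(raw)
    if total == 0:
        # Fall back to uniform weights when every raw score is zero.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def aggregate(models, losses, data_sizes):
    """Weighted average of flat parameter vectors (lists of floats)."""
    weights = aggregation_weights(losses, data_sizes)
    dim = len(models[0])
    return [sum(w * m[d] for w, m in zip(weights, models)) for d in range(dim)]


# Example: two neighbor models with equal loss but different dataset sizes.
models = [[1.0, 0.0], [0.0, 1.0]]
merged = aggregate(models, losses=[1.0, 1.0], data_sizes=[300, 100])
```

In this example the first model was trained on three times as much data, so it receives a weight of 0.75 against 0.25 for the second.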
Performance and Impact
Extensive experiments conducted on datasets like Fashion-MNIST and CIFAR-10 demonstrate that Hat-DFed significantly outperforms state-of-the-art baselines. On average, it achieves a 1.9% improvement in test accuracy while reducing total energy cost by 36.7% throughout the learning process. The framework shows particular strength in environments with more severe data heterogeneity.
The research also explored the impact of network sparsity, finding an optimal balance where too few connections hurt accuracy, and too many unnecessarily increase energy consumption. Hat-DFed’s scalability was also analyzed, showing acceptable performance even as the number of edge servers increases, despite the amplified data heterogeneity. An ablation study confirmed that both the utility-based topology construction and the importance-aware model aggregation modules are critical for Hat-DFed’s superior performance.
In conclusion, Hat-DFed offers a robust and efficient solution for deploying decentralized federated learning in heterogeneous edge computing environments. By intelligently optimizing communication topologies and model aggregation, it paves the way for more sustainable and high-performing AI applications at the edge. For more technical details, refer to the full research paper.


