TLDR: DAG-AFL is a new federated learning framework that uses a Directed Acyclic Graph (DAG) to enable efficient and accurate asynchronous training, especially for resource-limited devices. It addresses challenges like varying device capabilities and diverse data distributions by employing a smart “tip selection” algorithm based on data freshness, network reachability, and model accuracy. The system also includes a verification mechanism to ensure data integrity. Experiments show DAG-AFL significantly improves training efficiency and model accuracy compared to existing methods, while also offering better throughput and lower latency in blockchain environments.
Federated Learning (FL) is a powerful approach that allows machine learning models to be trained across multiple decentralized devices or servers holding local data samples, without exchanging the data itself. This is crucial for privacy, especially with regulations like GDPR. However, FL faces significant hurdles, particularly device asynchrony and data heterogeneity. Device asynchrony means that different devices have varying computational power, communication conditions, and battery levels, making it difficult to coordinate training. Data heterogeneity refers to differences in data distributions across devices, which can reduce efficiency and accuracy of the aggregated model.
Traditional blockchain-based FL methods have emerged as a promising solution for decentralized, scalable, and secure FL. However, many of these methods rely on consensus mechanisms similar to Proof of Work (PoW), which consume substantial resources and can hinder FL efficiency, especially for wireless and resource-limited devices. This is where the new research introduces an innovative framework.
Researchers have proposed a novel framework called Directed Acyclic Graph-based Asynchronous Federated Learning (DAG-AFL). This framework is specifically designed to address the challenges of asynchronous client participation and data heterogeneity in FL, while also limiting the additional resource overhead typically introduced by blockchain technology. DAG-AFL leverages the unique properties of Directed Acyclic Graphs (DAGs), which allow for parallel transaction processing, higher throughput, and reduced confirmation times compared to traditional linear blockchains.
The DAG-AFL framework defines two main roles: the Task Publisher and the Task Trainer. The Task Publisher initiates and oversees the FL process, providing an initial global model on the DAG and monitoring node status. Importantly, it doesn't train models itself but focuses on real-time coordination. The Task Trainer, on the other hand, performs local training and helps maintain the DAG. It selects suitable "tips" (unapproved transactions) to retrieve metadata, aggregates models from other trainers, and publishes its newly trained model and metadata to the DAG. This iterative process continues until the desired accuracy is achieved or a set number of iterations is reached.
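The trainer's iteration described above can be sketched as a short loop. This is a minimal illustration, not the paper's actual implementation: the helper callables `select_tips`, `aggregate`, and `train`, and the dictionary fields, are all assumed names for the sake of the example.

```python
def trainer_round(dag, local_data, select_tips, aggregate, train, k=2):
    """One Task Trainer iteration (illustrative sketch).

    Assumed helpers (names are hypothetical, not from the paper):
      select_tips(dag, k) -> k tip records chosen from the DAG
      aggregate(models)   -> a base model merged from the tips' models
      train(model, data)  -> the model after one round of local training
    """
    tips = select_tips(dag, k)                    # pick unapproved transactions
    base = aggregate([t["model"] for t in tips])  # merge the retrieved models
    updated = train(base, local_data)             # local training step
    dag.append({"model": updated, "approves": tips})  # publish as a new tip
    return updated
```

Each published tip approves the tips it aggregated from, which is what grows the graph and lets later trainers build on earlier results.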
A core innovation within DAG-AFL is its tip selection algorithm. This algorithm considers three crucial factors to improve local model training accuracy and overall efficiency: temporal freshness, tip reachability, and model accuracy. Temporal freshness prioritizes tips that are closer to the current time, as they generally reflect more accurate model states. Tip reachability indicates whether a tip has directly or indirectly integrated results from previous aggregation models, suggesting similar data distributions. To avoid converging to a local optimum, the algorithm balances the selection of reachable and unreachable tips. Finally, model accuracy is assessed efficiently by assigning a "feature signature" to each tip, allowing the system to quickly identify nodes with similar data distributions without resource-intensive accuracy verification of all tips.
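One way the three factors could be combined is as a weighted score per tip, with the top-scoring tips selected. This is a hedged sketch, not the paper's actual formula: the weights, the exponential freshness decay, the partial credit for unreachable tips, and the tip fields (`timestamp`, `reachable`, `signature_sim`) are all assumptions made for illustration.

```python
import time

def score_tip(tip, now, w_fresh=0.4, w_reach=0.3, w_acc=0.3):
    """Combine the three selection factors into one score (illustrative).

    tip is a dict with hypothetical fields:
      'timestamp'     - creation time of the tip, in seconds
      'reachable'     - True if the tip (in)directly integrates our past aggregates
      'signature_sim' - feature-signature similarity to our data, in [0, 1]
    """
    # Temporal freshness: newer tips score higher (assumed 60 s half-life decay).
    freshness = 0.5 ** ((now - tip["timestamp"]) / 60.0)
    # Reachability: reachable tips suggest similar data distributions, but
    # unreachable tips keep partial credit so selection can escape local optima.
    reach = 1.0 if tip["reachable"] else 0.5
    # Accuracy proxy: signature similarity avoids verifying every tip's accuracy.
    return w_fresh * freshness + w_reach * reach + w_acc * tip["signature_sim"]

def select_tips(tips, k=2, now=None):
    """Pick the k highest-scoring unapproved transactions (tips)."""
    now = time.time() if now is None else now
    return sorted(tips, key=lambda t: score_tip(t, now), reverse=True)[:k]
```

The key design point survives even if the exact weights differ: freshness and reachability are cheap to compute locally, and the feature signature stands in for full accuracy verification.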
To ensure the integrity and trustworthiness of the DAG, DAG-AFL incorporates a DAG verification strategy. This mechanism prevents the task publisher from tampering with the overall DAG structure. Trainers store specific validation paths and compare stored hash values with the actual data state to confirm authenticity. Each tip’s hash includes references to previous tips and the hash of its uploaded metadata, ensuring an ordered, traceable, and immutable record of model updates.
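The hash-chaining idea behind this verification can be shown in a few lines. A minimal sketch, assuming SHA-256 over a canonical JSON encoding and a `(metadata, parent_hashes, stored_hash)` record layout, none of which is specified by the paper:

```python
import hashlib
import json

def tip_hash(metadata, parent_hashes):
    """Hash a tip over its metadata and the hashes of the tips it approves.

    Because every hash embeds its parents' hashes, tampering with any earlier
    tip changes all descendant hashes, keeping the DAG traceable and immutable.
    """
    payload = json.dumps(
        {"meta": metadata, "parents": sorted(parent_hashes)}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def verify_path(path):
    """Re-derive each stored hash along a validation path and compare.

    path is a list of (metadata, parent_hashes, stored_hash) tuples, as a
    trainer might persist them to later spot-check the publisher's DAG.
    """
    return all(tip_hash(meta, parents) == h for meta, parents, h in path)
```

Any edit to a tip's metadata or its parent references makes the recomputed hash diverge from the stored one, so a trainer holding even a short validation path can detect tampering.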
Extensive experiments were conducted on three benchmark datasets (MNIST, CIFAR-10, and CIFAR-100) against eight state-of-the-art approaches. The results demonstrate that DAG-AFL significantly improves training efficiency and model accuracy: on average, it boosts training efficiency by 22.7% and model accuracy by 6.5%. In terms of blockchain performance, DAG-AFL achieved superior throughput for uploading updated models and querying global models, along with lower latency, especially compared to other blockchain-based FL systems such as BlockFL and BFLC. This advantage stems from DAG-AFL's approach of uploading only metadata, which significantly reduces communication overhead.
In conclusion, DAG-AFL presents a robust and efficient solution for asynchronous federated learning, effectively tackling challenges posed by device asynchrony and data heterogeneity. By integrating a DAG-based structure with an intelligent tip selection mechanism and a strong verification strategy, it offers a promising path for more scalable and secure decentralized machine learning, particularly for resource-constrained edge devices. For more technical details, you can refer to the full research paper: DAG-AFL: Directed Acyclic Graph-based Asynchronous Federated Learning.