TLDR: BikeVAE-GNN is a novel AI framework that combines a Hybrid Graph Neural Network with a Variational Autoencoder to accurately estimate Average Daily Bicycle counts and classify traffic levels. It addresses the challenge of extremely sparse bicycle count data in urban networks by generating synthetic data and effectively modeling spatial relationships. Demonstrated in Melbourne, the model significantly outperforms existing methods, offering crucial insights for urban and transport planning.
Accurate estimation of bicycle volumes on urban road segments is crucial for effective urban and transport planning, helping to reduce traffic congestion, lower carbon emissions, and improve public health. However, this task faces a significant challenge: extremely sparse count data in urban bicycling networks worldwide. Many cities, including Melbourne, Australia, have recorded bicycle counts for fewer than 1% of their road segments, which leaves roughly 99% of the network without any observations.
Traditional methods, such as statistical and machine learning models, struggle in such data-scarce environments because they rely on abundant labeled data and often fail to capture the complex spatial relationships within transportation networks. Even recent advancements in Graph Neural Networks (GNNs), which are effective for motorized traffic prediction by exploiting road network connectivity, are limited by this extreme data sparsity when applied to bicycle networks.
Introducing BikeVAE-GNN
To address this critical gap, researchers have developed BikeVAE-GNN, a novel dual-task framework that augments a Hybrid Graph Neural Network (GNN) with a Variational Autoencoder (VAE). This innovative approach aims to accurately estimate Average Daily Bicycle (ADB) counts and categorize bicycling traffic levels in extremely sparse bicycle networks.
The BikeVAE-GNN framework tackles data sparsity through two main components:
- Hybrid-GNN: This component integrates three GNN variants: Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), and GraphSAGE. GCN captures local neighborhood patterns, GAT assigns adaptive weights to neighbors based on their relevance, and GraphSAGE enables inductive learning through neighborhood sampling. By combining these, the Hybrid-GNN models intricate spatial relationships and multi-scale patterns in sparse networks.
- VAE-based Data Augmentation: The Variational Autoencoder generates synthetic nodes and edges, enriching the graph structure. This process involves learning the distribution of existing node features to create new, realistic data points. These synthetic data points are then integrated into the training set, significantly increasing the effective training size and enhancing the model’s ability to generalize and perform robustly despite the initial sparsity.
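The three aggregation styles named in the first component can be illustrated on a toy graph. The following is a minimal sketch in plain Python, not the authors' implementation: the graph, the feature vectors, and all function names are hypothetical, there are no learned weights, dot-product similarity stands in for GAT's learned attention, and concatenation is only one assumed way of combining the three views in parallel.

```python
import math
import random

# Toy road graph: node -> neighbouring nodes (hypothetical data).
GRAPH = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
# One feature vector per node (hypothetical, e.g. scaled road attributes).
FEATURES = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0], 3: [0.5, 0.5]}


def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def gcn_style(node):
    """GCN-like view: plain average over the node and its neighbours."""
    return mean_vectors([FEATURES[n] for n in [node] + GRAPH[node]])


def gat_style(node):
    """GAT-like view: neighbours weighted by a softmax over similarity."""
    neighbours = GRAPH[node]
    scores = [
        sum(a * b for a, b in zip(FEATURES[node], FEATURES[n]))
        for n in neighbours
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(FEATURES[node])
    return [
        sum(w * FEATURES[n][i] for w, n in zip(weights, neighbours))
        for i in range(dim)
    ]


def sage_style(node, sample_size, rng):
    """GraphSAGE-like view: average over a fixed-size neighbour sample."""
    neighbours = GRAPH[node]
    sampled = rng.sample(neighbours, min(sample_size, len(neighbours)))
    return mean_vectors([FEATURES[n] for n in sampled])


def hybrid_embedding(node, sample_size=2, seed=0):
    """Concatenate the three views into a single embedding."""
    rng = random.Random(seed)
    return gcn_style(node) + gat_style(node) + sage_style(node, sample_size, rng)
```

Calling `hybrid_embedding(2)` returns a six-value vector, two values per view. In a trained model each view would pass through learned layers before being combined.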
BikeVAE-GNN performs two tasks simultaneously: regression for estimating continuous bicycling volumes and classification for categorizing bicycling traffic levels (e.g., Very Low, Low, Medium, High, Very High Traffic).
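To show how the two targets relate, the sketch below bins a continuous ADB value into one of the five named levels. The cut-off values are purely hypothetical, since the article does not state them, and the function names are illustrative. In the framework itself the class is predicted by the model, not derived from the regression output, so this only illustrates how class labels could be defined from counts.

```python
# Hypothetical upper bounds (bicycles per day) for each traffic level.
# The article names the five levels but does not give the cut-offs.
LEVEL_BOUNDS = [
    (50, "Very Low"),
    (200, "Low"),
    (500, "Medium"),
    (1000, "High"),
]
TOP_LEVEL = "Very High"


def traffic_level(adb_count):
    """Map a continuous ADB value to one of five traffic levels."""
    for upper_bound, label in LEVEL_BOUNDS:
        if adb_count < upper_bound:
            return label
    return TOP_LEVEL


def dual_task_targets(adb_count):
    """Return the regression target and the class label side by side."""
    return {"adb": adb_count, "level": traffic_level(adb_count)}
```

With these assumed thresholds, `dual_task_targets(300)` pairs the count 300 with the level "Medium".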
Real-World Application and Performance
The effectiveness of BikeVAE-GNN was demonstrated using OpenStreetMap data and publicly available bicycle count data within the City of Melbourne. This dataset is particularly challenging, with only 141 out of 15,933 road segments having labeled counts, highlighting the severe data sparsity.
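The sparsity figure follows directly from the two segment counts quoted above. A short check, with variable and function names chosen only for illustration:

```python
LABELED_SEGMENTS = 141    # segments with bicycle counts (from the article)
TOTAL_SEGMENTS = 15_933   # all road segments (from the article)


def labeled_share(labeled, total):
    """Fraction of segments that carry an observed count."""
    return labeled / total


share = labeled_share(LABELED_SEGMENTS, TOTAL_SEGMENTS)
sparsity = 1.0 - share
print(f"labeled: {share:.2%}, unlabeled: {sparsity:.2%}")
# prints "labeled: 0.88%, unlabeled: 99.12%"
```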
The experiments showed that BikeVAE-GNN significantly outperforms traditional machine learning models and baseline GNN models. Specifically, the HybridParallelGNN + VAE configuration achieved a mean absolute error (MAE) of 30.82 bicycles per day, an accuracy of 99%, and an F1-score of 0.99. This represents a substantial improvement, reducing MAE by 66.7% compared to the best machine learning baseline (Random Forest) and by 70.7% compared to the best GNN baseline (GraphSAGE).
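For readers less familiar with the metric, the sketch below shows how MAE and a relative reduction are computed. The helper names are illustrative, and `implied_baseline` only back-calculates what a baseline MAE would have to be for a stated reduction; the article does not report the baseline values themselves.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and estimated counts."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_mae, model_mae):
    """Share of the baseline error removed by the model (0.667 = 66.7%)."""
    return (baseline_mae - model_mae) / baseline_mae


def implied_baseline(model_mae, reduction):
    """Baseline MAE implied by a model MAE and a relative reduction."""
    return model_mae / (1.0 - reduction)
```

Applied to the reported figures, `implied_baseline(30.82, 0.667)` gives roughly 92.6 and `implied_baseline(30.82, 0.707)` roughly 105.2 bicycles per day. These are estimates derived from rounded percentages, not values taken from the paper.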
Ablation studies further validated the crucial role of both the Hybrid-GNN and VAE components. Removing the VAE augmentation alone led to a significant degradation in performance, increasing the MAE by 46.3%, underscoring its importance in generating robust data representations for sparse networks. The contributions of GAT and GraphSAGE within the hybrid architecture were also confirmed, showing their complementary roles in feature aggregation.
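To make the augmentation step that the ablation removes more concrete, here is a deliberately simplified stand-in. It is not a VAE: instead of learned encoder and decoder networks it fits an independent Gaussian to each feature, then draws synthetic nodes with the same mean-plus-scaled-noise sampling that a VAE uses in its latent space. The feature values, the nearest-neighbour rule for attaching edges, and all names are assumptions made for illustration.

```python
import math
import random
import statistics

# Hypothetical feature vectors of labeled segments (e.g. scaled attributes).
REAL_NODES = [[0.2, 1.0], [0.4, 0.8], [0.6, 1.2], [0.8, 1.0]]


def fit_gaussian(nodes):
    """Per-feature mean and standard deviation of the real nodes."""
    columns = list(zip(*nodes))
    means = [statistics.fmean(column) for column in columns]
    stds = [statistics.pstdev(column) for column in columns]
    return means, stds


def sample_node(means, stds, rng):
    """Reparameterised draw: value = mean + std * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(means, stds)]


def nearest_real(node, real_nodes):
    """Index of the closest real node, used to attach a synthetic edge."""
    return min(
        range(len(real_nodes)),
        key=lambda i: math.dist(node, real_nodes[i]),
    )


def augment(real_nodes, n_synthetic, seed=0):
    """Return real plus synthetic nodes, and edges linking each new node."""
    rng = random.Random(seed)
    means, stds = fit_gaussian(real_nodes)
    nodes = [list(node) for node in real_nodes]
    edges = []
    for _ in range(n_synthetic):
        new_node = sample_node(means, stds, rng)
        edges.append((len(nodes), nearest_real(new_node, real_nodes)))
        nodes.append(new_node)
    return nodes, edges
```

Calling `augment(REAL_NODES, 3)` yields seven nodes and three edges, each edge tying a synthetic node to its most similar real node. A trained VAE would replace `fit_gaussian` and `sample_node` with learned networks.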
Implications for Urban Planning
BikeVAE-GNN has significant implications for urban mobility planning. By enabling accurate prediction of bicycle volumes and traffic levels, it can inform infrastructure optimization, improve cyclist safety policies, and support the development of more sustainable bicycling infrastructure. The research advances bicycling volume estimation in sparse networks by pairing a hybrid GNN with VAE-based data augmentation.
For more detailed information, refer to the full research paper.


