TLDR: This research introduces two AI models, Adaptive Navigation (AN) and Hierarchical Hub-based Adaptive Navigation (HHAN), to combat urban traffic congestion. AN uses decentralized AI agents at every intersection with Graph Attention Networks for local coordination. HHAN scales this to large cities by placing agents only at key “hubs” and coordinating them with an Attentive Q-Mixing framework. Both models significantly reduce travel times and maintain 100% routing success, offering a scalable, coordinated, and congestion-aware solution for intelligent transportation systems.
Urban traffic congestion is a persistent challenge, leading to longer travel times, increased pollution, and driver frustration. Traditional navigation systems, often relying on the Shortest Path First (SPF) algorithm, are effective for single vehicles in static conditions. However, in dynamic, multi-vehicle environments, SPF can worsen congestion by directing many vehicles onto the same “shortest” path, quickly overwhelming road capacity.
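The herding effect described above can be made concrete with a small worked example. This sketch assumes two parallel routes and uses the Bureau of Public Roads (BPR) volume-delay function with its standard coefficients (0.15 and 4). The routes, capacities, and vehicle counts are hypothetical illustration values, and the BPR model is an assumption here, not something taken from the paper.

```python
def bpr_travel_time(free_flow_min: float, volume: float, capacity: float) -> float:
    """BPR volume-delay function: travel time grows with (volume / capacity) ** 4."""
    return free_flow_min * (1.0 + 0.15 * (volume / capacity) ** 4)


def average_travel_time(split, routes):
    """Mean travel time when split[i] vehicles use routes[i] = (free_flow_min, capacity)."""
    total_vehicles = sum(split)
    total_minutes = sum(
        n * bpr_travel_time(free_flow, n, capacity)
        for n, (free_flow, capacity) in zip(split, routes)
    )
    return total_minutes / total_vehicles


# Hypothetical routes: (free-flow minutes, capacity in vehicles).
routes = [(10.0, 50.0), (12.0, 50.0)]

# SPF-style behavior: all 100 vehicles pick the route with the lowest free-flow time.
all_on_shortest = average_travel_time([100, 0], routes)

# Coordinated behavior: the same 100 vehicles are split evenly across both routes.
balanced = average_travel_time([50, 50], routes)

print(f"all on shortest: {all_on_shortest:.2f} min, balanced: {balanced:.2f} min")
```

Under these assumed numbers, sending everyone down the nominally shortest route is far slower on average than an even split, which is the failure mode that motivates coordinated routing.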
Three researchers at York University, Fazel Arasteh, Arian Haghparast, and Manos Papagelis, have developed an approach to this problem based on multi-agent reinforcement learning (MARL). Their paper, “Network-Constrained Policy Optimization for Adaptive Multi-agent Vehicle Routing,” introduces two models, Adaptive Navigation (AN) and Hierarchical Hub-based Adaptive Navigation (HHAN), both designed to produce coordinated, network-aware fleet navigation.
Adaptive Navigation: Decentralized Intelligence at Every Intersection
The first model, Adaptive Navigation (AN), proposes a decentralized system where each intersection in a road network is assigned an intelligent agent. When a vehicle approaches an intersection, its agent provides routing guidance based on two key pieces of information: the vehicle’s final destination and the current local traffic conditions. To achieve coordination among these individual intersection agents, AN utilizes Graph Attention Networks (GAT). These networks allow agents to share and process information about their immediate neighborhood’s traffic state, enabling them to make more informed decisions that implicitly coordinate with nearby agents.
This approach moves beyond simply finding the shortest path; instead, agents learn to distribute traffic more intelligently across the network, anticipating and preventing congestion before it occurs. Experiments on synthetic grids and a realistic abstracted map of Downtown Toronto showed that AN significantly reduced average travel time compared to traditional SPF and other learning baselines, all while maintaining a 100% success rate for vehicles reaching their destinations.
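To illustrate how an intersection agent might weigh its neighbors' traffic state, here is a minimal single-head graph-attention update in plain Python. It follows the generic GAT formulation (score each neighbor, softmax the scores, take the weighted sum) and is only a sketch: the feature layout, weight matrix `W`, and attention vector `a` are hypothetical placeholders, not the architecture or parameters used in the paper.

```python
import math


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_aggregate(i, features, neighbors, W, a):
    """One single-head graph-attention update for intersection i.

    features:  {node: [traffic features]}, e.g. queue length and mean speed (assumed)
    neighbors: {node: [adjacent nodes]}
    Returns (new embedding for i, attention weight per attended node).
    """
    # Project the node itself and each neighbor with the shared matrix W.
    z = {j: matvec(W, features[j]) for j in [i] + neighbors[i]}
    # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j]).
    scores = {
        j: leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j]))) for j in z
    }
    # Softmax over the neighborhood (shifted by the max for numerical stability).
    top = max(scores.values())
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    norm = sum(exps.values())
    alpha = {j: e / norm for j, e in exps.items()}
    # Attention-weighted sum of the projected features.
    dim = len(z[i])
    embedding = [sum(alpha[j] * z[j][k] for j in z) for k in range(dim)]
    return embedding, alpha


# Hypothetical three-intersection example with features [queue_length, mean_speed].
features = {"n0": [2.0, 30.0], "n1": [9.0, 5.0], "n2": [1.0, 40.0]}
neighbors = {"n0": ["n1", "n2"], "n1": ["n0"], "n2": ["n0"]}
W = [[1.0, 0.0], [0.0, 0.1]]   # placeholder projection, normally learned
a = [0.1, 0.0, 0.3, -0.2]      # placeholder attention vector, normally learned

embedding, alpha = gat_aggregate("n0", features, neighbors, W, a)
print(alpha)
```

In a trained model `W` and `a` would be learned, so the weights in `alpha` would reflect which neighboring intersections matter most for the routing decision at hand.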
Hierarchical Hub-based Adaptive Navigation: Scaling Up for Large Cities
While AN is effective for small to medium-sized networks, deploying an agent at every intersection in a city the size of Manhattan would be computationally impractical. To address this scalability problem, the researchers introduced Hierarchical Hub-based Adaptive Navigation (HHAN). This extension of AN places agents only at a selected subset of critical intersections, known as “hubs.”
In HHAN, a vehicle’s journey is broken down into segments between these hubs. Hub agents control the high-level, hub-to-hub routing, while the traditional SPF algorithm handles the micro-routing within each hub’s local region. To coordinate the hub agents, HHAN uses a framework called Attentive Q-Mixing (A-QMIX). It follows the centralized-training, decentralized-execution pattern: during training, agents learn from a global view of the network, while at run time each agent still makes its decisions independently in real time. A-QMIX also aggregates asynchronous vehicle decisions with an attention mechanism, so that routing choices at busy hubs are weighted appropriately.
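The two-level split described above (hub agents choose the hub sequence, SPF fills in the streets between hubs) can be sketched as follows. This is an illustrative assumption about how such a route could be stitched together, not the paper's implementation: the hub sequence is passed in as a plain list standing in for the learned hub policy, and `stitch_route` and the toy graph are hypothetical.

```python
import heapq


def shortest_path(graph, src, dst):
    """Dijkstra over graph = {node: {neighbor: travel_time}}; returns the node list."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        raise ValueError(f"no path from {src} to {dst}")
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def stitch_route(graph, origin, hub_sequence, destination):
    """Join SPF segments origin -> hub_1 -> ... -> hub_k -> destination.

    hub_sequence stands in for the decisions a learned hub policy would make.
    """
    waypoints = [origin, *hub_sequence, destination]
    route = [origin]
    for start, end in zip(waypoints, waypoints[1:]):
        route.extend(shortest_path(graph, start, end)[1:])  # skip the repeated start node
    return route


# Toy road network with travel times in minutes (hypothetical values).
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}
print(stitch_route(graph, "A", ["C"], "D"))
```

The point of the decomposition is that only the hub sequence needs to be learned; each segment between consecutive waypoints is an ordinary shortest-path query.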
HHAN was evaluated on large networks, including an abstracted map of Manhattan with hundreds of intersections. Under heavy traffic, it reduced average travel time by up to 15.9% compared to adaptive baselines. Like AN, it maintained a 100% routing success rate, indicating that it can avoid gridlock even under heavy load.
Underlying Innovations and Future Impact
Beyond the core models, the research also introduced a clever method for representing vehicle destinations using a Z-order curve. This technique helps neural networks understand spatial relationships more effectively, contributing to better routing decisions. The success of AN and HHAN underscores the power of network-constrained multi-agent reinforcement learning for creating scalable, coordinated, and congestion-aware routing solutions for intelligent transportation systems.
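For readers unfamiliar with Z-order (Morton) curves, the idea is to interleave the bits of two coordinates into one integer, so that cells close together on the map tend to get nearby codes. The sketch below assumes destinations are first snapped to integer grid cells. How the paper discretizes the map and feeds the code to the network is not described here, so the function name and grid are hypothetical.

```python
def z_order_code(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into a single Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x bits occupy even positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y bits occupy odd positions
    return code


# Walk a hypothetical 4x4 grid of destination cells row by row.
for y in range(4):
    print([z_order_code(x, y) for x in range(4)])
```

Each 2x2 block of cells receives four consecutive codes, which is the locality property that makes the encoding a compact one-dimensional stand-in for a two-dimensional position.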
This research paves the way for next-generation traffic management that can adapt to real-time conditions, optimize urban traffic flow without costly infrastructure expansion, and contribute to more sustainable and resilient cities. The open-source code for this research is also publicly available, fostering further development and application in the field.


