Optimizing Urban Mobility: A New Multi-Agent Reinforcement Learning Approach for Resource Allocation

TLDR: HAG-PS is a new multi-agent reinforcement learning system designed to dynamically allocate urban mobility resources such as shared bikes. It uses a hierarchical structure, adaptive agent grouping, and learnable identity embeddings to share policies efficiently and manage resources in large city environments. Tested on NYC bike-sharing data, HAG-PS achieved a higher fulfilled service ratio and rebalanced more bikes than the baseline methods it was compared against.

Urban environments face a constant challenge in balancing the demand and supply of mobility resources like shared bikes, e-scooters, and ride-sharing vehicles. Efficient allocation of these resources is vital for smooth urban mobility. Traditional methods often struggle with the dynamic nature of city environments and the sheer scale of operations.

A new research paper introduces Hierarchical Adaptive Grouping-based Parameter Sharing (HAG-PS), an approach that tackles these issues using multi-agent reinforcement learning (MARL). The core idea is to let regional coordinators (agents) share policies for distributing mobility resources dynamically and adaptively, while keeping memory use low enough for city-wide deployment.

Addressing Key Challenges

The researchers identified two primary challenges in applying MARL to mobility resource allocation:

  • How to dynamically and adaptively share the mobility resource allocation policy among various coordinating agents.
  • How to achieve scalable and memory-efficient parameter sharing in a large urban setting.

HAG-PS addresses these by incorporating several innovative designs. It uses a hierarchical approach that considers both global and local information about mobility resource states, such as their distribution across different regions. This allows for more dynamic and adaptive policy sharing. Furthermore, the system employs an adaptive agent grouping mechanism that can split or merge groups of agents based on how similar their encoded trajectories (states, actions, and rewards) are. This ensures that agents with similar needs or behaviors can share policies effectively. To allow for individual agent specialization beyond simple policy copying, HAG-PS also includes learnable identity (ID) embeddings for each agent.
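As a rough illustration of the identity-embedding idea, the sketch below shares one weight matrix across all agents and appends a per-agent ID vector to each observation, so agents with identical observations can still produce different action scores. The class name `SharedPolicy`, the dimensions, and the single linear scoring layer are assumptions made for this example; the article does not describe the paper's actual network architecture, and in a real system both the weights and the embeddings would be trained rather than left at their random initial values.

```python
import random


class SharedPolicy:
    """Hypothetical shared linear policy with one ID embedding per agent."""

    def __init__(self, n_agents, obs_dim, id_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # One small vector per agent (learnable in a real system).
        self.id_emb = [
            [rng.gauss(0.0, 0.1) for _ in range(id_dim)] for _ in range(n_agents)
        ]
        # A single weight matrix shared by every agent in the group.
        self.w = [
            [rng.gauss(0.0, 0.1) for _ in range(obs_dim + id_dim)]
            for _ in range(n_actions)
        ]

    def logits(self, agent_id, obs):
        """Score each action from the observation plus the agent's ID vector."""
        x = list(obs) + self.id_emb[agent_id]
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]


policy = SharedPolicy(n_agents=2, obs_dim=3, id_dim=4, n_actions=5)
obs = [1.0, 0.5, -0.2]
scores_agent_0 = policy.logits(0, obs)
scores_agent_1 = policy.logits(1, obs)
```

The shared matrix grows with the observation and embedding sizes, not with the number of agents; each additional agent only adds one `id_dim`-length vector.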

How HAG-PS Works

The system discretizes the urban service area into numerous rectangular regions and divides the time horizon into intervals. Each agent, acting as a re-allocator, manages resources within a specific region at each time interval. The system considers a global state, which includes temporal information (time of day, day of week), distribution of available resources, historical pickup statistics, and urban environment features like roads and points of interest.
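The discretization step can be pictured with a small sketch that maps a location and a timestamp to a region index and a time interval. The `Grid` class, planar kilometre coordinates, and the 60-minute default interval are assumptions chosen for illustration; the article does not specify the paper's coordinate system or interval length.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grid:
    """Hypothetical rectangular grid over the service area, in planar km."""

    origin_x_km: float
    origin_y_km: float
    cell_km: float
    rows: int
    cols: int

    def region_of(self, x_km: float, y_km: float) -> Optional[int]:
        """Return the row-major region index, or None if outside the grid."""
        col = int((x_km - self.origin_x_km) // self.cell_km)
        row = int((y_km - self.origin_y_km) // self.cell_km)
        if 0 <= row < self.rows and 0 <= col < self.cols:
            return row * self.cols + col
        return None


def interval_of(minute_of_day: int, interval_minutes: int = 60) -> int:
    """Map a minute of the day to its time-interval index."""
    return minute_of_day // interval_minutes


grid = Grid(origin_x_km=0.0, origin_y_km=0.0, cell_km=1.0, rows=2, cols=3)
inside = grid.region_of(2.5, 1.5)
outside = grid.region_of(-0.1, 0.5)
slot = interval_of(125)
```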

Agents determine actions by deciding how many mobility resources to relocate to adjacent regions (north, south, east, and west). The system then updates resource availability based on these actions, pickup requests, and drop-offs. A reward function guides the learning process, favoring high service ratios (fulfilled demand), penalizing unfulfilled demand, and discouraging excessive relocation costs.
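One time step of this process might look like the following sketch: relocate bikes to adjacent regions, serve pickups from what is available, add drop-offs, and compute a reward. The function names, the clipping of infeasible moves, and the reward weights `w_unmet` and `w_move` are assumptions for this example; the article names the three reward ingredients but not the paper's exact formula.

```python
OFFSETS = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}


def relocate(bikes, moves):
    """Apply moves {(row, col): {direction: count}} to a 2D grid of counts.

    Moves are applied in iteration order, clipped to the bikes currently in
    the source region, and skipped if they would leave the grid.
    """
    rows, cols = len(bikes), len(bikes[0])
    new = [row[:] for row in bikes]
    moved = 0
    for (r, c), by_direction in moves.items():
        for direction, count in by_direction.items():
            dr, dc = OFFSETS[direction]
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            count = min(count, new[r][c])
            new[r][c] -= count
            new[nr][nc] += count
            moved += count
    return new, moved


def serve(bikes, pickups, dropoffs):
    """Serve pickup requests from available bikes, then add drop-offs."""
    new = [row[:] for row in bikes]
    served = unmet = 0
    for r, row in enumerate(new):
        for c in range(len(row)):
            fulfilled = min(new[r][c], pickups[r][c])
            served += fulfilled
            unmet += pickups[r][c] - fulfilled
            new[r][c] += dropoffs[r][c] - fulfilled
    return new, served, unmet


def reward(served, unmet, moved, w_unmet=1.0, w_move=0.1):
    """Service ratio minus weighted unmet demand and relocation cost."""
    demand = served + unmet
    ratio = served / demand if demand else 1.0
    return ratio - w_unmet * unmet - w_move * moved


bikes_after_move, moved = relocate([[3, 0]], {(0, 0): {"east": 2}})
bikes_after_serve, served, unmet = serve(bikes_after_move, [[2, 1]], [[0, 1]])
step_reward = reward(served, unmet, moved)
```

In this toy run the left region ends up one bike short of its demand, so the reward is pulled down by both the unmet request and the two relocated bikes.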

The hierarchical adaptive grouping is central to HAG-PS. It dynamically assigns roles to agents by forming global groups for macro-coordination (e.g., for a district) and local groups for micro-coordination (e.g., for neighborhoods). Agents within a global group share a feature network, while local groups maintain compact actor-critic networks. After each learning episode, agents encode their recent trajectories, and these embeddings are used to decide whether to split or merge groups. For instance, if agents within a group become too dissimilar in their behavior, the group might split. Conversely, similar groups might merge. The system also adaptively adjusts how frequently these regrouping operations occur, making them less frequent when the system’s behavior stabilizes.
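The split, merge, and adaptive-period logic can be sketched as follows. This is a simplified stand-in, not the paper's procedure: it assumes trajectory embeddings are already computed, uses cosine similarity to a group centroid, and picks the thresholds `split_below` and `merge_above` and the period-doubling rule arbitrarily for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def regroup(groups, emb, split_below=0.6, merge_above=0.9):
    """Split dissimilar agents out of each group, then merge similar groups."""
    after_split = []
    for members in groups:
        center = centroid([emb[a] for a in members])
        close = [a for a in members if cosine(emb[a], center) >= split_below]
        far = [a for a in members if a not in close]
        after_split.extend(part for part in (close, far) if part)

    merged = []
    for members in after_split:
        center = centroid([emb[a] for a in members])
        for existing in merged:
            existing_center = centroid([emb[a] for a in existing])
            if cosine(existing_center, center) > merge_above:
                existing.extend(members)
                break
        else:
            merged.append(list(members))
    return merged


def next_period(period, changed, factor=2, min_period=1, max_period=32):
    """Regroup less often while groups are stable; reset when they change."""
    return min_period if changed else min(period * factor, max_period)


embeddings = {0: [1.0, 0.0], 1: [1.0, 0.1], 2: [0.0, 1.0], 3: [0.05, 1.0]}
new_groups = regroup([[0, 1, 2], [3]], embeddings)
```

Here agent 2 behaves unlike agents 0 and 1, so it is split out of their group and then merged with agent 3, whose embedding points in nearly the same direction.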

Experimental Validation

The researchers ran experiments on real-world NYC bike-sharing data comprising over 1.2 million trips from January 2024. The study area covered 106 one-square-kilometer regions in central Manhattan. HAG-PS was compared against several baseline approaches, including methods with no sharing, full sharing, selective sharing, and dynamic sharing.

The results demonstrated HAG-PS’s superior performance. It achieved a fulfilled service ratio of 77.21% and rebalanced 472,212 bikes, outperforming all other baselines. Ablation studies, where specific components of HAG-PS were removed, highlighted the importance of each design element: the identity embeddings, the split-merge operations, the hierarchical grouping, and the adaptive regrouping period all contributed significantly to the overall performance. For example, removing the hierarchical adaptive grouping led to a 4% decrease in fulfilled service ratio, underscoring its critical role.

Conclusion and Future Directions

HAG-PS offers a robust solution for dynamic mobility resource allocation, effectively addressing the challenges of adaptive policy sharing and memory efficiency in urban-scale settings. The successful application to NYC bike sharing data validates its potential. Future work will involve expanding experimental studies and evaluating the system with multi-city data. For more technical details, you can refer to the full research paper available at arXiv.

Meera Iyer
https://blogs.edgentiq.com
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist in a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She's particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
