Robust Multi-vehicle Routing with Communication Enhanced Multi-agent Reinforcement Learning for Last-Mile Logistics

Hai Wang, Shuai Wang, Xiaolei Zhou

Published: 2024, Last Modified: 21 Jan 2026 | APWeb/WAIM (5) 2024 | CC BY-SA 4.0

Abstract: The Vehicle Routing Problem (VRP) is crucial for optimizing logistics in applications such as express systems, industrial warehousing, and on-demand delivery. Last-mile logistics presents unique challenges because pickup demands are dynamic and uncertain, requiring real-time routing adjustments and efficient management of delivery schedules. Existing heuristic-based methods rely heavily on manual rules and are inadequate for highly dynamic environments, while RL-based methods lack models for cooperative problems. To address these issues, we propose the Communication Enhanced Multi-agent Reinforcement Learning (CEMRL) framework. CEMRL uses Context Encoding to unify environment features and local observations, and employs a transformer-based communication enhancement module for efficient multi-agent communication. Our extensive experiments on a real-world dataset demonstrate that CEMRL significantly outperforms state-of-the-art baselines in travel distance and overdue rate, validating its effectiveness in complex logistics scenarios.
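
The abstract does not specify the architecture, so the following is only an illustrative sketch of the general idea: each vehicle (agent) embeds shared environment features together with its own local observation, and the agents then exchange information through one self-attention step of the kind used in transformers. All names (encode_context, communicate), dimensions, and weights are hypothetical and random; this is not the authors' CEMRL implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def encode_context(env_features, local_obs, w_ctx):
    # Assumed form of "context encoding": concatenate the shared
    # environment features with each agent's local observation and
    # project the result into a common embedding space.
    n_agents = local_obs.shape[0]
    shared = np.tile(env_features, (n_agents, 1))
    joint = np.concatenate([shared, local_obs], axis=1)
    return np.tanh(joint @ w_ctx)

def communicate(h, w_q, w_k, w_v):
    # One single-head self-attention step over agents: every agent
    # attends to all agents' embeddings and adds the aggregated
    # message to its own embedding (residual connection).
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    attn = softmax(scores, axis=-1)
    return h + attn @ v, attn

# Toy example with 3 vehicles; all sizes are arbitrary assumptions.
rng = np.random.default_rng(0)
n_agents, d_env, d_obs, d_model = 3, 4, 5, 8
env_features = rng.normal(size=d_env)
local_obs = rng.normal(size=(n_agents, d_obs))
w_ctx = rng.normal(size=(d_env + d_obs, d_model)) * 0.1
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))

h = encode_context(env_features, local_obs, w_ctx)
messages, attn = communicate(h, w_q, w_k, w_v)
```

In this sketch, messages holds one communication-enriched embedding per vehicle, which a per-agent policy could consume when choosing its next stop; attn[i, j] is the weight vehicle i places on vehicle j's embedding.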