How Robust Reinforcement Learning Enables Courier-Friendly Route Planning for Last-Mile Delivery?

Published: 14 Jun 2025, Last Modified: 19 Jul 2025
Venue: ICML 2025 Workshop PRAL
License: CC BY 4.0
Keywords: Reinforcement Learning, Last-Mile Delivery, Robust Optimization, Human-centric Design
TL;DR: This paper proposes a courier-friendly Robust RL-based Smooth and Stable Routing Algorithm for dynamic last-mile delivery.
Track: Long Paper (up to 9 pages)
Abstract: Last-mile delivery (LMD) systems increasingly face dynamic customer demands that introduce uncertainty and lead to unstable delivery routes, reducing efficiency and placing cognitive burdens on couriers. To address this, we propose R$^3$S$^2$Route, a Robust Regularizer-enhanced RL-based Smooth and Stable Routing Algorithm that learns courier-friendly policies under state uncertainty. Our method adopts an actor-critic reinforcement learning framework and incorporates a robustness regularizer that penalizes policy sensitivity to input perturbations. We formally define route smoothness and stability as courier-friendliness metrics, and integrate them into the learning framework to produce routing policies that are both geometrically intuitive and spatio-temporally consistent. Experimental results demonstrate that R$^3$S$^2$Route achieves up to 59.68\% improvement in route smoothness and 14.29\% in route stability, while maintaining low travel distances and time-window violation rates, outperforming several baselines in dynamic delivery environments.
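To make the abstract's central idea concrete, here is a minimal NumPy sketch of a robustness regularizer in the spirit described: it penalizes how much a policy's action distribution changes under bounded perturbations of the input state. The paper's exact formulation is not given here, so the linear softmax policy, the perturbation model, and all names (`policy_probs`, `robustness_regularizer`, the weight `lam`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def policy_probs(states, W):
    # toy stand-in for an actor network: linear logits -> softmax
    # over candidate next delivery stops
    return softmax(states @ W)

def robustness_regularizer(states, W, eps=0.1, n_samples=8):
    # average squared change in the action distribution when states
    # are perturbed within an L-infinity ball of radius eps
    # (a simple proxy for "policy sensitivity to input perturbations")
    clean = policy_probs(states, W)
    penalty = 0.0
    for _ in range(n_samples):
        noise = rng.uniform(-eps, eps, size=states.shape)
        penalty += np.mean((policy_probs(states + noise, W) - clean) ** 2)
    return penalty / n_samples

# toy check: 4 courier states (3 features each), 5 candidate stops
states = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 5))
reg = robustness_regularizer(states, W)
lam = 1.0  # regularizer weight; the actor loss would add lam * reg
print(reg)
```

In a full actor-critic setup this penalty would simply be added (scaled by `lam`) to the actor's loss, so gradient updates trade off expected return against sensitivity to perturbed states.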
Format: We have read the camera-ready instructions, and our paper is formatted with the provided template.
De-Anonymization: This submission has been de-anonymized.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 17