Learning to Solve the Min-Max Mixed-Shelves Picker-Routing Problem via Hierarchical and Parallel Decoding

Published: 04 Apr 2025, Last Modified: 09 Jun 2025LION19 2025EveryoneRevisionsBibTeXCC BY 4.0
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Tracks: Special Session 1: (Deep) Reinforcement Learning in OR Optimization
Keywords: Neural Combinatorial Optimization, Multi-Agent Reinforcement Learning, Picker Routing, Mixed-Shelves Warehouses
TL;DR: A neural solver for the min-max variant of the picker routing problem in mixed-shelves warehouses.
Abstract: The Mixed-Shelves Picker Routing Problem (MSPRP) is a fundamental challenge in warehouse logistics, where pickers must navigate a mixed-shelves environment to retrieve SKUs efficiently. Traditional heuristics and optimization-based approaches struggle with scalability, while recent machine learning methods often rely on sequential decision-making, leading to high solution latency and suboptimal agent coordination. In this work, we propose a novel hierarchical and parallel decoding approach for solving the min-max variant of the MSPRP via multi-agent reinforcement learning. While our approach generates a joint distribution over agent actions, allowing for fast decoding and effective picker coordination, our method introduces a sequential action selection to avoid conflicts in the multi-dimensional action space. Experiments show state-of-the-art performance in both solution quality and inference speed, particularly for large-scale and out-of-distribution instances.
Submission Number: 56
Loading