TL;DR: We present SHIELD, a model that leverages sparsity and hierarchy for the more realistic Multi-Task Multi-Distribution VRP setting, a step towards foundation models for routing problems.
Abstract: Recent advances toward foundation models for routing problems have shown the great potential of a unified deep model for various VRP variants. However, they overlook complex real-world customer distributions. In this work, we advance the Multi-Task VRP (MTVRP) setting to the more realistic yet challenging Multi-Task Multi-Distribution VRP (MTMDVRP) setting, and introduce SHIELD, a novel model that leverages both *sparsity* and *hierarchy* principles. Building on a deeper decoder architecture, we first incorporate the Mixture-of-Depths (MoD) technique to enforce sparsity. This improves both efficiency and generalization by allowing the model to dynamically choose which nodes use or skip each decoder layer, providing the capacity needed to adaptively allocate computation for learning task- and distribution-specific as well as shared representations. We also develop a context-based clustering layer that exploits the hierarchical structures present in the problems to produce better local representations. These two designs inductively bias the network to identify key features that are common across tasks and distributions, leading to significantly improved generalization on unseen ones. Our empirical results demonstrate the superiority of our approach over existing methods on 9 real-world maps with 16 VRP variants each.
Lay Summary: Classical solvers for Vehicle Routing Problems (VRP) can handle multiple variants easily and are distribution-agnostic. Recently, neural VRP solvers have started to handle multiple variants through multi-task learning. Real-world scenarios, however, vary in both the task and the underlying data distribution. We therefore aim to improve the generalization ability of neural solvers under such complexities.
Motivated by regularization principles related to the VC dimension, this work presents SHIELD, a powerful model that leverages sparsity and hierarchy to handle the dynamics and challenges of the Multi-Task Multi-Distribution VRP scenario. We increase the depth of the decoder but introduce Mixture-of-Depths layers that control and adapt how much computation the network spends according to task and distribution variations. We also introduce a context-based latent clustering module that adapts coarse-grained representations according to tasks and distributions.
Based on extensive experiments across 9 real-world maps and 16 VRP variants, we empirically show that the sparsity and hierarchy principles are paramount to improving neural solvers' generalization ability, paving the path to foundation VRP models.
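To make the Mixture-of-Depths idea above concrete, the following is a minimal sketch of one routed layer, not the SHIELD implementation. It assumes a linear router, a top-k selection controlled by a `capacity` fraction, and a sigmoid gate on the update. The names `mod_layer`, `router_w`, `block`, and `capacity` are hypothetical and chosen only for illustration.

```python
import math

def mod_layer(nodes, router_w, block, capacity):
    """Apply `block` only to the top-`capacity` fraction of nodes; others skip.

    nodes:    list of node embeddings (lists of floats)
    router_w: hypothetical router weight vector (assumed linear router)
    block:    stand-in for a decoder layer, maps an embedding to an update
    capacity: fraction of nodes allowed through the layer (assumption)
    """
    # Router score per node: dot product of the embedding and the router vector.
    scores = [sum(x * w for x, w in zip(node, router_w)) for node in nodes]
    # Number of nodes that receive computation in this layer.
    k = max(1, math.ceil(capacity * len(nodes)))
    order = sorted(range(len(nodes)), key=lambda i: scores[i], reverse=True)
    selected = set(order[:k])

    out = []
    for i, node in enumerate(nodes):
        if i in selected:
            # Sigmoid gate (an assumption) scales the layer's update.
            gate = 1.0 / (1.0 + math.exp(-scores[i]))
            update = block(node)
            out.append([x + gate * u for x, u in zip(node, update)])
        else:
            # Skipped nodes pass through unchanged via the residual path.
            out.append(list(node))
    return out, sorted(selected)

# Toy usage: four 2-D node embeddings, half of them routed through the layer.
nodes = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5]]
router_w = [1.0, 0.5]
out, routed = mod_layer(nodes, router_w, lambda v: [2 * x for x in v], capacity=0.5)
```

In this toy run, only the two highest-scoring nodes are updated, while the remaining nodes keep their embeddings. That is the sense in which computation is allocated sparsely per layer.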
Primary Area: Optimization->Discrete and Combinatorial Optimization
Keywords: vehicle routing problem, learning to optimize
Submission Number: 8616