Routing with Rich Text Queries via Next-Vertex Prediction Models

Eric Zhao; Pranjal Awasthi; Zhengdao Chen; Sreenivas Gollapudi; Daniel Delling

22 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Primary Area: general machine learning (i.e., none of the above)

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Text-based routing, next-token prediction

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Abstract: Autoregressive modeling of text via transformers has led to recent breakthroughs in language. In this work, we study the effectiveness of this framework for routing problems on graphs. In particular, we aim to develop a learning-based routing system that can process rich natural-language queries indicating various desired criteria and produce near-optimal routes from the source to the destination. Furthermore, the system should be able to generalize to new geographies not seen during training. Solving this problem via combinatorial approaches is challenging, since one has to learn specific cost functions over the edges of the graph for each possible type of query. We instead investigate the efficacy of autoregressive modeling for routing. We propose a multimodal architecture that jointly encodes text and graph data, and present a simple way of training the architecture via next-token prediction. In particular, given a text query and a prefix of a ground-truth path, we train the network to predict the next vertex on the path. While a priori this approach may seem suboptimal due to the local nature of the predictions made, we show that when done at scale, it yields near-optimal performance. We demonstrate the effectiveness of our approach via extensive experiments on synthetic graphs as well as graphs from the OpenStreetMap repository. We also present recommendations for the training techniques, architecture choices, and inference algorithms needed to get the desired performance for such problems.
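
To make the training setup in the abstract concrete, the following is a minimal sketch of how one (text query, ground-truth path) pair could be expanded into next-vertex prediction examples, and how a trained predictor might be unrolled greedily at inference time. It is an illustration under stated assumptions, not the authors' implementation: the names `next_vertex_examples`, `greedy_decode`, and `predict_next` are hypothetical, the multimodal model is stood in for by an arbitrary callable, and greedy decoding is just one possible inference procedure, which the abstract does not specify.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Vertex = Hashable
# One training example: (text query, path prefix, next vertex to predict).
Example = Tuple[str, Tuple[Vertex, ...], Vertex]


def next_vertex_examples(query: str, path: Sequence[Vertex]) -> List[Example]:
    """Expand one (query, ground-truth path) pair into next-vertex examples."""
    # Every non-empty proper prefix path[:i] is paired with its successor path[i].
    return [(query, tuple(path[:i]), path[i]) for i in range(1, len(path))]


def greedy_decode(
    predict_next: Callable[[str, Tuple[Vertex, ...]], Vertex],
    query: str,
    source: Vertex,
    destination: Vertex,
    max_steps: int = 100,
) -> List[Vertex]:
    """Unroll a (hypothetical) next-vertex predictor one vertex at a time."""
    route = [source]
    # Stop at the destination, or after at most max_steps predicted vertices.
    while route[-1] != destination and len(route) <= max_steps:
        route.append(predict_next(query, tuple(route)))
    return route


if __name__ == "__main__":
    query = "avoid highways"
    examples = next_vertex_examples(query, ["a", "b", "c", "d"])
    for example in examples:
        print(example)

    # Stand-in "model": a lookup table that memorizes the training examples.
    table = {prefix: nxt for _, prefix, nxt in examples}
    print(greedy_decode(lambda q, p: table[p], query, "a", "d"))
```

A path with n vertices yields n - 1 examples, so each prediction is local to the current prefix. The lookup-table predictor in the usage section only shows that the two functions fit together; it says nothing about how a learned model would generalize to unseen graphs or queries.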

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 6192
