Adaptive Quasimetric Mapping: Principled Topological Abstraction for Robust Offline Goal-Conditioned Navigation
TL;DR: Adaptive Quasimetric Mapping (AQM) learns a time-to-reach quasimetric from offline data to build a sparse topological graph for goal-conditioned navigation, enabling efficient planning and test-time replanning under topology changes.
Abstract: Goal-Conditioned Reinforcement Learning aims to design agents that can reach specified goals; in the offline setting, these agents are trained from previously collected trajectories. In this context, graph-based approaches have been proposed to mitigate compounding value-estimation errors in long-horizon navigation tasks. However, existing approaches typically rely on dense keypoint coverage of the dataset support, resulting in computationally expensive planning. Moreover, they lack explicit mechanisms to adapt to topological changes (e.g., new obstacles), hindering deployment in live applications such as video game environments. To address these two shortcomings, we introduce Adaptive Quasimetric Mapping (AQM), an offline framework leveraging a “time-to-reach” quasimetric learned from the available data. Crucially, it builds a sparse cover of the dataset support using a greedy approximation to a dominating set problem. At test time, the resulting graph is pruned by comparing the observed edge traversal time against a time-to-reach budget derived from the quasimetric, thus enabling zero-shot replanning. Empirically, we evaluate AQM on navigation tasks ranging from a classical benchmark to a video-game-like benchmark that evaluates adaptation across tasks. We show that AQM achieves competitive performance while requiring up to 100× fewer keypoints than prior approaches, demonstrating the relevance of topological abstraction for goal-conditioned navigation.
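To illustrate the sparse-cover idea mentioned in the abstract, here is a minimal sketch of the standard greedy heuristic for a dominating-set-style cover. It is not the authors' implementation: the function name `greedy_sparse_cover`, the precomputed matrix `d` of time-to-reach estimates, the `radius` threshold, and the toy one-dimensional data are all assumptions made for illustration.

```python
import numpy as np

def greedy_sparse_cover(d, radius):
    """Pick keypoints so that every state is reachable from some keypoint
    within `radius`. d[i, j] is an (assumed, precomputed) time-to-reach
    estimate from state i to state j; it may be asymmetric."""
    n = d.shape[0]
    covers = d <= radius                  # covers[i, j]: i reaches j within budget
    uncovered = np.ones(n, dtype=bool)
    keypoints = []
    while uncovered.any():
        # Number of still-uncovered states each candidate would cover.
        gains = (covers & uncovered).sum(axis=1)
        best = int(gains.argmax())
        if gains[best] == 0:
            raise ValueError("some state cannot be covered (is d[i, i] <= radius?)")
        keypoints.append(best)
        uncovered &= ~covers[best]
    return keypoints

# Toy data: six states on a line; moving backward costs twice as much,
# so the distance is a quasimetric rather than a metric.
x = np.arange(6.0)
diff = x[None, :] - x[:, None]            # diff[i, j] = x_j - x_i
d = np.where(diff >= 0, diff, -2.0 * diff)

keypoints = greedy_sparse_cover(d, radius=1.0)
print(keypoints)
```

Because backward moves are more expensive in this toy example, each state only covers itself and its forward neighbor, which shows how the asymmetry of the quasimetric shapes the selected keypoints.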
Lay Summary: Robots, game characters, and other AI agents often need to reach a goal in a complex environment, such as finding a route through a maze or a game level. Learning this only by trial and error can be expensive, slow, or unsafe, so we study how an agent can learn from trajectories that were already recorded.
Our method, Adaptive Quasimetric Mapping, builds a compact navigation map from this offline data. It learns not only whether two places are close, but also how hard it is to travel from one place to another, since going from A to B may be easier than going from B to A. Using this information, it selects a small set of useful waypoints and connects them into a graph for long-distance planning.
At test time, if a path becomes blocked, the agent can mark that connection as unreliable and search for another route on the graph. This makes planning more efficient and gives the agent a limited form of adaptation when the environment changes. The method cannot invent routes or skills that were never present in the data, but it can better reuse known routes when some connections become unavailable.
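The test-time behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method's actual procedure: the graph is a plain dictionary of predicted traversal times, the helper names `shortest_path` and `prune_if_over_budget` are hypothetical, and the `slack` multiplier used to turn a predicted time into a budget is an assumed parameter.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra over a directed graph {u: {v: cost}}; returns a node list or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        if cost > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            new_cost = cost + w
            if new_cost < dist.get(v, float("inf")):
                dist[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    return None

def prune_if_over_budget(graph, u, v, observed_time, slack=2.0):
    """Remove edge u->v if traversal took longer than slack * predicted time.
    `slack` is an assumed tolerance, not a value from the source."""
    if observed_time > slack * graph[u][v]:
        del graph[u][v]
        return True
    return False

# Toy graph: edge weights stand in for predicted time-to-reach values.
graph = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"D": 1.0, "A": 1.0},
    "C": {"D": 2.0},
    "D": {},
}
plan = shortest_path(graph, "A", "D")          # initial plan via B
# Suppose the agent reaches B, then finds B->D blocked (10 steps, no arrival).
pruned = prune_if_over_budget(graph, "B", "D", observed_time=10.0)
replan = shortest_path(graph, "B", "D")        # detour back through A and C
print(plan, pruned, replan)
```

Note that replanning only reuses edges already present in the graph, which mirrors the limitation stated above: the agent cannot invent routes that were never in the data.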
Primary Area: Reinforcement Learning->Batch/Offline
Keywords: Adaptation, Deep Learning, Graph-based Navigation, Goal-Conditioned Reinforcement Learning, Navigation, Offline Reinforcement Learning, Replanning, Scalable, Test-time Adaptation, Transfer Learning, Zero-shot Adaptation
Submission Number: 23758