Keywords: reinforcement learning, deep learning, goal-conditioned reinforcement learning, long horizon, navigation, quasimetrics
TL;DR: ProQ learns a differentiable asymmetric (quasimetric) distance and reuses it both to spread a sparse set of keypoints uniformly over a learned latent space and to guide the agent toward proximal sub-goals, enabling long-horizon offline goal-conditioned RL.
Abstract: Offline Goal-Conditioned Reinforcement Learning seeks to train agents to reach specified goals from previously collected reward-free data. Scaling that promise to long-horizon tasks with complex dynamics remains challenging, notably because of compounding value-estimation errors. Principled geometric learning offers a potential remedy. Following this insight, we introduce Projective Quasimetric Planning (ProQ), a compositional framework that learns a differentiable asymmetric distance and then repurposes it, first as a repulsive energy forcing a sparse set of keypoints to spread uniformly over the learned latent space, and second as a structured directional cost guiding the agent toward proximal sub-goals. ProQ couples this geometry with a Lagrangian out-of-distribution detector that keeps the keypoints within reachable areas. By unifying metric learning, keypoint coverage, and goal-conditioned control, our approach produces meaningful sub-goals and robustly drives long-horizon goal-reaching on diverse navigation and manipulation benchmarks.
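To make the repulsive-energy idea in the abstract concrete, below is a minimal PyTorch sketch. It is an illustration under assumptions, not the paper's implementation: the function name `repulsive_energy`, the keypoint shapes, and the placeholder distance are all hypothetical, and a real quasimetric network (producing asymmetric distances) would replace the symmetric `torch.cdist` stand-in.

```python
import torch

def repulsive_energy(keypoints: torch.Tensor, quasimetric) -> torch.Tensor:
    """Sum of inverse pairwise distances between keypoints (hypothetical sketch).

    keypoints: (K, D) latent coordinates, treated as learnable parameters.
    quasimetric: callable mapping two (K, D) tensors to a (K, K) matrix of
        distances; for a true quasimetric, d[i, j] != d[j, i] in general.
    """
    d = quasimetric(keypoints, keypoints)                      # (K, K) distances
    mask = ~torch.eye(len(keypoints), dtype=torch.bool, device=d.device)
    return (1.0 / (d[mask] + 1e-6)).sum()                      # large when keypoints cluster

# Toy usage: spread K keypoints by gradient descent on the repulsive energy.
# The lambda below is a symmetric placeholder, standing in for a learned
# asymmetric distance model.
K, D = 32, 8
kp = torch.nn.Parameter(torch.randn(K, D))
dist = lambda a, b: torch.cdist(a, b) + 1e-3                   # placeholder metric
opt = torch.optim.Adam([kp], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = repulsive_energy(kp, dist)
    loss.backward()
    opt.step()
```

Minimizing this energy pushes keypoints apart until they cover the space roughly uniformly; in ProQ this coverage objective would additionally be constrained by the out-of-distribution detector so keypoints stay in reachable regions.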
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Anthony_Kobanda1
Track: Regular Track: unpublished work
Submission Number: 144