Meta-GC-TTT: Training Offline Goal-Conditioned Policies for Test-Time Adaptation

Antonio Mari; Marco Bagatella; Jonas Hübotter; Andreas Krause

Published: 25 May 2026, Last Modified: 27 May 2026

Venue: DEMO 2026 Poster

License: CC BY 4.0

Keywords: meta-learning, test-time training, offline RL, GC-TTT

Abstract: In offline goal-conditioned reinforcement learning, Test-Time Training (TTT) can specialize a pre-trained policy to the current state and goal at deployment. This turns a broad goal-conditioned policy into a query-specific expert. Yet standard offline pre-training optimizes the policy's performance before TTT, not after it. As a result, the policy is not trained for the gradient dynamics it will face at test time. We introduce Meta-GC-TTT, a framework for learning test-time-trainable goal-conditioned policies. Meta-GC-TTT samples state-goal tasks from the offline dataset, adapts the policy with TTT, and updates the base policy for post-TTT performance. Our evaluation on the OGBench loco-navigation suite demonstrates that meta-learned initializations significantly improve zero-shot performance and achieve a 3–5× increase in adaptation efficiency. Overall, offline goal-conditioned policies should be trained not only to act, but also to adapt.
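
The sample, adapt, update loop described in the abstract can be sketched on a toy problem. The following is a minimal illustration under several assumptions that are not taken from the paper: a linear policy, a behavior-cloning loss as the TTT objective, nearest-neighbor selection of task data, synthetic data, and a first-order approximation of the meta-gradient. All function names are hypothetical.

```python
import numpy as np


def bc_loss_and_grad(W, X, A):
    """Squared-error behavior-cloning loss of the linear policy a = X @ W, with its gradient."""
    err = X @ W - A
    loss = float(np.mean(np.sum(err ** 2, axis=1)))
    grad = 2.0 * X.T @ err / X.shape[0]
    return loss, grad


def ttt_adapt(W, X, A, lr=0.05, steps=3):
    """Inner loop (assumed form of TTT): a few gradient steps on task-relevant data."""
    W = W.copy()
    for _ in range(steps):
        _, g = bc_loss_and_grad(W, X, A)
        W -= lr * g
    return W


def sample_task(rng, X, A, k=32):
    """Pick a random (state, goal) query and split its k nearest samples in half.

    The first half is used for adaptation, the second for measuring
    post-adaptation performance. This selection rule is an assumption.
    """
    query = X[rng.integers(len(X))]
    idx = np.argsort(np.linalg.norm(X - query, axis=1))[:k]
    rng.shuffle(idx)
    half = k // 2
    return (X[idx[:half]], A[idx[:half]]), (X[idx[half:]], A[idx[half:]])


def meta_train(X, A, iters=200, meta_lr=0.05, seed=0):
    """Outer loop: update the base weights using the loss measured after TTT."""
    rng = np.random.default_rng(seed)
    W = np.zeros((X.shape[1], A.shape[1]))
    for _ in range(iters):
        (Xs, As), (Xq, Aq) = sample_task(rng, X, A)
        W_adapted = ttt_adapt(W, Xs, As)
        # First-order approximation: apply the post-TTT gradient to the base
        # weights without differentiating through the inner steps.
        _, g = bc_loss_and_grad(W_adapted, Xq, Aq)
        W -= meta_lr * g
    return W


# Synthetic offline data (illustrative only): inputs are [state, goal],
# actions point toward the goal with a state-dependent gain.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(512, 2))
G = rng.uniform(-1.0, 1.0, size=(512, 2))
X = np.concatenate([S, G], axis=1)
A = (G - S) * (1.0 + 0.5 * np.sin(3.0 * S[:, :1]))

W_meta = meta_train(X, A)
```

Here `ttt_adapt` plays the role of test-time training, and `meta_train` scores the base weights by the loss obtained after adaptation rather than before it. A second-order variant would instead differentiate through the inner gradient steps.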

Submission Number: 107
