Unconstrained Models as Constrained Problem Solvers: Duality-Driven Adaptation without Retraining

16 Sept 2025 (modified: 20 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Zero-shot constrained reinforcement learning; Forward-backward (FB) framework; Latent-space reparameterization
TL;DR: We extend the forward-backward framework to constrained reinforcement learning by embedding rewards and costs into a shared latent space.
Abstract: We present an extension of the forward-backward (FB) representation framework that enables zero-shot constrained reinforcement learning (RL) by embedding both reward and cost functions into a shared latent space. While existing FB methods generalize well across rewards, they do not account for constraints, a critical limitation in real-world applications where agents must satisfy varying cost budgets or safety requirements. Our approach closes this gap through a latent-space reparameterization grounded in Lagrangian duality, allowing efficient inference of constraint-aware policies without any retraining at deployment. Extensive experiments on the ExORL benchmark demonstrate that our method achieves superior task performance while adhering to cost constraints, consistently outperforming prior FB-based and primal-dual baselines. These results highlight the effectiveness and practicality of latent-space constrained policy inference for scalable and safe RL.
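To make the abstract's core idea concrete, here is a minimal sketch of what "duality-driven adaptation without retraining" could look like at inference time. This is an illustration under assumptions, not the paper's actual method: we assume reward and cost tasks are encoded as latent vectors `z_reward` and `z_cost` (as in standard FB representations), that conditioning the frozen policy on the combined vector `z(λ) = z_reward − λ·z_cost` trades reward against cost, and that the estimated cost `cost_of(z)` is monotone in the multiplier λ, so a simple bisection can find the smallest λ that satisfies the budget. The function names and the normalization step are hypothetical.

```python
import numpy as np

def infer_constrained_z(z_reward, z_cost, cost_of, budget,
                        lam_max=100.0, iters=50):
    """Hypothetical sketch: choose a Lagrange multiplier lam so that the
    policy conditioned on z(lam) = z_reward - lam * z_cost meets the cost
    budget. Assumes cost_of(z) is non-increasing in lam."""
    def z_of(lam):
        z = z_reward - lam * z_cost
        n = np.linalg.norm(z)
        # FB task vectors are typically kept on a fixed-norm sphere
        return z / n if n > 0 else z

    # Unconstrained optimum already feasible: no penalty needed.
    if cost_of(z_of(0.0)) <= budget:
        return z_of(0.0), 0.0

    # Bisect on lam; invariant: lo infeasible, hi feasible.
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cost_of(z_of(mid)) > budget:
            lo = mid  # still violates the budget: penalize cost harder
        else:
            hi = mid  # feasible: try a smaller multiplier
    return z_of(hi), hi
```

The appeal of this scheme, and plausibly of the paper's approach, is that the search happens entirely in latent space: each candidate λ only requires evaluating a learned cost estimate, never a gradient step on the policy, which is what makes deployment-time adaptation retraining-free.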
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 7755