Neural ODE and SDE Models for Adaptation and Planning in Model-Based Reinforcement Learning

TMLR Paper 5274 Authors

02 Jul 2025 (modified: 10 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: We investigate neural ordinary and stochastic differential equations (neural ODEs and SDEs) for modeling stochastic dynamics in fully and partially observed environments within a model-based reinforcement learning (RL) framework. Through a series of simulation experiments, we show that neural SDEs capture the inherent stochasticity of transition dynamics more effectively, enabling high-performing policies with improved sample efficiency in challenging scenarios. We further leverage neural ODEs and SDEs for efficient policy adaptation to changes in environment dynamics via inverse models, requiring only limited interaction with the new environment. To address partial observability, we introduce a latent SDE model that combines an ODE with a GAN-trained stochastic component in latent space. This model matches or exceeds the performance of the state-based SDE variant and outperforms ODE-based alternatives across stochastic variants of continuous control benchmarks, providing the first empirical demonstration of action-conditional latent neural SDEs for planning in such settings.
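For readers unfamiliar with the model class, a minimal sketch of an action-conditional neural SDE transition model of the general form dx = f_theta(x, a) dt + g_theta(x, a) dW, rolled out with Euler–Maruyama, is given below. The class and method names (NeuralSDEDynamics, step) and all dimensions are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class NeuralSDEDynamics(nn.Module):
    """Illustrative action-conditional neural SDE transition model.

    dx = f_theta(x, a) dt + g_theta(x, a) dW, integrated with Euler-Maruyama.
    All names and sizes are hypothetical, not the paper's architecture.
    """

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        # f_theta: deterministic drift network
        self.drift = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )
        # g_theta: state/action-dependent diffusion (noise scale), kept positive
        self.diffusion = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim), nn.Softplus(),
        )

    def step(self, x: torch.Tensor, a: torch.Tensor, dt: float = 0.05) -> torch.Tensor:
        """One Euler-Maruyama step: x_t -> x_{t+dt} given action a_t."""
        xa = torch.cat([x, a], dim=-1)
        noise = torch.randn_like(x) * (dt ** 0.5)
        return x + self.drift(xa) * dt + self.diffusion(xa) * noise


# Usage sketch: roll out imagined stochastic trajectories for planning.
model = NeuralSDEDynamics(state_dim=11, action_dim=3)
x = torch.zeros(32, 11)          # batch of 32 imagined start states
for _ in range(15):              # 15-step imagined rollout
    a = torch.randn(32, 3)       # actions supplied by a policy or planner
    x = model.step(x, a)
```

Sampling the diffusion term at each step is what lets such a model represent stochastic transition dynamics, in contrast to a purely deterministic neural ODE rollout.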
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Lihong_Li1
Submission Number: 5274