Abstract: We investigate neural ordinary and stochastic differential equations (neural ODEs and SDEs) to model stochastic dynamics in fully and partially observed environments within a model-based reinforcement learning (RL) framework. Through a sequence of simulations, we show that neural SDEs more effectively capture transition dynamics’ inherent stochasticity, enabling high-performing policies with improved sample efficiency in challenging scenarios. We leverage neural ODEs and SDEs for efficient policy adaptation to changes in environment dynamics via inverse models, requiring only limited interactions with the new environment. To address partial observability, we introduce a latent SDE model that combines an ODE and a GAN-trained stochastic component in latent space. Policies derived from this model offer a strong baseline, outperforming or matching general model-based and model-free approaches across stochastic continuous-control benchmarks. This work illustrates the applicability of action-conditional latent SDEs for RL planning in environments with stochastic transitions. Our code is available at: https://github.com/ChaoHan-UoS/NeuralRL.
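To make the dynamics-modelling idea concrete, the sketch below shows a generic action-conditional neural SDE transition model rolled out with Euler–Maruyama, of the kind the abstract refers to. All module names, network sizes, and the step size here are illustrative assumptions, not the paper's actual architecture or training setup; see the linked repository for the authors' implementation.

```python
# Illustrative sketch only: a generic action-conditional neural SDE dynamics
# model with an Euler-Maruyama rollout. Names, sizes, and the integration step
# are assumptions for exposition, not the paper's architecture.
import torch
import torch.nn as nn

class ActionConditionalSDE(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        inp = state_dim + action_dim
        # Drift network f(s, a): deterministic part of the transition.
        self.drift = nn.Sequential(nn.Linear(inp, hidden), nn.Tanh(),
                                   nn.Linear(hidden, state_dim))
        # Diffusion network g(s, a): state- and action-dependent noise scale.
        self.diffusion = nn.Sequential(nn.Linear(inp, hidden), nn.Tanh(),
                                       nn.Linear(hidden, state_dim), nn.Softplus())

    def step(self, s, a, dt=0.05):
        x = torch.cat([s, a], dim=-1)
        dW = torch.randn_like(s) * dt ** 0.5
        # Euler-Maruyama discretisation: s' = s + f(s, a) dt + g(s, a) dW
        return s + self.drift(x) * dt + self.diffusion(x) * dW

# Hypothetical usage: imagine trajectories for model-based planning.
model = ActionConditionalSDE(state_dim=3, action_dim=1)
s = torch.zeros(32, 3)                  # batch of 32 imagined states
for _ in range(10):
    a = torch.tanh(torch.randn(32, 1))  # placeholder policy actions
    s = model.step(s, a)
```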
Submission Length: Long submission (more than 12 pages of main content)
Code: https://github.com/ChaoHan-UoS/NeuralRL
Assigned Action Editor: ~Lihong_Li1
Submission Number: 5274