Evidential Latent World Models for Safe Model-based Reinforcement Learning
Keywords: Reinforcement Learning, Uncertainty Estimation, Model-Based Reinforcement Learning
Abstract: Uncertainty estimation is crucial for deploying reinforcement learning in safety-critical domains such as robotics and autonomous systems. This work introduces Model-based Uncertainty-Aware Reinforcement Learning (MUARL), a constrained model-based reinforcement learning framework that augments TD-MPC2 with evidential deep learning to estimate epistemic and aleatoric uncertainty in a single dynamics-model forward pass. MUARL integrates these estimates into a dual-constraint Model Predictive Path Integral planner that jointly penalizes predicted safety-cost violations and model uncertainty via adaptive Lagrangian multipliers, enforcing safety directly at planning time. In a dynamic unicycle-car navigation task, evidential uncertainty yields markedly better out-of-distribution detection than normalizing flows and stochastic ensembles, enabling safer exploration around constraint regions. On Safety Gymnasium navigation benchmarks, MUARL variants achieve higher safety feasibility and lower cumulative constraint costs than model-free baselines and alternative model-based methods, while maintaining competitive task performance. Together, these results show that evidential uncertainty can be integrated into real-time sampling-based planners with modest computational overhead, providing a practical path toward uncertainty-aware constrained MBRL for safety-critical autonomous systems.
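The abstract's two core ideas can be sketched concretely. In deep evidential regression, a network head outputs Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta), from which aleatoric and epistemic uncertainty follow in closed form from one forward pass; the planner can then fold a predicted safety cost and the epistemic term into a penalized trajectory cost via Lagrangian multipliers. The function names and the `penalized_cost` combination below are illustrative assumptions, not MUARL's exact implementation; the uncertainty formulas are the standard evidential-regression ones.

```python
import numpy as np

def evidential_uncertainty(gamma, nu, alpha, beta):
    """Closed-form uncertainties from Normal-Inverse-Gamma evidence.

    gamma: predicted mean; nu > 0: virtual observations of the mean;
    alpha > 1, beta > 0: Inverse-Gamma parameters of the variance.
    Returns (prediction, aleatoric, epistemic) -- no ensemble or
    sampling needed, hence a single forward pass suffices.
    """
    aleatoric = beta / (alpha - 1.0)           # E[sigma^2]: data noise
    epistemic = beta / (nu * (alpha - 1.0))    # Var[mu]: model ignorance
    return gamma, aleatoric, epistemic

def penalized_cost(task_cost, safety_cost, epistemic, lam_c, lam_u):
    """Illustrative dual-penalty trajectory cost for an MPPI-style planner.

    lam_c, lam_u play the role of adaptive Lagrangian multipliers on the
    predicted constraint cost and the epistemic uncertainty, steering
    sampled rollouts away from both unsafe and poorly-modeled regions.
    """
    return task_cost + lam_c * safety_cost + lam_u * epistemic

# Example: higher evidence (nu) shrinks epistemic but not aleatoric uncertainty.
_, alea_lo, epi_lo = evidential_uncertainty(0.0, nu=2.0, alpha=3.0, beta=4.0)
_, alea_hi, epi_hi = evidential_uncertainty(0.0, nu=8.0, alpha=3.0, beta=4.0)
```

Note that epistemic uncertainty decays as evidence `nu` grows while aleatoric stays fixed, which is what makes the evidential head usable for out-of-distribution detection at planning time.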
Submission Number: 125