Keywords: Model-based reinforcement learning, Model-based Exploration, Generative model, World model
TL;DR: We propose MoGE, which enhances off-policy RL exploration by generating critical experiences, leading to significant improvements in sample efficiency and performance ceilings across various tasks.
Abstract: Exploration is crucial in Reinforcement Learning (RL) as it enables the agent to understand the environment for better decision-making. Existing exploration methods fall into two paradigms: active exploration, which injects stochasticity into the policy but struggles in high-dimensional environments, and passive exploration, which manages the replay buffer to prioritize under-explored regions but lacks sample diversity. To address the limitations of passive exploration, we propose Modelic Generative Exploration (MoGE), which augments exploration by generating under-explored critical states and synthesizing dynamics-consistent experiences. MoGE consists of two components: (1) a diffusion generator that produces critical states guided by entropy and TD error, and (2) a one-step imagination world model that constructs critical transitions for agent learning. Our method is simple to implement and integrates seamlessly with mainstream off-policy RL algorithms without structural modifications. Experiments on OpenAI Gym and the DeepMind Control Suite demonstrate that MoGE, as an exploration augmentation, significantly enhances efficiency and performance in complex tasks.
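To make the two-component design in the abstract concrete, here is a minimal conceptual sketch of how generated critical states and a one-step world model could be mixed into an off-policy update batch. All names (state_generator, world_model, policy, synth_ratio) and the placeholder dynamics are illustrative assumptions, not the authors' implementation.

```python
# Conceptual sketch: augment a real replay batch with synthetic critical
# transitions built from generated states and a one-step world model.
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 4, 2

def state_generator(batch_size):
    # Placeholder for the diffusion generator of "critical" states
    # (in the paper, guided by policy entropy and TD error).
    return rng.normal(size=(batch_size, STATE_DIM))

def world_model(states, actions):
    # Placeholder for the one-step imagination world model:
    # predicts next states and rewards for the generated states.
    next_states = states + 0.1 * actions.sum(axis=1, keepdims=True)
    rewards = -np.linalg.norm(next_states, axis=1, keepdims=True)
    return next_states, rewards

def policy(states):
    # Placeholder off-policy actor.
    return rng.uniform(-1.0, 1.0, size=(states.shape[0], ACTION_DIM))

def synthesize_batch(batch_size):
    """Construct dynamics-consistent synthetic transitions from generated states."""
    s = state_generator(batch_size)
    a = policy(s)
    s_next, r = world_model(s, a)
    return s, a, r, s_next

def augmented_batch(real_batch, synth_ratio=0.25):
    """Mix real replay transitions with synthetic critical transitions."""
    n_synth = int(len(real_batch[0]) * synth_ratio)
    synth = synthesize_batch(n_synth)
    return tuple(np.concatenate([real, gen]) for real, gen in zip(real_batch, synth))

# Usage: any off-policy learner (e.g., a SAC/TD3 critic and actor step)
# consumes the augmented batch in place of the usual replay batch.
real = (rng.normal(size=(64, STATE_DIM)),
        rng.uniform(-1.0, 1.0, size=(64, ACTION_DIM)),
        rng.normal(size=(64, 1)),
        rng.normal(size=(64, STATE_DIM)))
s, a, r, s_next = augmented_batch(real)
print(s.shape, a.shape, r.shape, s_next.shape)  # (80, 4) (80, 2) (80, 1) (80, 4)
```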
Primary Area: Reinforcement learning (e.g., decision and control, planning, hierarchical RL, robotics)
Submission Number: 26609