Decision Stacks: Flexible Reinforcement Learning via Modular Generative Models

Published: 19 Jun 2023, Last Modified: 28 Jul 2023 · 1st SPIGM @ ICML Poster
Keywords: reinforcement learning, generative models, offline RL, sequential decision making, modularity
TL;DR: Decision Stacks, a modular generative framework for goal-conditioned RL, models observations, rewards, and actions with maximal expressivity, leading to superior performance and flexibility across diverse offline RL tasks in MDPs and POMDPs.
Abstract: Reinforcement learning presents an attractive paradigm for reasoning about several distinct aspects of sequential decision making, such as specifying complex goals, planning future observations and actions, and critiquing their utilities. These demands require a balance between expressivity and flexible modeling for efficient learning and inference. We present Decision Stacks, a probabilistic generative framework that decomposes goal-conditioned policy agents into 3 generative modules which simulate the temporal evolution of observations, rewards, and actions. Our framework guarantees both expressivity and flexibility in designing individual modules to account for key factors such as architectural bias, optimization objective and dynamics, transferability across domains, and inference speed. Our empirical results demonstrate the effectiveness of Decision Stacks for offline policy optimization in several MDP and POMDP environments.
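The modular decomposition described in the abstract can be sketched as follows. This is a minimal illustrative skeleton, not the paper's implementation: the class name, method signatures, and the toy numerical rules inside each module are hypothetical stand-ins (the actual modules could be, e.g., diffusion or autoregressive transformer models). Only the interface structure, three separately designed generative modules chained in the order observations, then rewards, then actions, reflects the framework the abstract describes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DecisionStack:
    """Hypothetical sketch of a three-module generative stack:
    an observation model, a reward model, and an action model,
    each of which could be swapped out independently."""
    horizon: int

    def observation_model(self, goal: float, history: List[float]) -> float:
        # Placeholder dynamics: move the predicted observation
        # halfway toward the goal at each step.
        last = history[-1] if history else 0.0
        return last + 0.5 * (goal - last)

    def reward_model(self, goal: float, obs: float) -> float:
        # Placeholder critic: reward is negative distance to goal.
        return -abs(goal - obs)

    def action_model(self, obs: float, reward: float) -> float:
        # Placeholder policy head: conditions on the predicted
        # observation and reward from the upstream modules.
        return obs + reward

    def plan(self, goal: float) -> Tuple[List[float], List[float], List[float]]:
        # Roll the three modules forward autoregressively: each
        # module conditions on the outputs of the ones before it.
        observations, rewards, actions = [], [], []
        for _ in range(self.horizon):
            obs = self.observation_model(goal, observations)
            rew = self.reward_model(goal, obs)
            act = self.action_model(obs, rew)
            observations.append(obs)
            rewards.append(rew)
            actions.append(act)
        return observations, rewards, actions
```

Because each module only communicates through its predicted outputs, one could, per the abstract's flexibility claim, replace a single module (say, the observation model) without retraining the others.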
Submission Number: 89