Epistemic Exploration for Generalizable Planning and Learning in Non-Stationary Stochastic Settings

Rushang Karia; Pulkit Verma; Gaurav Vipat; Siddharth Srivastava

Published: 28 Oct 2023. Last Modified: 04 Jan 2024. Venue: GenPlan'23

Abstract: Reinforcement Learning (RL) provides a convenient framework for sequential decision making when closed-form transition dynamics are unavailable and may change frequently. However, the high sample complexity of RL approaches limits their utility in the real world. This paper presents an approach that performs meta-level exploration in the space of models and uses the learned models to compute policies. Our approach interleaves learning and planning, allowing data-efficient, task-focused sample collection in the presence of non-stationarity. We conduct an empirical evaluation on benchmark domains and show that our approach significantly outperforms baselines in sample complexity and easily adapts to changing transition systems across tasks.

Submission Number: 77
