Keywords: Offline Reinforcement Learning, Domain Knowledge
TL;DR: The paper introduces a novel domain-knowledge-based regularization technique that enhances offline Reinforcement Learning (RL) performance in scenarios with limited data and partially omitted states.
Abstract: With the ability to learn from static datasets, Offline Reinforcement Learning (RL) emerges as a compelling avenue for real-world applications. However, state-of-the-art offline RL algorithms perform sub-optimally when confronted with limited data confined to specific regions within the state space. The performance degradation is attributed to the inability of offline RL algorithms to learn appropriate actions for rare or unseen observations. This paper proposes a novel domain-knowledge-based regularization technique that adaptively refines the initial domain knowledge to considerably boost performance on limited data with partially omitted states. The key insight is that the regularization term mitigates erroneous actions for sparse samples and unobserved states covered by domain knowledge. Empirical evaluations on standard discrete-environment datasets demonstrate a substantial average performance increase compared to an ensemble of domain knowledge and existing offline RL algorithms operating on limited data.
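The abstract does not give the objective, but a minimal sketch of what a domain-knowledge-based regularization term could look like in a discrete-action setting is shown below. The function name, the `dk_policy`, `coverage_mask`, and `beta` arguments, and the cross-entropy form are illustrative assumptions, not the paper's actual formulation; the term would be added to the loss of an underlying offline RL algorithm and the domain-knowledge policy could be refined over training, as the abstract describes.

```python
import numpy as np

def domain_knowledge_regularizer(q_values, dk_policy, coverage_mask, beta=1.0):
    """Illustrative regularizer (assumption, not the paper's method):
    penalize the Q-value-induced policy for deviating from a
    domain-knowledge policy on states that the knowledge covers.

    q_values:      (batch, num_actions) current Q-estimates
    dk_policy:     (batch, num_actions) action distribution from domain knowledge
    coverage_mask: (batch,) 1.0 where domain knowledge covers the state, else 0.0
    beta:          regularization strength
    """
    # Softmax policy induced by the current Q-values (discrete actions).
    logits = q_values - q_values.max(axis=-1, keepdims=True)
    pi = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    # Cross-entropy between the domain-knowledge policy and the learned policy,
    # applied only on states covered by domain knowledge.
    ce = -(dk_policy * np.log(pi + 1e-8)).sum(axis=-1)
    return beta * (coverage_mask * ce).mean()
```

In such a sketch, the total training loss would be the base offline RL loss (e.g., a conservative Q-learning objective) plus this term, so that sparse or unobserved states falling under domain knowledge are steered away from erroneous actions.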
Supplementary Material: zip
Primary Area: Reinforcement learning
Submission Number: 5638