State Decomposition for Model-free Partially observable Markov Decision Process

Yide Yu; Yan Ma; Yue Liu

22 Sept 2022 (modified: 13 Feb 2023); ICLR 2023 Conference Withdrawn Submission; Readers: Everyone

Keywords: POMDP, Reinforcement Learning, Decomposition, Shannon Entropy

TL;DR: This paper proposes a novel theory of state decomposition in POMDPs and a simple algorithm to estimate the gap between states and observations.

Abstract: Measuring the gap between states and observations is an important problem in partially observable Markov theory. In this paper, we propose a novel theory of state decomposition and a simple model-free metric algorithm ($\lambda$-algorithm) that estimates this gap in partially observable Markov decision processes (POMDPs) with a stationary environment in which some state conditions are missing. To verify our idea, we design a dimension ablation method that simulates different gaps in the cliff-walking experiment with Q-learning and Sarsa. The simulation results show that $\lambda$ increases steadily as more dimensions are ablated, which suggests that $\lambda$ can adequately measure the gap.
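
The dimension ablation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the classic 4x12 cliff-walking grid with a two-dimensional state `(row, col)`; the grid size and the helper names `ablate` and `aliasing` are hypothetical and are not taken from the submission. The sketch only shows how dropping state dimensions makes distinct states share one observation. It does not implement the $\lambda$-algorithm, which is not specified on this page.

```python
from collections import defaultdict

# Assumption: the classic 4x12 cliff-walking grid, state = (row, col).
N_ROWS, N_COLS = 4, 12


def ablate(state, keep):
    """Return the observation that keeps only the listed state dimensions."""
    return tuple(state[i] for i in keep)


def aliasing(keep):
    """Map each observation to the set of underlying states that produce it."""
    groups = defaultdict(set)
    for r in range(N_ROWS):
        for c in range(N_COLS):
            groups[ablate((r, c), keep)].add((r, c))
    return groups


full = aliasing(keep=(0, 1))    # nothing ablated: observation equals state
col_only = aliasing(keep=(1,))  # row dimension ablated
none_left = aliasing(keep=())   # both dimensions ablated

for name, groups in [("full", full), ("col_only", col_only), ("none_left", none_left)]:
    largest = max(len(states) for states in groups.values())
    print(name, "observations:", len(groups), "largest alias group:", largest)
```

The more dimensions are ablated, the more states collapse onto a single observation, which is the kind of widening state-observation gap a tabular learner such as Q-learning or Sarsa would face when trained on the observations alone.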

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)

Supplementary Material: zip

