Abstract: Reinforcement learning and planning under partial observability are notoriously difficult. In this setting, decision-making agents must perform a sequence of actions with incomplete information about the underlying state of the system. Methods that can act in the presence of incomplete state information are therefore of special interest to the machine learning, planning, and control communities. In this paper, we consider environments that behave like a partially observable Markov decision process (POMDP) with a known, discrete action space, while assuming no knowledge of the environment's structure or transition probabilities.