Efficient Reinforcement Learning Using State-Action Uncertainty with Multiple Heads

Published: 01 Jan 2023, Last Modified: 07 May 2024 · ICANN (8) 2023 · CC BY-SA 4.0
Abstract: In reinforcement learning, an agent learns the optimal actions for a task by maximizing rewards in an environment. At each time step during learning, the agent chooses between exploration and exploitation. In exploration, the agent searches for new experiences, each defined by a state, an action, a reward, and the next state; in exploitation, it tries to maximize rewards based on the experiences collected so far. The exploration-exploitation trade-off is a central issue in reinforcement learning. In previous work, this trade-off is managed based on how uncertain the agent's current knowledge of the environment is. Whereas only simple criteria for this uncertainty have been explored in the literature, this paper evaluates richer uncertainty criteria for efficient reinforcement learning. Our novel uncertainty criterion uses the agent's multiple simultaneous decisions. In addition, we propose to exploit these multiple decisions to bridge the gap between exploration and exploitation through a novel exploitation mode. Extensive experiments verify the effectiveness of our approaches for efficient learning. We also compare the learning efficiency of different learning strategies to determine which strategy is better suited for each task.
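The abstract's idea of deriving state-action uncertainty from multiple simultaneous decisions can be illustrated with a toy multi-head Q-value model. This is only a hedged sketch under assumed details (the class name `MultiHeadQ`, linear heads with random weights standing in for independently trained heads, and standard deviation across heads as the disagreement measure are all illustrative choices, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

class MultiHeadQ:
    """Toy multi-head Q-value model (illustrative, not the paper's method).

    Each head is a linear map from state features to per-action Q-values.
    Disagreement (standard deviation) across heads serves as a
    state-action uncertainty estimate."""

    def __init__(self, n_heads, state_dim, n_actions):
        # Random weights stand in for independently trained heads.
        self.W = rng.normal(size=(n_heads, n_actions, state_dim))

    def q_values(self, state):
        # Q-values from every head; shape: (n_heads, n_actions).
        return self.W @ state

    def uncertainty(self, state):
        # Per-action disagreement across the heads.
        return self.q_values(state).std(axis=0)

    def act(self, state, explore):
        q = self.q_values(state)
        if explore:
            # Exploration: prefer the action the heads disagree on most.
            return int(np.argmax(self.uncertainty(state)))
        # Exploitation: greedy w.r.t. the head-averaged Q-values.
        return int(np.argmax(q.mean(axis=0)))

model = MultiHeadQ(n_heads=5, state_dim=4, n_actions=3)
s = rng.normal(size=4)
print("explore action:", model.act(s, explore=True))
print("exploit action:", model.act(s, explore=False))
```

In this sketch, low disagreement among the heads signals that the agent's knowledge of a state-action pair is already reliable, so greedy exploitation is reasonable; high disagreement marks pairs worth exploring.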