Keywords: afterstate RL, actor-critic
TL;DR: We enhance actor-critic algorithms by incorporating 'afterstate' dynamics, improving critic evaluation in continuous control tasks and enabling more efficient decision-making.
Abstract: Humans inherently consider the consequences of their actions during decision-making, often visualizing outcomes before committing to a choice. Actor-critic algorithms, a cornerstone of reinforcement learning, typically involve a critic that evaluates the actions proposed by the actor. Unlike in value-based methods such as Deep Q-Networks, this evaluation mechanism explicitly incorporates action information. However, this setup imposes a dual burden on the critic: (I) understanding the immediate effect of an action on the environment, and (II) discerning its long-term implications. Our research offloads the first component by leveraging a model-based RL framework to enhance the critic's predictive capabilities. We posit that actions culminating in the same subsequent state are functionally equivalent. Through experiments on continuous control tasks such as robot operation and artistic creation, we demonstrate the superiority of our approach, establishing a new paradigm for actor-critic methodologies.
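The sketch below is not the authors' code; it is a minimal illustration of the afterstate idea described in the abstract, assuming a learned dynamics model (here `DynamicsModel`) that predicts the state an action leads to, and a critic (here `AfterstateCritic`) that values only that predicted afterstate. All names, layer sizes, and dimensions are hypothetical.

```python
# Minimal sketch of an afterstate critic (illustrative only): an action is
# scored by the state it is predicted to reach, rather than by the raw
# (state, action) pair, so actions reaching the same afterstate get the same value.
import torch
import torch.nn as nn


class DynamicsModel(nn.Module):
    """Predicts the afterstate s' reached from (s, a) -- the immediate effect, part (I)."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))


class AfterstateCritic(nn.Module):
    """Values the predicted afterstate only -- the long-term implications, part (II)."""

    def __init__(self, state_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, afterstate: torch.Tensor) -> torch.Tensor:
        return self.net(afterstate)


# Usage: the actor's proposed action is evaluated through the model-critic chain.
state_dim, action_dim = 17, 6            # illustrative dimensions for a control task
model = DynamicsModel(state_dim, action_dim)
critic = AfterstateCritic(state_dim)

state = torch.randn(32, state_dim)       # batch of states
action = torch.randn(32, action_dim)     # actions proposed by the actor
afterstate = model(state, action)        # part (I): predicted immediate outcome
value = critic(afterstate)               # part (II): value of that outcome
```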
Submission Number: 23