Keywords: afterstate RL, actor-critic
TL;DR: We enhance actor-critic algorithms by incorporating 'afterstate' dynamics, improving critic evaluation in continuous control tasks and enabling more efficient decision-making.
Abstract: Humans inherently consider the consequences of their actions during decision-making, often visualizing outcomes before committing to a choice. Actor-critic algorithms, a cornerstone of reinforcement learning, typically involve a critic that evaluates the actions proposed by the actor. Unlike in value-based methods such as Deep Q-Networks, this evaluation mechanism explicitly incorporates action information. However, this setup imposes a dual burden on the critic: (I) understanding the immediate effect of an action on the environment, and (II) discerning its long-term implications. Our research offloads the first component by leveraging a model-based RL framework to enhance the critic's predictive capabilities. We posit that actions culminating in the same subsequent state are functionally equivalent. Through experiments on continuous control tasks such as robot operation and artistic creation, we demonstrate the superiority of our approach, establishing a new paradigm for actor-critic methodologies.
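The sketch below is not the authors' code; it is a minimal illustration of the afterstate idea described in the abstract, assuming a learned dynamics model (here `DynamicsModel`) that predicts the state an action leads to, and a critic (here `AfterstateCritic`) that values only that predicted afterstate. All names, layer sizes, and dimensions are hypothetical.

```python
# Minimal sketch of an afterstate critic (illustrative only): an action is
# scored by the state it is predicted to reach, rather than by the raw
# (state, action) pair, so actions reaching the same afterstate get the same value.
import torch
import torch.nn as nn


class DynamicsModel(nn.Module):
    """Predicts the afterstate s' reached from (s, a) -- the immediate effect, part (I)."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))


class AfterstateCritic(nn.Module):
    """Values the predicted afterstate only -- the long-term implications, part (II)."""

    def __init__(self, state_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, afterstate: torch.Tensor) -> torch.Tensor:
        return self.net(afterstate)


# Usage: the actor's proposed action is evaluated through the model-critic chain.
state_dim, action_dim = 17, 6            # illustrative dimensions for a control task
model = DynamicsModel(state_dim, action_dim)
critic = AfterstateCritic(state_dim)

state = torch.randn(32, state_dim)       # batch of states
action = torch.randn(32, action_dim)     # actions proposed by the actor
afterstate = model(state, action)        # part (I): predicted immediate outcome
value = critic(afterstate)               # part (II): value of that outcome
```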
Submission Number: 23