Boosting the Actor with Dual Critic

Bo Dai; Albert Shaw; Niao He; Lihong Li; Le Song

15 Feb 2018 (modified: 22 Jun 2025) | ICLR 2018 Conference Blind Submission | Readers: Everyone

Abstract: This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic, or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function called the dual critic. Compared to its actor-critic relatives, Dual-AC has the desirable property that the actor and dual critic are updated cooperatively to optimize the same objective function, providing a more transparent way of learning a critic that is directly related to the actor's objective. We then provide a concrete algorithm that can effectively solve the minimax optimization problem, using multi-step bootstrapping, path regularization, and stochastic dual ascent. We demonstrate that the proposed algorithm achieves state-of-the-art performance across several benchmarks.
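
For readers unfamiliar with the construction the abstract refers to, the following is a sketch of the standard linear-programming view of the Bellman optimality equation and its Lagrangian. The notation is an assumption made for illustration and is not quoted from the abstract: discount factor $\gamma$, initial-state distribution $\mu$, reward $R$, transition kernel $p$, value function $V$, and nonnegative multipliers $\rho$.

```latex
% Primal LP: the optimal value function V^* is the smallest V satisfying
% the Bellman inequality for every state-action pair (assumed notation).
\min_{V}\; (1-\gamma)\,\mathbb{E}_{s\sim\mu}\bigl[V(s)\bigr]
\quad \text{s.t.} \quad
V(s) \;\ge\; R(s,a) + \gamma\,\mathbb{E}_{s'\sim p(\cdot\mid s,a)}\bigl[V(s')\bigr]
\quad \forall (s,a).

% Lagrangian: one multiplier \rho(s,a) \ge 0 per constraint gives a
% min-max (two-player) problem over V and \rho.
\min_{V}\;\max_{\rho\ge 0}\;
L(V,\rho) \;=\; (1-\gamma)\,\mathbb{E}_{s\sim\mu}\bigl[V(s)\bigr]
+ \sum_{s,a}\rho(s,a)\Bigl(R(s,a)
+ \gamma\,\mathbb{E}_{s'\sim p(\cdot\mid s,a)}\bigl[V(s')\bigr] - V(s)\Bigr).

% Factoring the multiplier as a state weight times a policy,
% \rho(s,a) = \alpha(s)\,\pi(a\mid s), exposes the actor \pi inside the game.
```

In this reading, the multiplier side of the game contains the policy (the actor) and $V$ plays the role of the critic-like function; how the paper names and parameterizes each player should be checked against the full text.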

TL;DR: We propose the Dual Actor-Critic algorithm, which is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation. The algorithm achieves state-of-the-art performance across several benchmarks.

Keywords: reinforcement learning, actor-critic algorithm, Lagrangian duality

Data: [MuJoCo](https://paperswithcode.com/dataset/mujoco)

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/boosting-the-actor-with-dual-critic/code)
