A Max-Min Entropy Framework for Reinforcement Learning

Published: 09 Nov 2021, Last Modified: 22 Oct 2023, NeurIPS 2021 Poster
Keywords: Reinforcement Learning, entropy regularization, max-min optimization framework, exploration
Abstract: In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome a limitation of the soft actor-critic (SAC) algorithm, which implements maximum entropy RL in model-free, sample-based learning. Whereas maximum entropy RL guides the policy to reach states with high entropy in the future, the proposed max-min entropy framework aims to learn to visit states with low entropy and to maximize the entropy of these low-entropy states in order to promote better exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on the disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields substantial performance improvement over current state-of-the-art RL algorithms.
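For context, the standard maximum entropy objective optimized by SAC is shown below, together with a schematic max-min form illustrating the contrast described in the abstract. The schematic is only an illustrative sketch of the idea (visit low-entropy states and raise their entropy), not the paper's exact objective; see the paper for the precise formulation.

% Standard maximum entropy RL objective optimized by SAC (well known):
J_{\text{MaxEnt}}(\pi) \;=\; \sum_{t} \mathbb{E}_{(s_t,a_t)\sim\rho_\pi}\!\left[\, r(s_t,a_t) \;+\; \alpha\,\mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\right]

% Schematic max-min view (illustrative only, not the paper's formulation):
% rather than adding a per-state entropy bonus, drive the policy toward
% low-entropy states and maximize the entropy there, i.e., raise the
% worst-case (minimum) policy entropy over visited states:
\max_{\pi}\;\; \min_{s \,\in\, \operatorname{supp}(\rho_\pi)} \mathcal{H}\big(\pi(\cdot \mid s)\big)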
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
TL;DR: We propose a max-min entropy framework to practically overcome the limitation of the soft actor-critic algorithm, which implements maximum entropy RL.
Supplementary Material: zip
Code: https://github.com/seungyulhan/mme
Community Implementations: 4 code implementations (https://www.catalyzex.com/paper/arxiv:2106.10517/code)