HiPPO: Enhancing proximal policy optimization with highlight replay

Published: 01 Jan 2025, Last Modified: 22 Jul 2025 | Pattern Recognit. 2025 | CC BY-SA 4.0
Abstract: Highlights
• We propose a novel highlight replay mechanism to enhance proximal policy optimization (HiPPO).
• We select three key properties as the basis for highlight replaying.
• The introduced reward-constrained optimization alleviates the constraint of policy similarity.
• HiPPO outperforms state-of-the-art approximate policy algorithms on MuJoCo.
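The highlights above only name the ingredients, so the following is a minimal, illustrative sketch of the general idea of "highlight replay": keep a small buffer of the best-scoring trajectories and mix them into on-policy update batches. All class and function names are hypothetical, the selection criterion (episode return) is a stand-in assumption, and neither the paper's three selection properties nor its reward-constrained objective are reproduced here.

```python
# Toy sketch of a highlight replay buffer mixed into on-policy batches.
# Assumptions: trajectories are scored by total reward (stand-in for the
# paper's actual selection properties); names are illustrative only.
import heapq
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(order=True)
class Trajectory:
    score: float                                       # priority used for selection
    transitions: List[Tuple] = field(compare=False)    # (state, action, reward) tuples


class HighlightBuffer:
    """Keeps the top-k trajectories ranked by a highlight score."""

    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self._heap: List[Trajectory] = []   # min-heap: lowest-scoring trajectory evicted first

    def add(self, transitions: List[Tuple], score: float) -> None:
        traj = Trajectory(score=score, transitions=transitions)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, traj)
        elif score > self._heap[0].score:
            heapq.heapreplace(self._heap, traj)

    def sample(self, n_transitions: int) -> List[Tuple]:
        pool = [t for traj in self._heap for t in traj.transitions]
        if not pool:
            return []
        return random.sample(pool, min(n_transitions, len(pool)))


def build_update_batch(on_policy: List[Tuple],
                       buffer: HighlightBuffer,
                       replay_ratio: float = 0.25) -> List[Tuple]:
    """Mix fresh on-policy transitions with replayed highlight transitions."""
    n_replay = int(len(on_policy) * replay_ratio)
    return on_policy + buffer.sample(n_replay)


if __name__ == "__main__":
    buf = HighlightBuffer(capacity=4)
    # Fake trajectories, scored by total reward as a stand-in criterion.
    for ep in range(10):
        traj = [(f"s{ep}_{t}", f"a{ep}_{t}", random.random()) for t in range(5)]
        buf.add(traj, score=sum(r for _, _, r in traj))
    fresh = [(f"s_new_{t}", f"a_new_{t}", 0.0) for t in range(16)]
    batch = build_update_batch(fresh, buf, replay_ratio=0.25)
    print(f"batch size: {len(batch)} (16 on-policy + {len(batch) - 16} replayed)")
```

In this sketch the replayed transitions would still need off-policy corrections (or a relaxed similarity constraint, as the abstract's reward-constrained optimization suggests) before being used in a clipped PPO update; that part is deliberately left out.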