Generalized Policy Improvement Algorithms With Theoretically Supported Sample Reuse

James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras

Published: 01 Feb 2025, Last Modified: 27 Jan 2026, Venue: IEEE Transactions on Automatic Control, License: CC BY-SA 4.0

Abstract: We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a tradeoff between two important deployment requirements for real-world control: 1) practical performance guarantees; and 2) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.

External IDs: doi:10.1109/tac.2024.3454011