Pareto Policy Adaptation

2022 (modified: 17 Apr 2023), ICLR 2022, Readers: Everyone
Abstract: We present a policy gradient method for Multi-Objective Reinforcement Learning under unknown, linear preferences. By enforcing Pareto stationarity, a first-order condition for Pareto optimality, we...