Pareto Policy Adaptation
Panagiotis Kyriakis, Jyotirmoy Deshmukh, Paul Bogdan
2022 (modified: 17 Apr 2023)
ICLR 2022
Readers: Everyone
Abstract:
We present a policy gradient method for Multi-Objective Reinforcement Learning under unknown, linear preferences. By enforcing Pareto stationarity, a first-order condition for Pareto optimality, we...
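As an illustration of the Pareto stationarity condition named in the abstract, here is a minimal sketch for the two-objective case. It relies on the standard definition: a point is Pareto stationary when some convex combination of the objective gradients is zero. The function names `min_norm_combination` and `is_pareto_stationary` and the tolerance `tol` are hypothetical, chosen for this example only; this is not the authors' algorithm, which the truncated abstract does not describe.

```python
import math


def min_norm_combination(g1, g2):
    """Return (alpha, d), where d = alpha*g1 + (1 - alpha)*g2 has the
    smallest Euclidean norm over alpha in [0, 1].

    g1 and g2 are the gradients of two objectives at the same parameters,
    given as equal-length sequences of floats (an assumption of this sketch).
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(x * x for x in diff)
    if denom == 0.0:
        # Identical gradients: every alpha yields the same vector.
        alpha = 0.5
    else:
        # Unconstrained minimizer of ||g2 + alpha*(g1 - g2)||^2, clipped to [0, 1].
        raw = -sum(x * b for x, b in zip(diff, g2)) / denom
        alpha = min(1.0, max(0.0, raw))
    d = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return alpha, d


def is_pareto_stationary(g1, g2, tol=1e-8):
    """True if some convex combination of g1 and g2 is (numerically) zero."""
    _, d = min_norm_combination(g1, g2)
    return math.sqrt(sum(x * x for x in d)) <= tol


# Opposing gradients: no direction improves both objectives at first order.
opposing = is_pareto_stationary([1.0, 0.0], [-1.0, 0.0])
# Orthogonal gradients: the direction (0.5, 0.5) still improves both.
orthogonal = is_pareto_stationary([1.0, 0.0], [0.0, 1.0])
```

With more than two objectives the same check becomes a small quadratic program over the probability simplex; the closed form above applies only to the two-gradient case.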