Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
Vihang P. Patil, Markus Hofmarcher, Marius-Constantin Dinu, Matthias Dorfer, Patrick M. Blies, Johannes Brandstetter, José Antonio Arjona-Medina, Sepp Hochreiter
2022 (modified: 24 Apr 2023)
ICML 2022
Readers: Everyone
Abstract:
Reinforcement learning algorithms require many samples when solving complex hierarchical tasks with sparse and delayed rewards. For such complex tasks, the recently proposed RUDDER uses reward redi...