Quantile Regression Reinforcement Learning with State Aligned Vector Rewards

30 Sept 2018 (modified: 05 May 2023)
NIPS 2018 Workshop Spatiotemporal Blind Submission
Readers: Everyone
Keywords: deep reinforcement learning, quantile regression, vector reward
TL;DR: Using state aligned vector rewards and a new reinforcement learning technique inspired by quantile regression, we train an agent that predicts state changes from action distributions.
Abstract: The sample efficiency of deep reinforcement learning (DRL) algorithms is limited by the weak scalar training signal. We propose using state aligned vector rewards to capitalize on the spatiotemporal nature of reaching problems, and we show that a state change distribution can be learned given an action distribution. Our agent, trained with a new DRL method inspired by quantile regression, learns multiple times faster in high-dimensional state spaces than a classical DRL algorithm.
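For readers unfamiliar with quantile regression, the sketch below shows the standard pinball (quantile) loss and a simple subgradient fit of a single quantile. It is only background on the textbook technique that the abstract names as inspiration, not the authors' DRL method; the function names `pinball_loss` and `fit_quantile`, the learning rate, and the step count are illustrative assumptions.

```python
import random


def pinball_loss(target, prediction, tau):
    """Standard pinball loss for a quantile level tau in (0, 1).

    Under-predictions (target > prediction) are weighted by tau,
    over-predictions by (1 - tau).
    """
    diff = target - prediction
    return max(tau * diff, (tau - 1.0) * diff)


def fit_quantile(samples, tau, lr=0.05, steps=2000):
    """Estimate the tau-quantile of samples by subgradient descent.

    Hypothetical helper for illustration; lr and steps are arbitrary choices.
    """
    n = len(samples)
    q = sum(samples) / n  # start from the sample mean
    for _ in range(steps):
        # Subgradient of the mean pinball loss w.r.t. q is
        # (fraction of samples below q) - tau, ignoring ties.
        below = sum(1 for s in samples if s < q) / n
        q -= lr * (below - tau)
    return q


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(2000)]
    print("estimated 0.9-quantile:", fit_quantile(data, 0.9))
```

Minimizing this asymmetric loss drives the estimate toward the value that a fraction tau of the samples falls below, which is what lets quantile-based methods describe a distribution rather than only its mean.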