Keywords: Colored Noise, Deep Reinforcement Learning, On-policy Algorithms, Continuous Control
Abstract: Colored noise, a class of temporally correlated noise processes, has shown
promising results for improving exploration in deep reinforcement learning
for both off-policy and on-policy algorithms. However, it is unclear how the
temporal correlation of colored noise affects policy learning beyond changing
exploration properties. In this paper, we investigate the implications of
colored noise for on-policy deep reinforcement learning in a simplified setting,
considering linear dynamics and a linear policy under quadratic costs. We
derive a closed-form solution for the expected cost, revealing that colored
noise affects both the expected cost and the optimal policy. Notably, the
cost splits into a state-cost term equal to the unperturbed system’s cost and
a noise-cost term that affects the policy but is independent of the initial state.
While the cost depends on the noise color, the expected trajectory
of a given linear policy does not. Far from
the goal state, the state cost dominates and the effect of the noise
is negligible: the optimal policy approaches that of the unperturbed
system. Near the goal state, the noise cost dominates, changing the optimal
policy.
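For intuition, here is a minimal sketch of the kind of setting the abstract describes, in standard LQR-style notation; the symbols $A, B, K, Q, R$, the placement of the noise on the actions, and the zero-mean assumption are ours, not taken from the paper:
\begin{align*}
x_{t+1} &= A x_t + B u_t, \qquad u_t = K x_t + \varepsilon_t, \\
J(x_0) &= \mathbb{E}\Big[\sum\nolimits_t x_t^\top Q x_t + u_t^\top R u_t\Big],
\end{align*}
with $\varepsilon_t$ a zero-mean colored-noise process. By linearity, $\mathbb{E}[x_t] = (A + BK)^t x_0$, so the expected trajectory is the noise-free closed-loop trajectory regardless of the noise color. Splitting each stage cost into mean and fluctuation parts,
\[
\mathbb{E}\big[x_t^\top Q x_t + u_t^\top R u_t\big]
= \mathbb{E}[x_t]^\top \big(Q + K^\top R K\big)\, \mathbb{E}[x_t]
+ \operatorname{tr}\!\big(Q\,\mathrm{Cov}(x_t)\big)
+ \operatorname{tr}\!\big(R\,\mathrm{Cov}(u_t)\big),
\]
gives $J(x_0) = J_{\text{state}}(x_0) + J_{\text{noise}}$, where $J_{\text{state}}(x_0)$ equals the unperturbed system’s cost under the same policy, and $J_{\text{noise}}$ depends on the noise autocovariance $\mathbb{E}[\varepsilon_s \varepsilon_t^\top]$ (hence on the color) and on $K$, but not on $x_0$. This mirrors the abstract’s claims: far from the goal, $J_{\text{state}}(x_0)$ scales with $\|x_0\|^2$ and dominates, while near the goal the $x_0$-independent $J_{\text{noise}}$ dominates and shifts the optimal $K$.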
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Jakob_Hollenstein1
Track: Regular Track: unpublished work
Submission Number: 160