Pink Noise LQR: How does Colored Noise affect the Optimal Policy in RL?

Published: 17 Jul 2025, Last Modified: 06 Sept 2025. EWRL 2025 Poster. License: CC BY 4.0
Keywords: Colored Noise, Deep Reinforcement Learning, On-policy Algorithms, Continuous Control
Abstract: Colored noise, a class of temporally correlated noise processes, has shown promising results for improving exploration in deep reinforcement learning for both off-policy and on-policy algorithms. However, it is unclear how temporally correlated colored noise affects policy learning beyond changing exploration properties. In this paper, we investigate the implications of colored noise for on-policy deep reinforcement learning in a simplified setting: linear dynamics and a linear policy under quadratic costs. We derive a closed-form solution for the expected cost, revealing that colored noise affects both the expected cost and the optimal policy. Notably, the cost splits into a state-cost term equal to the unperturbed system’s cost and a noise-cost term that affects the policy but is independent of the initial state. While the cost changes with the noise, the expected trajectory is independent of the noise color for a given linear policy. Far from the goal state, the state cost dominates and the effect of the noise is negligible: the policy approaches the optimal policy of the unperturbed system. Near the goal state, the noise cost dominates, changing the optimal policy.
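The setting the abstract describes can be sketched numerically. Below is a minimal, hypothetical illustration (not the paper's code): colored noise is sampled by shaping a random spectrum as 1/f^(β/2), where β = 0 gives white noise, β = 1 pink noise, and β = 2 red noise; the noise then drives a scalar linear system x_{t+1} = a·x_t + b·u_t + ε_t under a linear policy u = −k·x with quadratic cost q·x² + r·u². All parameter values are illustrative assumptions.

```python
import numpy as np

def colored_noise(beta, n, rng):
    """Sample length-n noise with power spectrum ~ 1/f^beta via FFT filtering.

    beta = 0: white noise; beta = 1: pink noise; beta = 2: red noise.
    Output is normalized to unit standard deviation.
    """
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]  # avoid division by zero at the DC component
    amplitudes = freqs ** (-beta / 2.0)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(freqs))
    spectrum = amplitudes * np.exp(1j * phases)
    noise = np.fft.irfft(spectrum, n=n)
    return noise / noise.std()

def lqr_rollout_cost(k, beta, T=2000, a=1.0, b=1.0, q=1.0, r=0.1,
                     x0=5.0, seed=0):
    """Average quadratic cost of the linear policy u = -k*x on a scalar
    linear system perturbed by colored noise (single Monte Carlo rollout)."""
    rng = np.random.default_rng(seed)
    eps = colored_noise(beta, T, rng)
    x, cost = x0, 0.0
    for t in range(T):
        u = -k * x
        cost += q * x**2 + r * u**2
        x = a * x + b * u + eps[t]
    return cost / T
```

Comparing `lqr_rollout_cost(k, beta)` across β for the same gain k gives an empirical handle on the abstract's claim that the noise color changes the cost (and hence the cost-minimizing policy) even though the expected trajectory does not depend on β.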
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Jakob_Hollenstein1
Track: Regular Track: unpublished work
Submission Number: 160