Keywords: reinforcement learning, computer vision, data augmentation, robotics
Abstract: $Q$-learning algorithms are appealing for real-world applications due to their
data-efficiency, but they are prone to overfitting and training instability when
trained from visual observations. Prior work, namely SVEA, finds that selective application
of data augmentation can improve the visual generalization of RL agents without
destabilizing training. We revisit its recipe for data augmentation and find an
assumption that limits its effectiveness to photometric augmentations.
Addressing this limitation, we propose a generalized recipe, SADA, that works with a wider
variety of augmentations. We benchmark its effectiveness on DMC-GB2 – our
proposed extension of the popular DMControl Generalization Benchmark – as well
as tasks from Meta-World and the Distracting Control Suite, and find that
SADA greatly improves training stability and generalization of RL agents across
a diverse set of augmentations. Visualizations, code, and the benchmark are available at: https://aalmuzairee.github.io/SADA
Submission Number: 26