Keywords: deep reinforcement learning, generalization, robotics
TL;DR: We introduce MaDi, an actor-critic algorithm which Masks Distractions to significantly boost generalization on DMControl-GB, the Distracting Control Suite, and a real UR5 robot.
Abstract: The visual world provides an abundance of information, but many of the input pixels an agent receives contain distracting stimuli. Autonomous agents need the ability to distinguish useful information from task-irrelevant perceptions, enabling them to generalize to unseen environments with new distractions. Existing works approach this problem using data augmentation or large auxiliary networks with additional loss functions. We introduce MaDi, a novel algorithm that learns to mask distractions using only the reward signal. In MaDi, the conventional actor-critic structure of deep reinforcement learning agents is complemented by a small third sibling, the Masker. This lightweight neural network generates a mask that determines what the actor and critic receive, so that they can focus on learning the task. We run experiments on the DeepMind Control Generalization Benchmark, the Distracting Control Suite, and a real UR5 Robotic Arm. Our algorithm improves the agent’s focus with useful masks, while its efficient Masker network adds only 0.2% more parameters to the original structure, in contrast to the large auxiliary networks used in previous work. MaDi consistently achieves generalization results better than or competitive with state-of-the-art methods.
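The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the Masker architecture, so the per-pixel linear map (a 1x1 convolution) followed by a sigmoid, the function names `masker_forward` and `apply_mask`, the image size, and the random weights are all assumptions made for illustration. Only the forward pass is shown; according to the abstract, the Masker is trained from the reward signal alone, which this sketch omits.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def masker_forward(obs, weights, bias):
    """Hypothetical Masker: a per-pixel linear map over channels plus sigmoid.

    obs:     image observation with shape (C, H, W)
    weights: one weight per channel, shape (C,)
    bias:    scalar offset
    Returns a soft mask with shape (H, W) and values in (0, 1).
    """
    # Contract the channel axis: (C,) with (C, H, W) -> (H, W)
    logits = np.tensordot(weights, obs, axes=([0], [0])) + bias
    return sigmoid(logits)


def apply_mask(obs, mask):
    """Scale every channel of the observation by the single-channel mask."""
    return obs * mask[None, :, :]


# Toy example with assumed shapes and random (untrained) parameters.
rng = np.random.default_rng(0)
obs = rng.random((3, 8, 8))      # RGB observation with values in [0, 1)
weights = rng.normal(size=3)     # assumed Masker parameters
bias = 0.0

mask = masker_forward(obs, weights, bias)
masked_obs = apply_mask(obs, mask)  # what the actor and critic would receive
```

In this sketch both the actor and the critic would consume `masked_obs` instead of `obs`, so pixels whose mask value is close to zero contribute little to either network's input.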
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Bram_Grooten1
Track: Fast Track: published work
Publication Link: https://www.ifaamas.org/Proceedings/aamas2024/pdfs/p733.pdf
Submission Number: 48