Keywords: reinforcement learning, rl, visual reinforcement learning, robot learning, robot manipulation, representation learning, augmentation, augmentations reinforcement learning, multi view reinforcement learning, multi view robot learning
Abstract: Vision is widely used in robot manipulation, especially in visual servoing. Making it robust requires multiple cameras to expand the field of view, which is computationally challenging. Merging multiple views and training with Q-learning allows the design of more effective representations and improves sample efficiency, but such a solution might be expensive to deploy. To mitigate this, we introduce a merge and disentanglement (MAD) algorithm that efficiently merges views to increase sample efficiency while augmenting with single-view features to allow lightweight deployment and ensure robust policies. We demonstrate the efficiency and robustness of our approach on Meta-World and ManiSkill3.
Spotlight: mp4
Submission Number: 719