Distractor-Aware Video Object Segmentation

Andreas Robinson, Abdelrahman Eldesokey, Michael Felsberg

2021 (modified: 09 Nov 2022), GCPR 2021, Readers: Everyone

Abstract: Semi-supervised video object segmentation is a challenging task that aims to segment a target throughout a video sequence given an initial mask in the first frame. Discriminative approaches have demonstrated competitive performance on this task at reasonable computational complexity. These approaches typically formulate the problem as a one-versus-one classification between the target and the background. In practice, however, a video sequence usually contains a target, background, and possibly other distracting objects. These distractors increase the risk of false positives, especially when they are visually similar to the target. It is therefore more effective to separate distractors from the background and handle them independently. We propose a one-versus-many scheme that addresses this situation by assigning distractors to their own class. This separation makes it possible to focus attention on the challenging regions that are most likely to degrade performance. We demonstrate the benefit of this formulation by modifying the learning-what-to-learn method [3] to be distractor-aware. Our proposed approach sets a new state of the art on the DAVIS 2017 validation dataset and improves over the baseline on the DAVIS 2017 test-dev benchmark by 4.6 percentage points.
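The difference between the two labelling schemes can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes that a frame is given as a per-pixel instance-id map, that id 0 denotes background, and that every annotated object other than the target counts as a distractor. The label ids and the function names `three_class_labels` and `two_class_labels` are hypothetical and chosen only for illustration.

```python
# Hypothetical label ids, chosen for illustration only.
BACKGROUND, TARGET, DISTRACTOR = 0, 1, 2


def three_class_labels(instance_map, target_id):
    """Map a per-pixel instance-id map to background/target/distractor labels.

    Assumption: id 0 means background, and every other non-target id
    is treated as a distractor.
    """
    labels = []
    for row in instance_map:
        out_row = []
        for obj_id in row:
            if obj_id == 0:
                out_row.append(BACKGROUND)
            elif obj_id == target_id:
                out_row.append(TARGET)
            else:
                out_row.append(DISTRACTOR)
        labels.append(out_row)
    return labels


def two_class_labels(instance_map, target_id):
    """Target-vs-background labelling: distractors fold into the background."""
    return [
        [TARGET if obj_id == target_id else BACKGROUND for obj_id in row]
        for row in instance_map
    ]


# Toy 2x4 frame with two annotated objects (ids 1 and 2); object 2 is the target.
frame = [
    [0, 1, 1, 0],
    [0, 2, 2, 0],
]

print(three_class_labels(frame, target_id=2))  # [[0, 2, 2, 0], [0, 1, 1, 0]]
print(two_class_labels(frame, target_id=2))    # [[0, 0, 0, 0], [0, 1, 1, 0]]
```

In the two-class output, the pixels of object 1 are indistinguishable from background, whereas the three-class output keeps them as a separate class that a classifier could be trained to reject explicitly.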
