Motion-Appearance Interactive Encoding for Object Segmentation in Unconstrained Videos

IEEE Trans. Circuits Syst. Video Technol., 2020 (modified: 14 Nov 2022)
Abstract: We present a two-stage framework that integrates motion and appearance cues for foreground object segmentation in unconstrained videos. Unlike conventional methods, which encode motion and appearance patterns individually, our method places particular emphasis on their mutual assistance. We propose an interactively constrained encoding (ICE) scheme that incorporates motion and appearance patterns into a graph, leading to a spatiotemporal energy optimization. Specifically, we construct a saliency network to infer the initial foreground maps and use optical flow to capture the initial motion information. We then perform ICE in the refinement stage for object segmentation. This scheme allows our method to consistently capture structural patterns relevant to object perception throughout the framework. Our method operates on superpixels instead of raw pixels, reducing the number of graph nodes by two orders of magnitude. Moreover, we tackle object localization under inter-object occlusion via weighted bipartite graph matching. Comprehensive experiments on two benchmark datasets (SegTrack-v2 and DAVIS2016) demonstrate the effectiveness of our approach compared with state-of-the-art methods.
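
For illustration only, the sketch below shows one way the weighted bipartite matching step mentioned in the abstract could look in practice: candidate object regions in adjacent frames are associated by minimizing a cost that mixes an appearance term (color-histogram distance) and a motion term (overlap of the flow-warped previous mask with the current mask), solved with the Hungarian algorithm. The helper names, cost weights, and region representation are assumptions for this sketch and are not the paper's actual implementation.

```python
# Hypothetical sketch of weighted bipartite matching for associating object
# regions across frames (illustrative assumptions, not the paper's code).
import numpy as np
from scipy.optimize import linear_sum_assignment


def appearance_cost(hist_a, hist_b):
    """Chi-squared distance between normalized color histograms."""
    eps = 1e-8
    return 0.5 * np.sum((hist_a - hist_b) ** 2 / (hist_a + hist_b + eps))


def motion_cost(mask_prev, mask_curr, flow):
    """1 - IoU between the flow-warped previous mask and the current mask."""
    h, w = mask_prev.shape
    ys, xs = np.nonzero(mask_prev)
    # Warp foreground pixels of the previous mask with the optical flow field.
    xs_w = np.clip((xs + flow[ys, xs, 0]).round().astype(int), 0, w - 1)
    ys_w = np.clip((ys + flow[ys, xs, 1]).round().astype(int), 0, h - 1)
    warped = np.zeros_like(mask_prev)
    warped[ys_w, xs_w] = 1
    inter = np.logical_and(warped, mask_curr).sum()
    union = np.logical_or(warped, mask_curr).sum()
    return 1.0 - inter / max(union, 1)


def match_objects(prev_objs, curr_objs, flow, alpha=0.5):
    """Associate object regions between two consecutive frames.

    prev_objs / curr_objs: lists of (binary_mask, color_histogram) tuples.
    flow: dense optical flow from the previous to the current frame (H, W, 2).
    alpha: assumed weight balancing appearance vs. motion cost.
    Returns a list of (prev_index, curr_index) matches.
    """
    cost = np.zeros((len(prev_objs), len(curr_objs)))
    for i, (m_p, h_p) in enumerate(prev_objs):
        for j, (m_c, h_c) in enumerate(curr_objs):
            cost[i, j] = (alpha * appearance_cost(h_p, h_c)
                          + (1 - alpha) * motion_cost(m_p, m_c, flow))
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return list(zip(rows.tolist(), cols.tolist()))
```

A one-to-one assignment of this kind keeps each foreground object tied to a single candidate region even when objects overlap, which is the role the abstract ascribes to the bipartite matching step.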