Abstract: We propose a novel online Attentional Recurrent Neural Network (ARNN) model for visual tracking, which uses Convolutional Neural Network (CNN) feature maps inside a bounding box to determine whether the enclosed target is the one that appeared in previous frames. Attention mechanisms are applied both across different parts of the target and across different scales of object features. The region-based attention selects the regions that matter most for tracking the target, while the scale-based attention learns to weight multi-scale features for accurate object localization. We train the recurrent network jointly with the region-based and scale-based attention mechanisms. Experimental results validate the effectiveness of the proposed ARNN and show that it outperforms state-of-the-art tracking methods.
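The two attention stages described above can be illustrated with a minimal sketch. It is not the ARNN implementation: the abstract does not specify how attention scores are computed, so the dot-product scoring, the fixed query vectors, and the names `attend` and `region_scale_attention` are all assumptions made for illustration, and the recurrent network and CNN are omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(vectors, query):
    """Softmax-weighted sum of vectors.

    Scoring by dot product with a query vector is an assumption here;
    in a trained model the scores would come from learned parameters.
    """
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def region_scale_attention(features, region_query, scale_query):
    """features[s][r] is the feature vector of region r at scale s."""
    # Stage 1 (region-based): pool the regions within each scale.
    per_scale = [attend(regions, region_query)[0] for regions in features]
    # Stage 2 (scale-based): weight the per-scale descriptors.
    return attend(per_scale, scale_query)


# Toy input: 2 scales, 2 regions per scale, 2-dimensional features.
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.2, 0.8]],
]
descriptor, scale_weights = region_scale_attention(
    features, region_query=[1.0, 0.0], scale_query=[0.0, 1.0]
)
```

Here `descriptor` is a single feature vector for the bounding box and `scale_weights` sums to 1. In a jointly trained model, both sets of attention weights would be learned together with the recurrent network rather than derived from fixed queries.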