Multi-granularity Hierarchical Attention Siamese Network for Visual Tracking

Published: 2018, Last Modified: 17 May 2023, Venue: IJCNN 2018, Readers: Everyone
Abstract: Speed and accuracy are the two main concerns of many visual tracking methods. Recently, Siamese-network-based trackers have shown very promising potential in both respects. They use a twin network to measure the response between the target and candidate hypotheses with a fully convolutional operation. However, the learned response maps are vulnerable to background clutter and scale changes because they ignore prior knowledge such as object saliency and multi-granularity cues. To exploit such priors, this paper devises a multi-granularity hierarchical attention Siamese network tracker (MHA-Siam) that further enhances tracking stability without sacrificing real-time speed. In particular, a channel-wise attention mechanism is used to filter out the background while retaining the salient object region; then, the response maps of coarse-to-fine multi-layer features are fused to capture multi-granularity location information that helps improve tracking stability. To make full use of these response maps, MHA-Siam applies an element-wise max-and-sum operation to them to produce a reliable response map for accurate localization. Visual tracking experiments on the OTB benchmark show the superiority of MHA-Siam over its counterpart trackers at a competitive efficiency.
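The three steps named in the abstract (channel-wise attention, a correlation response between target and search features, and max-and-sum fusion) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not define the attention gate or the exact max-and-sum operation, so the sigmoid gate on each channel's mean and the reading "element-wise maximum plus element-wise sum" are assumptions, as are all function names. The sketch also assumes the per-layer response maps have already been resized to a common shape.

```python
import math


def channel_attention(features):
    """Rescale each channel (an H x W list) by a sigmoid gate of its mean.

    ASSUMPTION: a squeeze-and-gate form; the abstract does not specify it.
    """
    out = []
    for ch in features:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[v * gate for v in row] for row in ch])
    return out


def cross_correlate(search, template):
    """Slide the template over the search features (both C x H x W lists),
    summing products over channels, to get a 2D response map."""
    channels = len(search)
    sh, sw = len(search[0]), len(search[0][0])
    th, tw = len(template[0]), len(template[0][0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            total = 0.0
            for c in range(channels):
                for di in range(th):
                    for dj in range(tw):
                        total += search[c][i + di][j + dj] * template[c][di][dj]
            row.append(total)
        response.append(row)
    return response


def max_and_sum(maps):
    """Fuse same-sized response maps.

    ASSUMPTION: 'max-and-sum' is read as element-wise max plus element-wise sum.
    """
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [max(m[i][j] for m in maps) + sum(m[i][j] for m in maps)
         for j in range(cols)]
        for i in range(rows)
    ]


def peak_location(response):
    """Return the (row, col) of the largest response value."""
    _, i, j = max((v, i, j) for i, row in enumerate(response)
                  for j, v in enumerate(row))
    return i, j
```

In use, one would compute a response map per feature layer with `cross_correlate` (after applying `channel_attention` to the features), fuse the maps with `max_and_sum`, and take `peak_location` of the fused map as the estimated target position.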