Abstract: Single-target trackers based on Siamese networks treat tracking as a similarity-matching process. The convolutional features of the template branch and the search-area branch are matched and fused by a correlation operation. However, correlation is a local linear matching, which limits the tracker's ability to capture the complex nonlinear relationship between the template branch and the search-area branch, and it easily loses useful information. Moreover, most trackers do not update the template, and the template branch and the search-area branch compute convolutional features independently, without information exchange. To solve these problems, a graph attention information fusion for Siamese adaptive attention tracking network (GIFT) is proposed. A Siamese adaptive attention (SAA) module connects the information flow between the template branch and the search-area branch, so that the template information is updated indirectly. A graph attention information fusion (GAIF) module effectively fuses the information of the two branches and realizes similarity matching between their corresponding parts. Layerwise aggregation makes full use of both the shallow and deep features of the network, which further improves tracking performance. Experiments on six challenging benchmarks (GOT-10k, OTB100, VOT2018, VOT2019, UAV123 and LaSOT) demonstrate that GIFT achieves leading performance and runs at 28.34 FPS, which exceeds the real-time threshold of 25 FPS.
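
The contrast between correlation-based matching and attention-style fusion can be illustrated with a minimal NumPy sketch. This is not the GIFT implementation: the function names, the feature shapes, and the scaled dot-product form of the attention are all assumptions made for illustration, since the abstract does not specify how GAIF is formulated.

```python
import numpy as np

def cross_correlation(template, search):
    """Slide template features (C, h, w) over search features (C, H, W).

    Each response value is a plain inner product between the template and
    one search window, so the operation is local and linear.
    """
    _, h, w = template.shape
    _, H, W = search.shape
    response = np.empty((H - h + 1, W - w + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            response[i, j] = np.sum(template * search[:, i:i + h, j:j + w])
    return response

def attention_fusion(template, search):
    """Hypothetical attention-style fusion (an assumption, not GAIF itself).

    Every search location attends to every template location; the
    softmax-weighted template features are concatenated to the search
    features, giving an output of shape (2C, H, W).
    """
    c = template.shape[0]
    t = template.reshape(c, -1).T            # (Nt, C) template nodes
    s = search.reshape(c, -1).T              # (Ns, C) search nodes
    scores = s @ t.T / np.sqrt(c)            # (Ns, Nt) pairwise similarity
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over template nodes
    aggregated = weights @ t                 # (Ns, C) template info per search node
    fused = np.concatenate([s, aggregated], axis=1)  # (Ns, 2C)
    return fused.T.reshape(2 * c, *search.shape[1:])

rng = np.random.default_rng(0)
template = rng.standard_normal((4, 3, 3))   # illustrative sizes only
search = rng.standard_normal((4, 8, 8))

response = cross_correlation(template, search)
fused = attention_fusion(template, search)
print(response.shape)  # (6, 6)
print(fused.shape)     # (8, 8, 8)
```

In this sketch `cross_correlation` collapses each search window to a single score and is linear in the search features, whereas the softmax in `attention_fusion` makes the matching nonlinear and keeps a full feature vector per search location.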