Abstract: Long-term tracking algorithms must track targets stably over long videos. Compared with short-term tracking, long-term tracking faces more complex challenges, such as background brightness that spans the entire gray-level range, a wide range of target scale changes, and more frequent motion blur and occlusion. Existing long-term tracking algorithms do not handle these challenges well and are prone to tracking-box drift, which significantly decreases their accuracy and robustness. In this paper, we design a combined detection and tracking network that performs a global instance search for the tracked target. We integrate deformable convolution into the Siamese network to handle target deformation during long-term tracking. In addition, we adopt guided anchoring in the RPN to generate sparser and more accurate proposals, which reduces background interference. We design a cascaded R-CNN that uses template information to filter proposals and refine their coordinates. Finally, we select the proposal with the highest confidence as the final tracking box. Experiments show that our method has stronger discriminative ability and produces more accurate tracking boxes than other advanced trackers, and that it performs well on several long-term tracking benchmarks.
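The last two steps of the pipeline, cascaded filtering of proposals followed by selection of the highest-confidence box, can be illustrated with a minimal sketch. It is not the implementation from the paper: the function name `select_tracking_box`, the per-stage score lists, and the thresholds are all hypothetical, the scores stand in for the output of template-conditioned R-CNN heads, and coordinate refinement is omitted.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def select_tracking_box(
    boxes: Sequence[Box],
    stage_scores: Sequence[Sequence[float]],
    stage_thresholds: Sequence[float],
) -> Optional[Box]:
    """Filter proposals stage by stage, then return the most confident survivor.

    stage_scores[s][i] is an assumed confidence for proposal i at cascade
    stage s; in a real tracker it would come from a learned head.
    """
    alive = list(range(len(boxes)))
    last_score = {i: 0.0 for i in alive}

    for scores, threshold in zip(stage_scores, stage_thresholds):
        # Keep only proposals that pass this stage's (assumed) threshold.
        alive = [i for i in alive if scores[i] >= threshold]
        for i in alive:
            last_score[i] = scores[i]

    if not alive:
        return None  # every proposal was rejected; the target may be absent

    best = max(alive, key=lambda i: last_score[i])
    return boxes[best]


if __name__ == "__main__":
    proposals = [(0, 0, 10, 10), (5, 5, 20, 20), (1, 1, 8, 8)]
    scores = [[0.6, 0.7, 0.2], [0.5, 0.9, 0.95]]
    print(select_tracking_box(proposals, scores, [0.5, 0.6]))
```

In this example the third proposal has the highest score at the last stage but is never selected, because it was already rejected at the first stage; this is the sense in which a cascade filters before the final choice is made.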