Robust unsupervised visual tracking via image-to-video identity knowledge transferring

Bin Kang, Zongyu Wang, Dong Liang, Tianyu Ding, Songlin Du

Published: 2026, Last Modified: 11 Nov 2025. Pattern Recognit. 2026. CC BY-SA 4.0

Abstract: Highlights
• This paper presents Autonomous Unsupervised Identification Tracking (AUDI-T), a novel approach that leverages image-level target identity information to significantly reduce the reliance on extensive video annotations for target identity. To the best of our knowledge, this is the first effort to incorporate target identification into unsupervised visual tracking (UVT).
• This paper employs a meta-training strategy that synergizes two aspects: key-frame identity knowledge alignment and all-frame identity knowledge regularization. Integrating them into an image-to-video transfer framework not only improves immediate target identification accuracy but also ensures robust tracking performance over longer sequences.
• This paper conducts extensive experiments to evaluate the method across various backbones, training strategies, and tracking frameworks. The results consistently demonstrate the versatility and superiority of AUDI-T over existing methods.

External IDs: dblp:journals/pr/KangWLDD26