CLDTracker: A Comprehensive Language Description for Visual Tracking

Mohamad Alansari, Sajid Javed, Iyyakutti Iyappan Ganapathi, Sara Alansari, Muzammal Naseer

Published: 01 Dec 2025, Last Modified: 05 Nov 2025 · Information Fusion · CC BY-SA 4.0

Abstract: Highlights
• Introduce a comprehensive bag of textual descriptions for visual object tracking (VOT).
• Provide a comprehensive bag of textual descriptions for six VOT datasets.
• Propose TTFUM to update target text features over time.
• Fuse visual and textual features using attention-based correlation.
• Evaluate CLDTracker on six benchmarks against 38 SOTA trackers.

External IDs: doi:10.1016/j.inffus.2025.103374