Abstract: We propose a novel general-purpose object tracking system named Self-supervised Centric Open-set Object Tracking (SCOOT). SCOOT comprises three components: a self-supervised appearance model, a fusion module that combines textual and visual features, and an object association algorithm based on reconstruction and observation. Together, these components use language cues to improve open-set object tracking in real-world scenarios.