UTFC-DiffTracker: Short- and Long-Range Temporal Feature Consistency Diffusion for Underwater Object Tracking
Keywords: Underwater Object Tracking, Feature Correction, Temporal Consistency
TL;DR: UTFC-DiffTracker is a feature-aligned consistency framework that uses short- and long-range temporal feature alignment to address feature degradation and inconsistency in underwater object tracking.
Abstract: Underwater object tracking (UOT) plays a significant role in marine animal protection, underwater search and rescue, and maritime security, yet it faces distinctive challenges in complex environments, including color distortion, low visibility, similar distractors, and occlusion. Existing frame-level trackers employ enhancement-based or adaptation-based strategies but process frames independently, which leads to inconsistent feature styles and weakened temporal correlations. Video-level trackers leverage autoregressive mechanisms for temporal consistency but still struggle with persistent feature degradation and tracking drift in underwater environments. To overcome these limitations, this paper proposes UTFC-DiffTracker, a feature- and video-level tracker that achieves spatiotemporal feature alignment. The framework has two core components. Short-Range Temporal Feature Consistency combines diffusion-based correction with dynamic style memory retention to resolve underwater feature degradation while maintaining temporal coherence. Long-Range Temporal Feature Consistency enhances discrimination against distractors and occlusion through wavelet decomposition that separates historical tokens into stable structures and transient details. UTFC-DiffTracker achieves state-of-the-art performance on four UOT benchmarks while preserving semantic integrity and ensuring tracking reliability.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 17739
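Illustrative sketch: the abstract describes separating historical tokens into stable structures and transient details via wavelet decomposition, but does not specify the wavelet, the number of levels, or the token layout. The following is a minimal sketch under assumed choices, not the authors' implementation: a one-level Haar transform applied along the time axis of a `(T, D)` token array, where the low-pass band stands in for "stable structures" and the high-pass band for "transient details". The function names `haar_split` and `haar_merge` are hypothetical.

```python
import numpy as np


def haar_split(tokens):
    """One-level Haar transform along the time axis of a (T, D) token array.

    Returns (low, high), each of shape (T // 2, D). The low band averages
    neighbouring time steps (assumed here to model stable structure); the
    high band holds their differences (assumed to model transient detail).
    """
    tokens = np.asarray(tokens, dtype=np.float64)
    if tokens.ndim != 2 or tokens.shape[0] % 2 != 0:
        raise ValueError("expected a (T, D) array with even T")
    even, odd = tokens[0::2], tokens[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return low, high


def haar_merge(low, high):
    """Invert haar_split, recovering the original (T, D) token array."""
    even = (low + high) / np.sqrt(2.0)
    odd = (low - high) / np.sqrt(2.0)
    out = np.empty((2 * low.shape[0], low.shape[1]), dtype=np.float64)
    out[0::2] = even
    out[1::2] = odd
    return out


# Usage: 4 historical time steps, 3 feature channels (toy values).
history = np.arange(12, dtype=np.float64).reshape(4, 3)
stable, transient = haar_split(history)
restored = haar_merge(stable, transient)
```

Because the Haar transform is orthogonal, the split loses no information: merging the two bands recovers the original tokens, so each band can in principle be processed separately before recombination.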