Abstract: Highlights
• Distant Temporal Dependencies capture long-range temporal dependencies.
• Saliency-guided Relevance Weighting emphasizes salient frames and regions.
• DINOv2 Perception Enhancement improves cross-view semantic feature coherence.
External IDs: dblp:journals/ijon/LiHWHHJT26