Abstract: Highlights
• MCFTNet is a multi-branch RGB-TIR fusion network enabling robust RGBT tracking.
• CMDIF performs cross-modal fusion to mine RGB-TIR complementarity for tracking.
• MAAA adaptively aggregates multimodal features to improve tracking perception.
• Hybrid attention improves feature extraction for accurate target localization.
• Advanced results are achieved on GTOT, RGBT234, and LasHeR benchmarks.
External IDs:dblp:journals/asc/LiuLSMLW25