Multi-object tracking with scale-aware transformer and enhanced association strategy

Published: 01 Jan 2025, Last Modified: 08 Apr 2025, Multim. Syst. 2025, CC BY-SA 4.0
Abstract: Multiple Object Tracking (MOT) is an important task in computer vision. Existing MOT methods degrade easily when objects vary widely in scale or are heavily occluded during tracking. To improve tracking of multi-scale objects, we propose a Scale-aware Transformer module that integrates multi-scale global information into feature maps of various scales to augment their representation, improving both the detection and ID embedding tasks and thereby aiding data association without heavy computation cost. To address false associations caused by occluded objects, we propose an enhanced association strategy that adds a detection recovery mechanism and detection-confidence hierarchical matching to JDE-based association, together with a new CIoU-Embedding similarity matrix for matching. Results on the MOT16, MOT17, and MOT20 benchmarks indicate that our method achieves competitive performance among JDE-based and Transformer-based methods, e.g., 77.5 MOTA and 74.9 IDF1 on MOT17 while maintaining a tracking speed of 19.5 FPS.
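The abstract does not state how the CIoU and embedding terms are combined, so the following is only an illustrative sketch, not the authors' formulation. It computes the standard Complete-IoU (CIoU) between boxes given as (x1, y1, x2, y2), the cosine similarity between ID embeddings, and fuses them with a weighted sum. The function names and the weight `lam` are hypothetical choices made for this example.

```python
import math

def ciou(a, b, eps=1e-9):
    """Standard Complete-IoU between two boxes in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    iou = inter / (wa * ha + wb * hb - inter + eps)
    # Squared distance between box centers
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2 + eps
    # Aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v

def cosine(u, v, eps=1e-9):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv + eps)

def similarity_matrix(tracks, detections, lam=0.5):
    """Hypothetical fusion: lam * CIoU + (1 - lam) * cosine similarity.

    tracks and detections are lists of (box, embedding) pairs.
    The weighted sum is an assumption, not the paper's stated design.
    """
    return [[lam * ciou(tb, db) + (1 - lam) * cosine(te, de)
             for db, de in detections]
            for tb, te in tracks]

tracks = [((0, 0, 10, 10), [1.0, 0.0]), ((20, 20, 30, 30), [0.0, 1.0])]
detections = [((1, 1, 11, 11), [0.9, 0.1]), ((21, 19, 31, 29), [0.1, 0.9])]
sim = similarity_matrix(tracks, detections)
```

In such a scheme the matrix would typically be converted to a cost (for example, by negating it) and passed to an assignment solver. Unlike plain IoU, CIoU remains informative for non-overlapping boxes because the center-distance term still varies.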