Abstract: In Multi-Object Tracking (MOT), the mainstream approach is the tracking-by-detection paradigm, whose performance depends on the accuracy of the detector, the completeness of feature extraction, and the quality of the data association algorithm. Most existing pedestrian re-identification methods are based on convolutional neural networks (CNNs), which struggle to balance the local and global features of pedestrians. Because current detectors are already highly advanced, this paper concentrates on the other two components and proposes BOS-SORT, a multi-object pedestrian tracking algorithm based on full-scale feature fusion. The algorithm uses the proposed feature extraction network, Better Omni-Scale Net (BOSNet), to capture both global and local appearance information, reducing the loss of appearance information. It also employs an improved association matching algorithm, AveSort, which combines IoU and appearance features for initial data association and smooths target motion states to reduce matching errors in high-similarity scenarios. The BOS-SORT system integrates these methods and aligns global trajectories closely with real trajectories. Experimental results show that it achieves state-of-the-art Higher Order Tracking Accuracy (HOTA) scores of 66.2 and 65.3 on the MOT17 and MOT20 datasets, respectively.
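To illustrate what combining IoU with appearance features in an initial association step can look like, the following is a minimal sketch and not the AveSort algorithm itself. The equal weighting `weight=0.5`, the gating threshold `max_cost=0.7`, the greedy assignment, and all function names are assumptions made for illustration; practical trackers commonly use Hungarian matching instead of a greedy pass.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity of two appearance embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat a zero embedding as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def fused_cost(track, det, weight=0.5):
    """Weighted sum of IoU distance and appearance distance (assumed form)."""
    (t_box, t_feat), (d_box, d_feat) = track, det
    return weight * (1.0 - iou(t_box, d_box)) + (1.0 - weight) * cosine_distance(t_feat, d_feat)


def greedy_match(tracks, dets, max_cost=0.7):
    """Pair tracks with detections in order of increasing fused cost."""
    candidates = sorted(
        (fused_cost(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(dets)
    )
    used_tracks, used_dets, matches = set(), set(), []
    for cost, ti, di in candidates:
        if cost > max_cost:
            break  # remaining candidates are all above the gate
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


# Each entry is (bounding box, appearance embedding); values are made up.
tracks = [((0, 0, 10, 10), [1.0, 0.0]), ((20, 20, 30, 30), [0.0, 1.0])]
dets = [((21, 21, 31, 31), [0.0, 1.0]), ((1, 1, 11, 11), [1.0, 0.0])]
matches = greedy_match(tracks, dets)
```

In this toy input each track overlaps one detection and shares its embedding, so the fused cost pairs track 0 with detection 1 and track 1 with detection 0, while the cross pairs are rejected by the gate.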