Abstract: Multi-Object Tracking (MOT) and Person Search both require localizing and identifying specific targets in raw image frames. Existing methods fall into two categories: two-step strategies and end-to-end strategies. Two-step approaches achieve high accuracy but are computationally expensive, while end-to-end methods are more efficient but less accurate. In this paper, we analyze the gap between the two-step and end-to-end strategies and propose a simple yet effective end-to-end framework based on knowledge distillation. The proposed framework is conceptually simple and can readily benefit from external datasets. Experimental results demonstrate that our model performs competitively with more sophisticated two-step and end-to-end methods on both multi-object tracking and person search.