Keywords: End-to-End; Multiple Object Tracking; Transformer
Abstract: Although end-to-end multi-object trackers like MOTR enjoy the merits of simplicity, they suffer from a conflict between detection and association, which results in unsatisfactory convergence dynamics. While MOTRv2 partly addresses this problem, it requires an additional detector. In this work, we are the first to reveal that this conflict arises from unfair label assignment between detect and track queries, where detect queries are responsible for recognizing newly appearing targets and track queries associate them in the following frames. Based on this observation, we propose MOTRv3, which balances the label assignment using the proposed release-fetch supervision strategy. In this strategy, labels are first released for detection and then gradually fetched back for association. In addition, two further strategies, pseudo label distillation and track group denoising, are designed to strengthen the supervision for detection and association. Without an extra detector during inference, MOTRv3 achieves impressive performance across diverse benchmarks and shows the capability to scale up.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 815