Abstract: In this paper, we address the problem of motion modeling in the Multi-Object Tracking (MOT) task. We present DVAE-UMOT, an unsupervised probabilistic motion model and associated estimation algorithm based on a dynamical variational autoencoder (DVAE). The DVAE is a latent-variable deep generative model designed to capture long-term and non-linear temporal dependencies in sequential data. Thanks to the sequential modeling capacity of the DVAE, DVAE-UMOT can maintain long-term tracks based purely on motion cues under the tracking-by-detection paradigm and can generate reasonable bounding boxes when detections are missing. Experimental results show that our model is particularly good at handling object disappearance and identity switches caused by long-term occlusion and unstable detections. Finally, DVAE-UMOT is shown experimentally to compete well with, and even surpass, two state-of-the-art probabilistic MOT motion models. Code and data are publicly available.