Abstract: Multi-object tracking in traffic videos is a crucial research area with considerable potential to improve traffic monitoring accuracy and road safety through advanced machine learning algorithms. However, existing datasets for multi-object tracking in traffic videos often contain few instances or focus on a single class, and so fail to capture the challenges of complex traffic scenarios. To address this gap, we introduce TrafficMOT, an extensive dataset designed to cover diverse and complex traffic situations. To validate the complexity and challenges presented by TrafficMOT, we conducted comprehensive empirical studies in three settings: fully supervised, semi-supervised, and zero-shot, the last using the recent foundation model Tracking Anything Model (TAM). The experimental results highlight the inherent complexity of this dataset, emphasising its value in driving advances in traffic monitoring and multi-object tracking.
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Systems] Transport and Delivery
Relevance To Conference: This work contributes to the fields of multimedia and multimodal processing by presenting a rich dataset tailored for advancing multi-object tracking (MOT) technologies in dense traffic environments. It addresses a critical gap in existing datasets, which often lack diversity in scenarios, object types, and environmental conditions and thereby limit the development and evaluation of robust MOT algorithms. By incorporating a wide range of complex traffic scenarios, including varying weather conditions, times of day, and high-density traffic flows, this work facilitates the development of algorithms that can effectively integrate and analyse multimodal data sources (such as video, radar, and lidar inputs) to achieve accurate, real-time tracking of multiple objects.
Submission Number: 2787