Impacts of Synthetically Generated Data on Trackformer-based Multi-Object Tracking

Matthew Lee, Clayton A. Harper, William Flinchbaugh, Eric C. Larson, Mitchell A. Thornton

Published: 2023, Last Modified: 18 Jun 2024, AIPR 2023, CC BY-SA 4.0

Abstract: As the scale of deep learning tasks continues to expand, the generation of sufficiently large datasets has become increasingly costly and time-consuming. In particular, the resource demands of manual annotation for computer vision tasks such as multi-object tracking have contributed to the growing popularity of synthetic computer vision datasets created through simulation engines. Simulations facilitate the creation of automatically annotated datasets with complete control over environmental variables that are typically uncontrollable in real-world scenarios. Leveraging this control, we generate multi-object tracking datasets that each isolate a specific environmental variable, including subject scale, camera movement, and lighting changes. Our evaluation focuses on the TrackFormer architecture, an end-to-end, transformer-based solution designed for multi-object tracking. The resulting insights into how each environmental variable affects multi-object tracking performance can guide future architectural improvements. Furthermore, our data generation process can serve as a template for evaluating deep learning architectures in simulated environments.
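
As a rough illustration of what isolating one environmental variable per dataset could look like, the Python sketch below enumerates scene configurations that each differ from a baseline in exactly one setting. It is not the authors' pipeline: `SceneConfig`, `one_factor_variants`, the field names, and all numeric values are hypothetical and are not taken from the paper or from any simulation engine.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SceneConfig:
    """Hypothetical scene settings; names and units are illustrative only."""
    subject_scale: float = 1.0    # relative on-screen size of tracked subjects
    camera_speed: float = 0.0     # magnitude of camera movement (0 = static)
    lighting_change: float = 0.0  # magnitude of lighting variation over time


def one_factor_variants(baseline, sweeps):
    """Return (variable_name, config) pairs, each changing one field only."""
    variants = []
    for name, values in sweeps.items():
        for value in values:
            # replace() copies the baseline and overrides a single field,
            # so every other variable stays at its baseline value.
            variants.append((name, replace(baseline, **{name: value})))
    return variants


baseline = SceneConfig()
sweeps = {
    "subject_scale": [0.5, 2.0],    # assumed values, not from the paper
    "camera_speed": [1.0],          # assumed value
    "lighting_change": [0.5, 1.0],  # assumed values
}
variants = one_factor_variants(baseline, sweeps)

for name, config in variants:
    print(name, config)
```

Each resulting configuration would correspond to one generated dataset, so a change in tracking performance relative to the baseline can be attributed to the single variable that was altered.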