SpikeMOT: Event-Based Multi-Object Tracking With Sparse Motion Features

Published: 01 Jan 2025, Last Modified: 04 Nov 2025 · IEEE Access 2025 · CC BY-SA 4.0
Abstract: Compared with conventional RGB cameras, the exceptional temporal resolution of event cameras allows them to capture rich information between frames, making them prime candidates for object tracking. In practice, however, despite these theoretical advantages, the body of work on event-based multi-object tracking (MOT) remains in its infancy, especially in real-world environments where events from complex backgrounds and camera motion can easily obscure the true target motion. To address these limitations, we introduce SpikeMOT, an event-based MOT framework employing spiking neural networks (SNNs) within a Siamese architecture. SpikeMOT extracts and associates sparse spatiotemporal features from event streams, enabling high-frequency object motion inference while preserving object identities. In parallel, an object detector provides updated spatial information about these objects at an equivalent frame rate. To evaluate the efficacy of SpikeMOT, we present DSEC-MOT, a meticulously constructed, real-world event-based MOT benchmark. This dataset features manually corrected annotations for objects undergoing severe occlusions, frequent intersections, and out-of-view scenarios commonly encountered in real-world applications. Extensive experiments on the DSEC-MOT and FE240hz datasets demonstrate SpikeMOT's superior tracking accuracy under demanding conditions, advancing the state of the art in event-based multi-object tracking.