BEVSync: Asynchronous Data Alignment for Camera-based Vehicle-Infrastructure Cooperative Perception Under Uncertain Delays

Published: 01 Jan 2025, Last Modified: 16 Sept 2025AAAI 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Vehicle-to-infrastructure (V2I) cooperative perception systems can enhance the sensing abilities of autonomous vehicles. Existing V2I solutions often consider LiDARs devices instead of cameras, the most prevalent sensors with low cost and wide installation. In addition, a major challenge that has been underexplored is the time asynchrony between image frames from different sources. This asynchrony arises because of clock differences, varying times involved in data processing and transmission, causing uncertain delays that complicate data alignment and potentially reduce perception accuracy. We propose BEVSync, a camera-based V2I cooperative perception system that adaptively aligns frames from the ego-vehicle and infrastructure by compensating for motion deviations. Specifically, we develop an extractor-compensator model to extract and predict perceptual features using historical frames, thereby smoothing out the data misalignment. Experiments on the real-world dataset DAIR-V2X show that our approach surpasses existing methods in terms of performance and robustness.
Loading