Robust Multi-Agent Collaborative Perception via Spatio-Temporal Awareness

Published: 01 Jan 2025 · Last Modified: 07 Oct 2025 · IEEE Trans. Circuits Syst. Video Technol. 2025 · CC BY-SA 4.0
Abstract: As an emerging application in autonomous driving, multi-agent collaborative perception has recently received significant attention. Despite promising advances from previous efforts, several unavoidable challenges that cause performance bottlenecks remain, including the single-frame detection dilemma, communication redundancy, and defective collaboration processes. To this end, we propose SCOPE++, a versatile collaborative perception framework that aggregates spatio-temporal information across on-road agents to tackle these issues. SCOPE++ comprises four components that enable robust collaboration while seeking a reasonable trade-off between perception performance and communication bandwidth. First, we devise a context-aware information aggregation to capture valuable semantic cues from the temporal context and enhance the ego agent's current local representation. Second, an exclusivity-aware sparse communication is introduced to filter perceptually unnecessary information from collaborators and transmit only features complementary to the ego agent. Third, we present an importance-aware cross-agent collaboration to flexibly incorporate semantic representations of spatially critical locations across agents. Finally, a contribution-aware adaptive fusion is designed to integrate multi-source representations based on their dynamic contributions. Our framework is evaluated on multiple LiDAR-based collaborative detection datasets spanning real-world and simulated scenarios, and comprehensive experiments show that SCOPE++ outperforms state-of-the-art methods on all datasets.
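To make the pipeline concrete, the sketch below illustrates two of the ideas named in the abstract: confidence-masked sparse communication (transmitting only cells complementary to the ego agent) and contribution-weighted fusion of multi-agent bird's-eye-view features. This is a minimal NumPy sketch under our own assumptions, not the paper's implementation; all function names, shapes, and thresholds are hypothetical.

```python
import numpy as np

def sparse_message(features, confidence, ego_confidence, thresh=0.5):
    """Illustrative sparse communication: keep only BEV cells where the
    collaborator is confident AND the ego agent is not, i.e. cells that
    carry complementary information. Threshold is an assumption."""
    mask = (confidence > thresh) & (ego_confidence <= thresh)
    # Zero out non-selected cells; in practice only masked cells are sent.
    return np.where(mask[..., None], features, 0.0), mask

def adaptive_fuse(ego_feat, collab_feats, contributions):
    """Illustrative contribution-aware fusion: softmax-normalize per-agent
    contribution scores and take a weighted sum of BEV feature maps."""
    w = np.exp(contributions) / np.exp(contributions).sum()
    stacked = np.stack([ego_feat] + collab_feats)   # (agents, H, W, C)
    return (w[:, None, None, None] * stacked).sum(axis=0)

# Toy example: one collaborator, a 4x4 BEV grid with 8-channel features.
rng = np.random.default_rng(0)
ego = rng.normal(size=(4, 4, 8))
collab = rng.normal(size=(4, 4, 8))
msg, mask = sparse_message(collab, rng.random((4, 4)), rng.random((4, 4)))
fused = adaptive_fuse(ego, [msg], contributions=np.array([1.0, 0.5]))
print(fused.shape)   # fused BEV map, same shape as the ego feature map
print(mask.mean())   # fraction of cells transmitted (bandwidth proxy)
```

The mask's mean acts as a crude bandwidth proxy: lowering the fraction of transmitted cells trades perception quality for communication cost, which is the trade-off the framework targets.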