Keywords: collaborative perception, autonomous driving
Abstract: Collaborative perception expands the perception range by sharing information among agents, effectively improving task performance. Immutable heterogeneity poses a significant challenge in collaborative perception, as participating agents may employ different and fixed perception models. This leads to domain gaps in the intermediate features shared among agents, consequently degrading collaborative performance.
Aligning the features of all agents to a common representation can eliminate domain gaps with low training cost. However, existing methods designate the representation of one specific agent as the common representation, which makes proper alignment difficult for agents whose representations differ substantially from that agent's.
This paper proposes NegoCollab, a heterogeneous collaboration method based on a negotiated common representation. It transforms each modality's features in both directions between the local representation space and the common representation space through a paired sender and receiver, thereby eliminating domain gaps. The common representation in NegoCollab is negotiated from the local representations of each modality's agents by a negotiator introduced during training, which reduces the inherent domain discrepancy between the common representation and each local representation. Furthermore, to better align local representations with the multimodal common representation, we introduce a structural alignment loss and a pragmatic alignment loss alongside the conventional distribution alignment loss during supervised training, enabling comprehensive knowledge distillation from the common representation to the senders.
Experimental results demonstrate that NegoCollab significantly outperforms existing common-representation-based collaboration methods. The negotiation-based mechanism for acquiring common representations provides more diverse and reliable alternatives for establishing the common representations required in heterogeneous collaborative perception.
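The sender-receiver idea described in the abstract can be illustrated with a toy sketch. This is not NegoCollab's implementation: the learned negotiator is replaced here by a plain element-wise mean, the sender and receiver are one-dimensional affine maps fitted by least squares, and every name (`fit_affine`, `apply_affine`, `gap`, `lidar_a`, `lidar_b`, `camera`) is hypothetical. Under those assumptions, the sketch shows a feature passing from one agent's local space through the common space into another agent's local space, and compares the worst-case gap to a negotiated representation with the worst-case gap to a designated agent's representation.

```python
# Toy sketch only: all names are hypothetical, and the "negotiator" is a
# plain element-wise mean standing in for the learned module in the paper.

def fit_affine(src, dst):
    """Least-squares fit of dst as w * src + b for 1-D feature lists."""
    n = len(src)
    mean_s = sum(src) / n
    mean_d = sum(dst) / n
    var_s = sum((s - mean_s) ** 2 for s in src)
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(src, dst))
    w = cov / var_s
    return w, mean_d - w * mean_s


def apply_affine(params, feats):
    """Apply a fitted (w, b) pair to a feature list."""
    w, b = params
    return [w * f + b for f in feats]


def gap(a, b):
    """Mean absolute difference between two feature lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# One underlying scene observed by three fixed, heterogeneous encoders.
# Each encoder is modelled as a different affine map (the "domain gap").
scene = [0.0, 1.0, 2.0, 3.0]
local = {
    "lidar_a": [1.0 * s + 0.0 for s in scene],
    "lidar_b": [2.0 * s + 1.0 for s in scene],
    "camera": [-1.0 * s + 4.0 for s in scene],
}

# Stand-in negotiation: the common representation is the element-wise mean
# of all local representations (an assumption made for this sketch).
names = list(local)
common = [sum(local[n][i] for n in names) / len(names)
          for i in range(len(scene))]

# Paired sender (local -> common) and receiver (common -> local) per agent.
senders = {n: fit_affine(local[n], common) for n in names}
receivers = {n: fit_affine(common, local[n]) for n in names}

# lidar_b shares a feature; camera decodes it into its own local space.
shared = apply_affine(senders["lidar_b"], local["lidar_b"])
received = apply_affine(receivers["camera"], shared)
roundtrip_error = gap(received, local["camera"])

# Worst-case raw gap: negotiated common vs. designating lidar_a as common.
worst_negotiated = max(gap(local[n], common) for n in names)
worst_designated = max(gap(local[n], local["lidar_a"]) for n in names)

print("round-trip error:", roundtrip_error)
print("worst gap to negotiated common:", worst_negotiated)
print("worst gap to designated lidar_a:", worst_designated)
```

In this toy setting the negotiated representation sits between the local ones, so no agent is as far from it as the most distant agent is from the designated one. The method itself learns the negotiator, senders, and receivers with the distribution, structural, and pragmatic alignment losses named above rather than fitting closed-form affine maps.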
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 19938