Uncertainty-Calibrated Pseudo-Labeling and Graph-Based Feature Alignment for Extremely Sparse Annotation in Cooperative 3D Detection
Abstract: Building accurate cooperative 3D detectors typically requires extensive, consistently aligned labels across agents and views, making annotation cost prohibitive in V2X settings. We investigate cooperative 3D object detection when supervision is nearly absent (only one or two labeled instances per scene), a regime in which naïve pseudo-labeling is brittle and models trained with standard regularization easily overfit. We present a semi-supervised cooperative learning framework featuring two new ingredients: (1) uncertainty-calibrated pseudo-labeling, where a Multi-level Guidance teacher model estimates localization and classification uncertainty to adaptively threshold pseudo boxes and weight losses, and (2) graph-based feature alignment across agents, which constructs a collaboration graph from spatial and confidence cues to distill relational knowledge (node and edge embeddings) from teacher to student. Strong Augmentation Alignment is applied to encourage robustness, while the graph distillation explicitly stabilizes cross-agent fusion under missing labels. Extensive evaluations on OPV2V and DAIR-V2X under one-shot and two-shot protocols demonstrate consistent gains over competitive semi-supervised cooperative baselines, reducing the reliance of collaborative perception on densely annotated multi-agent datasets.
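To make the first ingredient concrete, the following is a minimal illustrative sketch of uncertainty-aware pseudo-box selection and loss weighting. The abstract does not specify the thresholding or weighting rule, so everything here is an assumption for illustration: the `PseudoBox` fields, the function `select_pseudo_boxes`, the linear threshold schedule, the exponential weight, and all default values are hypothetical and are not the authors' method.

```python
import math
from dataclasses import dataclass


@dataclass
class PseudoBox:
    score: float    # teacher classification confidence in [0, 1]
    loc_std: float  # hypothetical teacher localization uncertainty (std, in metres)


def select_pseudo_boxes(boxes, base_thresh=0.5, max_loc_std=0.5, temperature=0.25):
    """Filter teacher boxes with an uncertainty-adjusted threshold and weight the survivors.

    Assumed rule (not from the paper): the score threshold rises linearly from
    base_thresh to 1.0 as loc_std goes from 0 to max_loc_std, and each kept box
    gets the loss weight score * exp(-loc_std / temperature).
    """
    kept = []
    for box in boxes:
        # More uncertain localization demands a higher classification score.
        ratio = min(box.loc_std / max_loc_std, 1.0)
        thresh = base_thresh + (1.0 - base_thresh) * ratio
        if box.score >= thresh:
            # Down-weight the loss contribution of less certain boxes.
            weight = box.score * math.exp(-box.loc_std / temperature)
            kept.append((box, weight))
    return kept


candidates = [
    PseudoBox(score=0.9, loc_std=0.05),  # confident and well localized
    PseudoBox(score=0.9, loc_std=0.60),  # confident but poorly localized
    PseudoBox(score=0.4, loc_std=0.00),  # well localized but low confidence
]
kept = select_pseudo_boxes(candidates)
```

Under these assumed settings only the first candidate survives: the second is rejected because its localization uncertainty pushes the threshold to 1.0, and the third falls below the base threshold. The returned weights would then scale the student's per-box loss terms.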