Abstract: Recent knowledge distillation (KD) methods for 3D object detection often rely on costly LiDAR or multi-camera data. We focus on monocular camera-based 3D detectors, where missing 3D cues cause large feature gaps between teacher and student. To address this, we propose region-aware KD, which aligns object features by matching their scales and pyramid levels. We also introduce a probabilistic distribution to weight regions by importance. Applied to MonoRCNN++ and MonoDETR on the KITTI and Waymo datasets, our approach reduces complexity while maintaining strong performance with a lightweight backbone. Compared to recent KD methods, ours is both more effective and more efficient.
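The abstract does not say what the probabilistic distribution is or how the loss is computed, so the following is only a minimal sketch of the general idea of region-weighted feature distillation. It assumes a 2D Gaussian centered on an object box as the weighting distribution and a squared feature gap as the distance; both choices, along with the function names `gaussian_region_weights` and `region_weighted_kd_loss`, are hypothetical and not taken from the paper. The scale and pyramid-level matching step is not modeled here.

```python
import numpy as np


def gaussian_region_weights(height, width, box):
    """Return an (height, width) weight map that sums to 1.

    box = (x1, y1, x2, y2) in feature-map coordinates. The Gaussian shape
    is an illustrative assumption, not the paper's stated distribution.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Standard deviations scale with the box size; the factor 1/2 is arbitrary.
    sx = max((x2 - x1) / 2.0, 1e-6)
    sy = max((y2 - y1) / 2.0, 1e-6)
    ys, xs = np.mgrid[0:height, 0:width]
    w = np.exp(-0.5 * (((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2))
    return w / w.sum()


def region_weighted_kd_loss(student, teacher, weights):
    """Weighted squared feature gap; features have shape (C, H, W)."""
    per_pixel = ((student - teacher) ** 2).mean(axis=0)  # shape (H, W)
    return float((weights * per_pixel).sum())


# Toy usage: the student features differ from the teacher by a constant offset.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16, 16))
student = teacher + 0.1
weights = gaussian_region_weights(16, 16, box=(4, 4, 10, 12))
loss = region_weighted_kd_loss(student, teacher, weights)
```

Because the weights are normalized to sum to 1, the loss is a weighted average of the per-location feature gap, so locations near the assumed object center contribute most and background locations contribute little.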
External IDs: doi:10.1016/j.icte.2025.04.012