Frame Fusion with Vehicle Motion Prediction for 3D Object Detection

Published: 01 Jan 2024, Last Modified: 15 May 2025, Venue: ICRA 2024, License: CC BY-SA 4.0
Abstract: In LiDAR-based 3D detection, historical point clouds contain rich temporal information that helps future prediction. By the same reasoning, historical detections should contribute to future detections. In this paper, we propose a detection enhancement method, FrameFusion, which improves 3D object detection results by fusing historical detection frames. In FrameFusion, we "forward" historical frames to the current frame and apply weighted Non-Maximum Suppression (NMS) to the resulting dense bounding boxes to obtain a fused frame with merged boxes. To "forward" frames, we use vehicle motion models to estimate the future pose of each bounding box. Our method is flexible in the choice of motion model: we explore three motion models and show how the unicycle and bicycle models improve turning cases. On the Waymo Open Dataset, FrameFusion consistently improves the performance of various 3D detectors by about 2.0 vehicle LEVEL 2 APH with negligible latency, and slightly enhances the performance of the temporal fusion method MPPNet. We also conduct extensive experiments on motion model selection.
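To make the "forward" step concrete, the following is a minimal sketch of how a box pose could be propagated with a standard unicycle (constant speed, constant yaw rate) model. It is an illustration under stated assumptions, not the paper's implementation: the function name `forward_unicycle`, its parameters, and the assumption that each detection carries a speed and yaw-rate estimate are all hypothetical.

```python
import math


def forward_unicycle(x, y, heading, speed, yaw_rate, dt, eps=1e-6):
    """Propagate a 2D box pose by dt seconds with a unicycle model.

    Hypothetical helper for illustration. Assumes speed (m/s) and
    yaw_rate (rad/s) stay constant over the interval dt.
    Returns the new (x, y, heading).
    """
    if abs(yaw_rate) < eps:
        # Near-zero yaw rate: the motion degenerates to a straight line.
        new_x = x + speed * math.cos(heading) * dt
        new_y = y + speed * math.sin(heading) * dt
        return new_x, new_y, heading

    # Closed-form integration along a circular arc of radius speed / yaw_rate.
    new_heading = heading + yaw_rate * dt
    radius = speed / yaw_rate
    new_x = x + radius * (math.sin(new_heading) - math.sin(heading))
    new_y = y + radius * (math.cos(heading) - math.cos(new_heading))
    return new_x, new_y, new_heading


if __name__ == "__main__":
    # A box detected 0.1 s ago, moving at 10 m/s while turning at 0.5 rad/s.
    print(forward_unicycle(0.0, 0.0, 0.0, 10.0, 0.5, 0.1))
```

In a pipeline of the kind the abstract describes, boxes forwarded this way from several past frames would be pooled with the current detections and then merged by weighted NMS. A constant-velocity model corresponds to the straight-line branch alone, which hints at why a turning-aware model can help in turning cases.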