Sparse-FS3D: A Sparse-Feature Fusion Approach for Diffusion-Enhanced Few-Shot 3D Object Detection in Outdoor Scenes

Sandesh Rajendra Jain, Heesang Han, A. Lynn Abbott, Abhijit Sarkar

Published: 2025, Last Modified: 14 Jun 2026, Venue: ITSC 2025, License: CC BY-SA 4.0
Abstract: Few-shot 3D object detection (FS3D) has gained increasing attention for its potential to handle limited annotated data in 3D perception tasks. However, existing methods have focused predominantly on indoor environments with dense point clouds, leaving outdoor settings with sparse, large-scale point clouds largely underexplored. In this paper, we introduce Sparse-FS3D, a novel approach to few-shot 3D object detection tailored for outdoor LiDAR scenes. Our method addresses the challenges of sparse point clouds through a combination of adaptive voxelization, Krylov subspace-based farthest point sampling (FPS), and cross-attention feature fusion. By applying finer voxel sizes to dense regions and coarser ones to sparse regions, we reduce the computational load while preserving critical features. Furthermore, we introduce a diffusion-based generative augmentation technique that produces realistic object point clouds for novel classes, offsetting the limited sample availability in the few-shot split. We establish a new benchmark for outdoor FS3D on the KITTI and nuScenes datasets, demonstrating that our approach significantly improves mAP in challenging real-world settings with fewer labeled samples. Our results indicate that combining adaptive voxelization, specialized sampling, and diffusion-based generative augmentation provides an effective framework for few-shot learning and scalable outdoor 3D object detection, bridging the gap between sparse data and high-performance detection in autonomous driving and other large-scale 3D perception tasks. Our code is publicly available at https://github.com/sandeshrjain/sparse-fs3d.
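The abstract's idea of finer voxels in dense regions and coarser voxels in sparse regions can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the two-level scheme, the function name `adaptive_voxelize`, and the parameters `coarse`, `fine`, and `dense_threshold` are all assumptions chosen for illustration. The sketch counts points per coarse cell, then re-voxelizes only the well-populated cells at the finer size and returns one centroid per occupied voxel.

```python
import numpy as np


def adaptive_voxelize(points, coarse=2.0, fine=0.5, dense_threshold=8):
    """Return one centroid per occupied voxel (hypothetical two-level scheme).

    points: (N, 3) array of xyz coordinates.
    coarse, fine: voxel edge lengths; coarse should be an integer multiple
        of fine so that fine voxels nest inside coarse cells.
    dense_threshold: a coarse cell with at least this many points is
        treated as dense and subdivided into fine voxels.
    All parameter names and defaults are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64)

    # Assign every point to a coarse cell and count the points per cell.
    coarse_idx = np.floor(points / coarse).astype(np.int64)
    _, inverse, counts = np.unique(
        coarse_idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # inverse shape varies across NumPy versions
    is_dense = counts[inverse] >= dense_threshold

    centroids = []
    # Dense points are binned at the fine size, sparse points at the coarse size.
    for mask, size in ((is_dense, fine), (~is_dense, coarse)):
        subset = points[mask]
        if len(subset) == 0:
            continue
        idx = np.floor(subset / size).astype(np.int64)
        _, inv, cnt = np.unique(
            idx, axis=0, return_inverse=True, return_counts=True
        )
        inv = inv.reshape(-1)
        sums = np.zeros((len(cnt), 3))
        np.add.at(sums, inv, subset)  # accumulate coordinates per voxel
        centroids.append(sums / cnt[:, None])

    return np.vstack(centroids) if centroids else np.empty((0, 3))
```

Under these assumptions, a tight cluster of LiDAR returns on a nearby object keeps several centroids, while a handful of distant returns in one coarse cell collapses to a single centroid, which is the trade-off between detail and computational load that the abstract describes.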