Keywords: Instance Augmentation, 3D Perception, Autonomous Driving
Abstract: 3D object detection is essential for autonomous driving but remains limited by the long-tail distribution of real-world data.
Instance-level augmentation methods, whether based on copy-paste or asset rendering, are typically restricted to LiDAR and offer only modest variation with limited scene context.
We introduce MAPLE, a training-free pipeline for multimodal augmentation that generates synchronized RGB-LiDAR pairs.
Objects are inserted through context-aware inpainting in the image domain, and paired pseudo-LiDAR is reconstructed via depth estimation.
To ensure cross-modal plausibility, MAPLE incorporates semantic and geometric verification modules that filter inconsistent generations.
We further propose a success-rate evaluation that quantifies error reduction across verification stages, providing a principled measure of pipeline reliability.
On the nuScenes benchmark, MAPLE consistently improves both detection and segmentation in multimodal and LiDAR-only settings.
Code will be released publicly.
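The paired pseudo-LiDAR reconstruction described above can be illustrated with a standard pinhole back-projection: given an estimated depth map and camera intrinsics, each valid pixel is lifted to a 3D point in the camera frame. This is a minimal sketch of the general technique, not the released MAPLE code; the function name and interface are assumptions.

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a depth map into a camera-frame point cloud.

    depth: (H, W) metric depth in meters; K: (3, 3) camera intrinsics
    [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    Returns an (N, 3) array of points for pixels with positive depth.
    (Illustrative sketch only; not the paper's implementation.)
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel coordinate grid
    valid = depth > 0                               # keep pixels with valid depth
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]          # x = (u - cx) * z / fx
    y = (v[valid] - K[1, 2]) * z / K[1, 1]          # y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

In a full pipeline, the resulting camera-frame points would still need an extrinsic transform into the LiDAR frame before being merged with the real sweep.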
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 12953