Improving End-To-End Autonomous Driving with Synthetic Data from Latent Diffusion Models

Published: 22 Apr 2024 · Last Modified: 11 May 2024 · VLADR 2024 Poster · CC BY 4.0
Keywords: Generative AI, Semantic Segmentation, Autonomous Driving
TL;DR: Improving semantic segmentation and autonomous driving models with synthetic data that preserves the semantic details of the scene while modulating its background and style for different weather and lighting conditions.
Abstract: The autonomous driving field has seen notable progress in the performance of segmentation and planning models, driven by extensive datasets and innovative architectures. Yet these models often struggle when encountering rare subgroups, such as rainy conditions. Obtaining the large, diverse datasets needed to improve generalization on these subgroups is further hindered by the high cost and effort of manual annotation. To tackle this, we introduce SynDiff-AD, a novel data generation pipeline designed to synthesize realistic images for under-represented subgroups. Our system uses latent diffusion models (LDMs) with carefully constructed text prompts to generate images from existing dataset annotations while faithfully preserving their semantic structure. This crucially eliminates the need for manual labeling. By augmenting the original dataset with images generated by our pipeline, we improve the performance of advanced segmentation models such as Mask2Former and SegFormer by +1.4 mean Intersection over Union (mIoU). We also observe that end-to-end autonomous planning models such as AIM-2D and AIM-BEV improve their driving capabilities across diverse conditions by over 20%. Our analysis further shows that our method improves model performance overall, not only on the targeted subgroups.
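
The core idea, conditioning an LDM on existing segmentation annotations so that the synthetic image inherits its labels for free, can be sketched with off-the-shelf tools. The snippet below uses the Hugging Face diffusers library with a publicly available segmentation-conditioned ControlNet as a stand-in for SynDiff-AD's pipeline; the model IDs, file names, and prompt are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of annotation-conditioned image generation, assuming a
# ControlNet-style segmentation conditioner. Model IDs, file names, and
# the prompt are illustrative stand-ins, not the paper's setup.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Publicly available segmentation-conditioned ControlNet (ADE20K palette).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# Existing dataset annotation, colorized to the conditioner's palette.
# Because generation is conditioned on this map, the original labels
# remain valid for the synthetic image; no re-annotation is needed.
seg_map = Image.open("annotation_colorized.png")  # hypothetical file

# The text prompt steers the under-represented subgroup (weather/lighting).
prompt = "photorealistic urban driving scene, heavy rain, wet asphalt, overcast dusk"

image = pipe(
    prompt,
    image=seg_map,            # semantic layout the output must respect
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("synthetic_rainy_scene.png")
```

The generated image can then be paired with the unchanged annotation and appended to the training set, which is how the augmented datasets for the segmentation and planning experiments would be assembled under these assumptions.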
Supplementary Material: pdf
Submission Number: 15