Conditioned Generative AI for Synthetic Training of 6D Object Pose Detection

Published: 2025, Last Modified: 27 Sept 2025 | VISIGRAPP (2): VISAPP 2025 | CC BY-SA 4.0
Abstract: In this paper, we propose a method to generate synthetic training images for a computer vision task more complex than image classification, namely 6D object pose detection. We demonstrate that conditioned diffusion models can generate an effectively unlimited number of training images for an object pose detection model targeting a custom object type. Moreover, we investigate the potential of automatically filtering out poorly generated images, which increases the quality of the image dataset, and we show the importance of finetuning the trained model on a limited number of real-world images to bridge the remaining sim2real domain gap. We demonstrate our pipeline on the use case of parcel box detection for the automation of delivery vans. All code is publicly available on our GitLab: https://gitlab.com/EAVISE/avc/generative-ai-synthetic-training-pose-detection.
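For readers new to the task, a 6D pose combines a 3D rotation with a 3D translation of the object relative to the camera. The sketch below is not taken from the paper or its repository; it is a minimal illustration, assuming a box-shaped object and an ideal pinhole camera, of how such a pose maps the eight corners of a parcel-like cuboid to pixel coordinates. All function names, the box dimensions, and the camera parameters (`focal`, `cx`, `cy`) are hypothetical.

```python
import math


def rotation_z(yaw):
    """3x3 rotation matrix for a rotation of `yaw` radians about the z-axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def box_corners(width, height, depth):
    """Eight corners of a cuboid centred at the object origin (object frame)."""
    hx, hy, hz = width / 2, height / 2, depth / 2
    return [(sx * hx, sy * hy, sz * hz)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


def transform(point, rotation, translation):
    """Apply the 6D pose: rotate the point, then translate it into the camera frame."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point_cam, focal, cx, cy):
    """Pinhole projection of a camera-frame point (z > 0) to pixel coordinates."""
    x, y, z = point_cam
    return (focal * x / z + cx, focal * y / z + cy)


def project_box(size, rotation, translation, focal, cx, cy):
    """Project all eight cuboid corners under the given pose (hypothetical helper)."""
    return [
        project(transform(corner, rotation, translation), focal, cx, cy)
        for corner in box_corners(*size)
    ]


if __name__ == "__main__":
    # Assumed example values: a 1 m cube, 2 m in front of the camera, no rotation.
    pixels = project_box(
        size=(1.0, 1.0, 1.0),
        rotation=rotation_z(0.0),
        translation=(0.0, 0.0, 2.0),
        focal=500.0, cx=320.0, cy=240.0,
    )
    for u, v in pixels:
        print(f"{u:.1f}, {v:.1f}")
```

The three rotation angles and three translation components are the six degrees of freedom a pose detector has to estimate; a projection of this kind is commonly used to visualise a predicted pose as a wireframe box over the image.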