Abstract: Acquiring annotated training data for large-scale supermarket product recognition is challenging and often infeasible because assortments are vast and dynamic, containing tens of thousands of products. To address this problem, we propose a highly scalable data synthesis pipeline that automatically produces realistic, domain-aligned training data for on-shelf product detectors and classifiers. Additionally, we present three new publicly available synthetic datasets generated by our pipeline. Among them is the SPS8k dataset, featuring 16,224 shelf images with 1,981,967 instance-level bounding boxes and GTIN class labels for 8,112 grocery products. Finally, in a comprehensive ablation study, we evaluate the effects of synthetic-to-real domain translation on model performance and demonstrate its effectiveness.