Keywords: Computer Vision, Synthetic Data Generation, Transfer Learning, Ethical AI, Height Prediction
TL;DR: We created synthetic images with a generative algorithm to pre-train and evaluate computer vision models for early childhood height prediction, tackling data scarcity and ethical issues.
Abstract: Computer vision approaches have succeeded in many image-based tasks, but early childhood height prediction for malnutrition detection remains challenging because publicly available training data are scarce. Building public datasets for training and benchmarking machine learning models on this task is difficult because of the sensitive nature of the images.
Although synthetic data have been employed in other data-scarce machine learning tasks, no synthetic dataset exists for predicting children's height.
In this work, we develop a novel generative algorithm to create synthetic images (including depth maps, segmentation maps, and keypoints) with non-photorealistic human figures, thereby providing an ethical and scalable solution to pre-train and evaluate computer vision models in a controlled setting. Our synthetic dataset models a wide variety of key real-world variables such as physical proportions, lighting, and posture.
We demonstrate the potential of our dataset in a transfer learning setting by showing that models pre-trained on our synthetic data outperform baseline approaches when applied to real-world prediction tasks.
Submission Number: 210