Abstract: Age estimation is an active and challenging research topic in the computer vision community. Several facial datasets annotated with age and gender attributes have become available in recent years. However, the statistics of these datasets reveal unbalanced label distributions, which inevitably introduce bias during model training. In this work, we manually collect and label a large-scale age dataset called the Real Scenario Face Age Dataset (RSFAD), which contains 85,044 facial images captured by surveillance cameras in the wild. Because of COVID-19, we label not only the apparent age group and gender but also whether a face mask is worn. The label distribution of RSFAD is almost uniform; to the best of our knowledge, it is the first age dataset with this property. In addition, we investigate the impact of the age, gender, and mask distributions on age group estimation by comparing GDEX CNN models trained on several different datasets. Our experiments show that models trained on RSFAD perform well on the age estimation task and that the dataset is also suitable as an evaluation benchmark.