Keywords: deep learning, computer vision, supervised learning, ai for social good, dataset, creative commons
TL;DR: We present Dollar Street, a supervised dataset that contains 38,479 images of everyday household items from homes around the world, including tags for objects and demographic data such as region, country, and monthly household income.
Abstract: Image datasets for computer vision must be representative and carry accurate demographic information to ensure robustness and fairness, especially for smaller subpopulations. To address this need, we present Dollar Street, a supervised dataset that contains 38,479 images of everyday household items from homes around the world. The dataset was manually curated and fully labeled, including tags for objects (e.g., “toilet,” “toothbrush,” “stove”) and demographic data such as region, country, and monthly household income. It includes images from homes with no internet access and incomes as low as \$26.99 per month, visually capturing valuable socioeconomic diversity of traditionally under-represented populations. All images and data are licensed under CC-BY 4.0, permitting their use in academic and commercial work. Moreover, we show that this dataset can improve classification performance on images of household items from lower-income homes, addressing a critical need for datasets that combat bias.
Supplementary Material: pdf
Open Credentialized Access: N/A
Dataset Url: https://mlcommons.org/en/dollar-street
Dataset Embargo: N/A
License: All images and data are licensed under CC-BY 4.0, permitting their use in academic and commercial work.
Author Statement: Yes
Contribution Process Agreement: Yes
In Person Attendance: Yes