Exploring Why Object Recognition Performance Degrades Across Income Levels and Geographies with Factor Annotations
Keywords: computer vision, fairness, robustness, machine learning, image classification, object recognition
TL;DR: We annotate Dollar Street, a geographically diverse image dataset of household objects, with factors that explain how objects differ and why classifier mistakes arise across incomes and geographies.
Abstract: Despite impressive advances in object recognition, deep learning systems' performance degrades significantly across geographies and lower income levels, raising pressing concerns of inequity. Addressing such performance gaps remains a challenge, as little is understood about why performance degrades across incomes or geographies.
We take a step in this direction by annotating images from Dollar Street, a popular benchmark of geographically and economically diverse images, labeling each image with factors such as color, shape, and background. These annotations unlock a new granular view into how objects differ across incomes/regions. We then use these object differences to pinpoint model vulnerabilities across incomes and regions.
We study a range of modern vision models, finding that performance disparities are most associated with differences in _texture, occlusion_, and images with _darker lighting_.
We illustrate how insights from our factor labels can surface mitigations to improve models' performance disparities.
As an example, we show that mitigating a model's vulnerability to texture
can improve its performance at lower income levels.
**We release all the factor annotations along with an interactive dashboard
to facilitate research into more equitable vision systems.**
Supplementary Material: pdf
Submission Number: 608