KidSat: satellite imagery to map childhood poverty dataset and benchmark

Makkunda Sharma; Fan Yang; Duy-Nhat Vo; Esra Suel; Swapnil Mishra; Samir Bhatt; Oliver Fiala; William Rudgard; Seth Flaxman

29 May 2024 (modified: 13 Nov 2024). Submitted to NeurIPS 2024 Track Datasets and Benchmarks. License: CC BY 4.0

Keywords: satellite imagery, remote sensing, self-supervised learning, social science, global health, economic, health and development indicators

TL;DR: We present a new dataset and benchmark consisting of satellite images and corresponding child poverty indicators in Eastern and Southern Africa

Abstract: Satellite imagery has emerged as an important tool for analysing demographic, health, and development indicators. While various deep learning models have been built for these tasks, each is specific to a particular problem, and no standard benchmarks are available. We propose a new dataset pairing satellite imagery with high-quality survey data on child poverty to benchmark satellite feature representations. Our dataset consists of 33,608 images, each 10 km $\times$ 10 km, from 19 countries in Eastern and Southern Africa, covering the period 1997-2022. As defined by UNICEF, multidimensional child poverty covers six dimensions and can be calculated from data collected by the face-to-face Demographic and Health Surveys (DHS) Program. As part of the benchmark, we test both spatial and temporal generalization, by evaluating on unseen locations and on data collected after the training years. Using our dataset, we benchmark multiple models, from low-level satellite imagery models such as MOSAIKS to deep learning foundation models, which include both generic vision models such as Self-Distillation with no Labels (DINOv2) and satellite-specific models such as SatMAE. We provide open-source code for building the satellite dataset, obtaining ground truth data from DHS, and running the models assessed in our work.
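
The two generalization settings described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the record fields `cluster_id` and `year`, the cutoff year, and the number of folds are hypothetical names and values chosen for illustration, and the actual split protocol may differ. The sketch only shows the general idea of holding out whole locations (spatial) and holding out later years (temporal).

```python
import random
from typing import Any, Dict, List, Tuple

# Hypothetical record layout: one dict per image/survey-cluster pair.
Record = Dict[str, Any]
Split = Tuple[List[Record], List[Record]]


def temporal_split(records: List[Record], cutoff_year: int) -> Split:
    """Train on records up to and including cutoff_year, test on later ones."""
    train = [r for r in records if r["year"] <= cutoff_year]
    test = [r for r in records if r["year"] > cutoff_year]
    return train, test


def spatial_kfold(records: List[Record], k: int = 5, seed: int = 0) -> List[Split]:
    """Build k folds in which every location is held out exactly once.

    Splitting is done on location ids rather than on individual records,
    so no location appears in both the train and test side of a fold.
    """
    clusters = sorted({r["cluster_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(clusters)
    # Deal shuffled location ids into k roughly equal groups.
    folds = [set(clusters[i::k]) for i in range(k)]
    splits = []
    for held_out in folds:
        train = [r for r in records if r["cluster_id"] not in held_out]
        test = [r for r in records if r["cluster_id"] in held_out]
        splits.append((train, test))
    return splits
```

Under these assumptions, a feature extractor would be fitted and evaluated once per spatial fold, and once on the temporal split, so that reported errors reflect performance on locations or years not seen during training.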

Supplementary Material: pdf

Submission Number: 1749
