A schistosomiasis dataset with bright- and darkfield images

Published: 26 Sept 2024, Last Modified: 10 Sept 2025, Venue: MICCAI 2024 Open Data Workshop, Readers: Everyone, License: CC BY 4.0
Abstract: Schistosomiasis is a neglected tropical disease that threatens 700 million people and affects 250 million per year. The disease is caused by blood flukes of the genus Schistosoma, which enter the human body through contact with infected water. One species, S. haematobium, sheds eggs through the urinary tract, so infection can be diagnosed by examining urine samples for these eggs. Because schistosomiasis infection is highly localized and often concentrated in remote areas, rapid and robust field diagnosis is crucial both to individual diagnosis and to the mapping that informs control efforts. AI algorithms, if properly designed, can speed up and improve both diagnosis and mapping through scalable, accurate analysis of images of urine samples. To support the development of such algorithms, we offer the dataset described here. It consists of paired bright- and darkfield images of urine samples collected in two distinct field studies in Côte d’Ivoire, Africa. The dataset includes images from 725 patients, of whom 150 were schisto-positive (their images contain S. haematobium eggs). Crucially, each patient has sufficient images to diagnose S. haematobium infection, so the dataset can be used to realistically test the diagnostic value of algorithms for clinical use. The division into two studies allows testing of algorithm generalizability. Due to exigencies of the data collection protocol, the images vary in quality, from clear to blurry, which further allows testing of algorithm robustness to realistic noise. The dataset is thus well-suited to developing algorithms that can be of concrete value in schistosomiasis control efforts.