Turath-150K: Image Database of Arab Heritage

07 Jun 2021 (modified: 22 Oct 2023)
Submitted to NeurIPS 2021 Datasets and Benchmarks Track (Round 1)
Readers: Everyone
Keywords: Image database, heritage, cultural diversity
TL;DR: We curate Turath-150K, a database of images that reflect objects, activities, and scenarios commonly found in the Arab world.
Abstract: Large-scale image databases remain largely biased towards objects and activities encountered in a select few cultures. This absence of culturally diverse images, which we refer to as the "hidden tail", limits the applicability of pre-trained neural networks and inadvertently excludes researchers from under-represented regions. To begin remedying this issue, we curate Turath-150K, a database of images of the Arab world that reflect objects, activities, and scenarios commonly found there. In the process, we introduce three benchmark databases, Turath Standard, Art, and UNESCO, which are specialised subsets of the Turath dataset. After demonstrating the limitations of existing networks pre-trained on ImageNet when deployed on such benchmarks, we train and evaluate several networks on the task of image classification. With Turath, we hope to engage machine learning researchers in under-represented regions and to inspire the release of additional culture-focused databases. The database can be accessed here: danikiyasseh.github.io/Turath.
Supplementary Material: zip
URL: danikiyasseh.github.io/Turath
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2201.00220/code)