The Human Labour of Data Work: Capturing Cultural Diversity through World Wide Dishes

Published: 01 Jan 2025, Last Modified: 19 May 2025CoRR 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: This paper provides guidance for building and maintaining infrastructure for participatory AI efforts by sharing reflections on building World Wide Dishes (WWD), a bottom-up, community-led image and text dataset of culinary dishes and associated cultural customs. We present WWD as an example of participatory dataset creation, where community members both guide the design of the research process and contribute to the crowdsourced dataset. This approach incorporates localised expertise and knowledge to address the limitations of web-scraped Internet datasets acknowledged in the Participatory AI discourse. We show that our approach can result in curated, high-quality data that supports decentralised contributions from communities that do not typically contribute to datasets due to a variety of systemic factors. Our project demonstrates the importance of participatory mediators in supporting community engagement by identifying the kinds of labour they performed to make WWD possible. We surface three dimensions of labour performed by participatory mediators that are crucial for participatory dataset construction: building trust with community members, making participation accessible, and contextualising community values to support meaningful data collection. Drawing on our findings, we put forth five lessons for building infrastructure to support future participatory AI efforts.
Loading