Abstract: Understanding the potential for downstream harms from data-intensive technologies requires strong collaboration across disciplines and with the public. Having shared vocabularies of concerns reduces the communication barriers inherent in this work. The Data Hazards project [url] contains an open-source, controlled vocabulary of 11 hazards associated with data science work, presented as ‘labels’. Each label has (i) an icon, (ii) a description, (iii) examples, and, crucially, (iv) suggested safety precautions. A reflective discussion format and resources have also been developed. These have been created over three years with feedback from interdisciplinary contributors, and their use evaluated by participants (N=47). The labels include concerns often out-of-scope for ethics committees, like environmental impact. The resources can be used as a structure for interdisciplinary harms discovery work, for communicating hazards, collecting public input or in educational settings. Future versions of the project will develop through feedback from open-source contributions, methodological research and outreach.
External IDs:doi:10.1016/j.jrt.2025.100110
Loading