DICES Dataset: Diversity in Conversational AI Evaluation for Safety

NeurIPS 2023 Track Datasets and Benchmarks, Submission 499 Authors

Published: 26 Sept 2023, Last Modified: 02 Feb 2024
Venue: NeurIPS 2023 Datasets and Benchmarks Poster
Keywords: conversational AI, human evaluation, human annotation, safety task, disagreement, variance in human annotations, diversity of rater pool
TL;DR: Safe AI is better AI for everyone, but everyone perceives safety differently. The DICES dataset offers a shared resource for understanding these differences and benchmarking safety evaluation of conversational AI systems.
Abstract: Machine learning approaches often require training and evaluation datasets with a clear separation between positive and negative examples. This requirement oversimplifies the natural subjectivity present in many tasks and obscures the inherent diversity in human perceptions and opinions about many content items. Preserving the variance in content and the diversity in human perceptions in datasets is often quite expensive and laborious. This is especially troubling when building safety datasets for conversational AI systems, as safety is socio-culturally situated in this context. To demonstrate this crucial aspect of conversational AI safety, and to facilitate in-depth model performance analyses, we introduce the DICES (Diversity In Conversational AI Evaluation for Safety) dataset. It contains fine-grained demographic information about raters, provides high replication of ratings per item to ensure statistical power for analyses, and encodes rater votes as distributions across different demographics to allow for in-depth exploration of different aggregation strategies. The DICES dataset enables the observation and measurement of variance, ambiguity, and diversity in the context of safety for conversational AI. We further describe a set of metrics that show how rater diversity influences safety perception across different geographic regions, ethnicity groups, age groups, and genders. The DICES dataset is intended as a shared resource and benchmark that respects diverse perspectives during safety evaluation of conversational AI systems.
Supplementary Material: pdf
Submission Number: 499