Datasets with Rich Labels for Machine Learning

Published: 01 Jan 2023, Last Modified: 14 May 2025FUZZ 2023EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Most datasets used for classification use hard labels. In this paper, five new datasets labeled with uncertainty and imprecision by crowdsourcing contributors are presented. Richer labels are modeled with the theory of belief functions, which generalizes several reasoning frameworks with uncertainty, such as possibilities or probabilities. These datasets can be used with classical models using hard labels but also with probabilistic, fuzzy or even evidential models. Several concrete application cases are presented, for which these new datasets provide a useful knowledge representation of the user's uncertainty.
Loading