Tsallis Entropy Based Labelling

Kentaro Goto, Masato Uchida

Published: 2020, Last Modified: 13 May 2025, ICMLA 2020, CC BY-SA 4.0

Abstract: In supervised classification, the quality of the training data is essential to accurate learning, alongside the choice of learning algorithm and parameter optimisation. To improve that quality, a label should reflect the annotator's view of which class a given instance belongs to as flexibly and accurately as possible. However, in conventional machine learning problem settings, the number of labels per instance is fixed at a uniform value, and annotators are implicitly assumed to provide labels under that constraint. In this study, we therefore propose an annotation framework, Tsallis entropy based labelling, which models a method that dynamically selects the number of labels for each instance depending on the uncertainty about the class to which that instance belongs. In the proposed framework, an annotator's instinctive uncertainty about the classification task is expressed using the Tsallis entropy and Tsallis self-information. In addition, the framework has a well-organised mathematical structure that includes several typical annotation models. We conduct an experiment to evaluate the proposed framework and demonstrate that it outperforms, in terms of label accuracy, a comparison annotation model in which the number of labels per instance is deliberately fixed at the same value for all instances. Moreover, we show by example that the conventional single labelling scheme is not always the best option, which reveals that increasing the number of labels per instance does not necessarily reduce label accuracy.
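
For readers unfamiliar with the two quantities named in the abstract, the following minimal sketch computes them from their standard textbook definitions: the q-logarithm ln_q(x) = (x^(1-q) - 1) / (1 - q), the Tsallis self-information ln_q(1/p), and the Tsallis entropy as the expectation of that self-information. The function names, the example distributions, and the restriction to q > 0 are assumptions made for illustration. The abstract does not state the rule that maps uncertainty to a number of labels, so the sketch does not attempt to reproduce the proposed labelling framework.

```python
import math


def q_log(x, q):
    """Tsallis q-logarithm; reduces to the natural logarithm at q = 1."""
    if math.isclose(q, 1.0):
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def tsallis_self_information(p, q):
    """q-deformed self-information ln_q(1/p) of an outcome with probability p > 0."""
    return q_log(1.0 / p, q)


def tsallis_entropy(probs, q):
    """Tsallis entropy as the expected q-self-information.

    Assumes q > 0, so zero-probability classes contribute nothing and are skipped.
    """
    return sum(p * tsallis_self_information(p, q) for p in probs if p > 0)


# Hypothetical class-belief distributions for two instances (illustrative only).
confident = [0.9, 0.05, 0.05]
uncertain = [0.4, 0.35, 0.25]

for q in (0.5, 1.0, 2.0):
    print(q, tsallis_entropy(confident, q), tsallis_entropy(uncertain, q))
```

At q = 1 the entropy coincides with the Shannon entropy in nats, and for q = 2 it equals one minus the sum of squared probabilities. In both example cases the more evenly spread distribution has the larger entropy, which is the kind of uncertainty signal the abstract describes.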