No Need for Ad-hoc Substitutes: The Expected Cost is a Principled All-purpose Classification Metric

TMLR Paper2765 Authors

28 May 2024 (modified: 17 Sept 2024). Under review for TMLR. CC BY 4.0
Abstract: The expected cost (EC) is one of the main classification metrics introduced in statistical and machine learning books. It is based on the assumption that, for a given application of interest, each decision made by the system has a corresponding cost which depends on the true class of the sample. An evaluation metric can then be defined by taking the expectation of the cost over the data. Two special cases of the EC are widely used in the machine learning literature: the error rate (one minus the accuracy) and the balanced error rate (one minus the balanced accuracy or unweighted average recall). Other instances of the EC can be useful for applications in which some types of errors are more severe than others, or when the prior probabilities of the classes differ between the evaluation data and the use-case scenario. Surprisingly, the general form of the EC is rarely used in the machine learning literature. Instead, alternative ad-hoc metrics like the F-beta score and the Matthews correlation coefficient (MCC) are used for many applications. In this work, we argue that the EC is superior to these alternative metrics, being more general, interpretable, and adaptable to any application scenario. We provide both theoretically motivated discussions and examples to illustrate the behavior of the different metrics.
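As a rough illustration of the definition in the abstract, the sketch below computes an empirical EC from hard decisions and shows how the error rate and balanced error rate arise as special cases. The function name `expected_cost`, the cost-matrix layout (`cost[i, j]` is the cost of decision `j` when the true class is `i`), and the toy labels are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def expected_cost(y_true, y_pred, cost, priors=None):
    """Empirical expected cost (illustrative sketch, not the authors' code).

    cost[i, j]: assumed cost of making decision j when the true class is i.
    priors:     class priors to evaluate under; defaults to the empirical
                class frequencies found in y_true.
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cost = np.asarray(cost, dtype=float)

    # counts[i, j] = number of samples of true class i assigned decision j
    counts = np.zeros(cost.shape)
    np.add.at(counts, (y_true, y_pred), 1)

    class_counts = counts.sum(axis=1)
    if np.any(class_counts == 0):
        raise ValueError("every class needs at least one sample")

    # rates[i, j] = fraction of class-i samples that received decision j
    rates = counts / class_counts[:, None]

    if priors is None:
        priors = class_counts / class_counts.sum()
    priors = np.asarray(priors, dtype=float)

    # EC = sum_i priors[i] * sum_j cost[i, j] * rates[i, j]
    return float(np.sum(priors[:, None] * cost * rates))

# Toy data (hypothetical): four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1]

zero_one = 1.0 - np.eye(2)                # cost 1 for any error, 0 otherwise
error_rate = expected_cost(y_true, y_pred, zero_one)
balanced_error_rate = expected_cost(y_true, y_pred, zero_one, priors=[0.5, 0.5])

# An asymmetric cost: missing class 1 is assumed five times as costly.
asymmetric = np.array([[0.0, 1.0], [5.0, 0.0]])
asymmetric_ec = expected_cost(y_true, y_pred, asymmetric)
```

With a 0-1 cost matrix, the empirical priors give the error rate and uniform priors give the balanced error rate; changing `cost` or `priors` adapts the same metric to other application scenarios.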
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=3mN9QNWArl
Changes Since Last Submission: This revision addresses the comments from the reviewers as follows:
* Comments on the differences and similarities between our work and Dyrland's work were added (changes in Sections 1, 3.1, and 3.2).
* Further discussions and references on the issue of cost selection were included (introduction of Section 2).
* Discussions of the fact that calibrated scores are not needed to use the EC as a classification metric were added (changes in Section 1, the introduction of Section 2, and Section 2.3).
* Colors were added in Table 2 to emphasize comparisons discussed in the text. References to the corresponding colors were added within the text.
All changes are highlighted in red in the new draft. We hope these changes satisfactorily address the reviewers' concerns. We are open to further discussions and suggestions.
Assigned Action Editor: ~Daniel_M_Roy1
Submission Number: 2765