Mathematical Characterization of Better-than-Random Multiclass Models

Sébastien Foulle

Published: 02 Jun 2025, Last Modified: 02 Jun 2025. Accepted by TMLR. License: CC BY 4.0

Abstract: A binary supervised model outperforms chance if and only if the determinant of the confusion matrix is positive. This is equivalent to saying that the associated point in the ROC space is above the random guessing line. This also means that Youden's J, Cohen's $\kappa$ and Matthews' correlation coefficient are positive. We extend these results to any number of classes: for a target variable with $m \geq 2$ classes, we show that a model does better than chance if and only if the entries of the confusion matrix satisfy $m(m-1)$ homogeneous polynomial inequalities of degree 2, which can be expressed using generalized likelihood ratios. We also obtain a more theoretical formulation: a model does better than chance if and only if it is a maximum likelihood estimator of the target variable. When this is the case, we find that the multiclass versions of the previous metrics remain positive. If $m>2$, we observe that no-skill classifiers form only a small part of the topological boundary between better-than-random models and bad models. For $m=3$, we show that bad models occupy exactly 90% of the ROC space, far more than the 50% of two-class problems. Finally, we propose defining weak multiclass classifiers by conditions on these generalized likelihood ratios.
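The binary statement in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: the function names `binary_metrics` and `beats_chance` are hypothetical, the confusion matrix is assumed to be laid out as (TN, FP, FN, TP) with rows as actual classes and columns as predicted classes, and every row and column total is assumed to be nonzero. It uses the standard closed forms of Youden's J, Cohen's $\kappa$ and Matthews' correlation coefficient, each of which has the determinant $TN \cdot TP - FP \cdot FN$ as its numerator, up to a positive factor.

```python
import math


def binary_metrics(tn, fp, fn, tp):
    """Return the determinant and three chance-corrected metrics.

    Assumes every row and column total of the 2x2 confusion matrix
    is nonzero, so that no denominator vanishes.
    """
    # Determinant of [[tn, fp], [fn, tp]].
    det = tn * tp - fp * fn
    # Youden's J = sensitivity + specificity - 1.
    youden_j = det / ((tp + fn) * (tn + fp))
    # Cohen's kappa, closed form for the binary case.
    kappa = 2 * det / ((tp + fp) * (fp + tn) + (tp + fn) * (fn + tn))
    # Matthews' correlation coefficient.
    mcc = det / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"det": det, "youden_j": youden_j, "kappa": kappa, "mcc": mcc}


def beats_chance(tn, fp, fn, tp):
    """Binary criterion from the abstract: positive determinant."""
    return tn * tp - fp * fn > 0


# Hypothetical counts, chosen only to show the three possible signs.
good = binary_metrics(tn=50, fp=10, fn=5, tp=35)
bad = binary_metrics(tn=10, fp=50, fn=35, tp=5)
no_skill = binary_metrics(tn=30, fp=20, fn=30, tp=20)
```

With these counts, all four values are positive for `good`, negative for `bad`, and zero for `no_skill`, whose two rows are proportional. The multiclass inequalities are not reproduced here, since the abstract does not state them explicitly.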

Submission Length: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=gZ0Ae5rgJq

Changes Since Last Submission: The de-anonymization of the article resulted in the confusion matrix being incorrectly positioned in Definition 1. The following modifications have corrected the problem: the first two points of Definition 1 have been merged, and the point on prevalence has been very slightly modified. The very first sentence of the section "Multiclass Models and Metrics" on page 9 has been completed with the definition of $\hat{y}$. Omissions have been corrected in the statement of Corollary 2 (page 10): parentheses have been added, as well as a sentence, for consistency with the statement of the preceding theorem. An "Acknowledgments" section has been added.

Assigned Action Editor: ~Sivan_Sabato1

Submission Number: 4577
