Abstract: In applications with significant class imbalance or asymmetric costs, metrics such as the $F_\beta$-measure, AM measure, Jaccard similarity coefficient, and weighted accuracy offer more suitable evaluation criteria than the standard binary classification loss. However, optimizing these metrics presents significant computational and statistical challenges. Existing approaches often rely on a characterization of the Bayes-optimal classifier and use threshold-based methods that first estimate class probabilities and then seek an optimal threshold. This leads to algorithms that are not tailored to restricted hypothesis sets and that lack finite-sample performance guarantees. In this work, we introduce principled algorithms for optimizing generalized metrics, supported by $H$-consistency and finite-sample generalization bounds. Our approach reformulates metric optimization as a generalized cost-sensitive learning problem, enabling the design of novel surrogate loss functions with provable $H$-consistency guarantees. Leveraging this framework, we develop new algorithms, METRO (*Metric Optimization*), with strong theoretical performance guarantees. We report the results of experiments demonstrating the effectiveness of our methods compared to prior baselines.
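To make the cost-sensitive reformulation concrete, here is a minimal sketch of a cost-weighted logistic surrogate of the kind such a reduction could produce. The function name, signature, and weighting scheme are illustrative assumptions for exposition, not the paper's METRO surrogate.

```python
import numpy as np

def cost_weighted_logistic_loss(scores, labels, c_fn=1.0, c_fp=1.0):
    """Illustrative cost-sensitive logistic surrogate (not the METRO loss).

    scores: real-valued predictions h(x); labels: array with entries in {-1, +1}.
    Positive examples are weighted by the false-negative cost c_fn,
    negative examples by the false-positive cost c_fp.
    """
    weights = np.where(labels == 1, c_fn, c_fp)
    # logaddexp(0, -y*s) = log(1 + exp(-y*s)), computed stably
    return np.mean(weights * np.logaddexp(0.0, -labels * scores))
```

In reductions of this general kind, the costs are derived from the target metric (e.g., from the current value of the $F_\beta$-measure), so that minimizing the cost-sensitive surrogate drives improvement of the generalized metric; the paper's specific surrogates and their $H$-consistency guarantees are developed in the full text.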
Lay Summary: In many real-world situations, it's much more important to get certain predictions right than others. For example, in medical diagnosis, failing to detect a disease (a "false negative") is often far worse than mistakenly flagging a healthy person for more tests (a "false positive"). Similarly, when searching for a rare but important piece of information, we want to be sure we find it, even if it means we get some irrelevant results along the way. Standard methods for training machine learning models often aren't designed for these scenarios, where mistakes have unequal consequences or one type of data is much rarer than another. They treat all errors as equally bad, which can lead to poor performance on the tasks we actually care about.
Our research introduces a new, more flexible way to train machine learning models that can be directly tailored to these specific, real-world needs. Instead of using a one-size-fits-all approach, we've developed a method that allows us to define what a "good" outcome looks like for a particular problem and then directly teach the learning algorithm to optimize for that goal. This avoids a common two-step process where a standard model is first trained and then tweaked, which can be inefficient and unreliable.
We call our new method METRO (Metric Optimization). We've proven mathematically that this approach is reliable and effective. In experiments, models trained with METRO outperformed existing methods on these specialized tasks, showing that our technique is a powerful tool for building more practical and effective machine learning systems.
Primary Area: General Machine Learning->Supervised Learning
Keywords: learning theory, consistency, generalized metrics
Submission Number: 7282