Keywords: Machine Learning Foundations, Quantiles, Distribution Shift
TL;DR: Current neural network models cannot infer the context of a sample and condition their inference on it. We correct this using quantile activation.
Abstract: An established failure mode for machine learning models occurs when the same features are equally likely to belong to class $0$ and class $1$. In such cases, no ML model can correctly classify the sample. However, a solvable case emerges when the probabilities of class $0$ and class $1$ vary with the "context distribution". To the best of our knowledge, standard neural network architectures such as MLPs and CNNs are not equipped to handle this.
In this article, we propose a simple activation function, quantile activation (QACT), that addresses this problem without significantly increasing computational cost. The core idea is to "adapt" the output of each neuron to its *context distribution*: QACT produces the "relative quantile" of a sample within that distribution, rather than the raw pre-activation value used in traditional networks.
A practical example where the same sample can have different labels arises under inherent distribution shift. We validate the proposed activation function under such shifts, using datasets designed to test robustness against distortions: CIFAR10C, CIFAR100C, MNISTC, and TinyImagenetC. Our results demonstrate significantly better generalization across distortions than conventional classifiers, across various architectures. Although this paper presents a proof of concept, we find that this approach unexpectedly outperforms DINOv2 (small) under large distortions, despite DINOv2 being trained with a much larger network and dataset.
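The abstract does not pin down an implementation, but the idea admits a minimal sketch: treat the current mini-batch as a proxy for the context distribution and replace each neuron's pre-activation with its empirical quantile within that batch. The batch-as-context choice, the midpoint quantile estimator, and the centering to $(-1, 1)$ below are our assumptions for illustration, not details taken from the paper.

```python
import torch

class QuantileActivation(torch.nn.Module):
    """Sketch of a quantile-style activation: each neuron emits the
    empirical quantile of its pre-activation within the current batch,
    which stands in for the "context distribution"."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, features). Double argsort yields, per feature column,
        # the rank 0..B-1 of each value within the batch.
        ranks = x.argsort(dim=0).argsort(dim=0).float()
        # Midpoint estimator keeps quantiles strictly inside (0, 1).
        q = (ranks + 0.5) / x.shape[0]
        # Center to (-1, 1) so the activation is zero-mean (an assumption).
        return 2.0 * q - 1.0
```

At inference time one would presumably need batch-level (or running) context statistics rather than a single sample; how the paper handles that is not specified in this abstract.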
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12728