Confidence Scoring Using Whitebox Meta-models with Linear Classifier Probes

Anonymous

27 Oct 2017 (modified: 13 Apr 2025) | ICLR 2018 Conference Blind Submission | Readers: Everyone

Abstract: We propose a confidence scoring mechanism for multi-layer neural networks based on the paradigm of a base model and a meta-model. The meta-model learns the confidence score from features derived from the base model, a deep neural network treated as a whitebox. As features, we investigate linear classifier probes inserted between the layers of the base model and trained on each layer’s intermediate activations. Experiments show that this approach outperforms various baselines in a filtering task, i.e., the task of rejecting samples with low confidence. We present experimental results on the CIFAR-10 and CIFAR-100 datasets, with and without added noise, exploring various aspects of the method.
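The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy Gaussian data, the frozen random hidden layers standing in for a trained base network, the logistic-regression meta-model, the three-way data split, and all names (`hidden_layers`, `fit_linear`, `filtered_accuracy`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def add_bias(F):
    return np.hstack([F, np.ones((F.shape[0], 1))])

def fit_linear(F, y, n_classes, steps=300, lr=0.5):
    """Train a softmax linear classifier on features F by gradient descent."""
    Fb = add_bias(F)
    W = np.zeros((Fb.shape[1], n_classes))
    Y = np.eye(n_classes)[y]
    for _ in range(steps):
        W -= lr * Fb.T @ (softmax(Fb @ W) - Y) / len(y)
    return W

def predict_proba(W, F):
    return softmax(add_bias(F) @ W)

# Toy data (assumption): 3 overlapping Gaussian classes in 5 dimensions.
n, d, k = 900, 5, 3
y = rng.integers(0, k, size=n)
X = rng.normal(size=(k, d))[y] + 1.5 * rng.normal(size=(n, d))
tr, me, te = slice(0, 300), slice(300, 600), slice(600, 900)

# Stand-in "base model": two frozen random hidden layers plus a trained output layer.
W1 = rng.normal(size=(d, 16)) / np.sqrt(d)
W2 = rng.normal(size=(16, 16)) / 4.0

def hidden_layers(X):
    h1 = np.tanh(X @ W1)
    h2 = np.tanh(h1 @ W2)
    return [h1, h2]

layers = hidden_layers(X)
W_out = fit_linear(layers[-1][tr], y[tr], k)
base_proba = predict_proba(W_out, layers[-1])
correct = (base_proba.argmax(axis=1) == y).astype(int)

# Linear classifier probes: one softmax classifier per layer's activations.
# In this toy the probe on the last hidden layer duplicates the output layer,
# because the hidden weights are frozen.
probes = [fit_linear(h[tr], y[tr], k) for h in layers]
meta_feats = np.hstack([predict_proba(W, h) for W, h in zip(probes, layers)])

# Meta-model: predicts whether the base model is correct, fitted on a held-out split.
W_meta = fit_linear(meta_feats[me], correct[me], 2)
conf = predict_proba(W_meta, meta_feats[te])[:, 1]
correct_test = correct[te]

def filtered_accuracy(conf, correct, keep_frac):
    """Base-model accuracy on the keep_frac most confident samples."""
    n_keep = max(1, int(round(keep_frac * len(conf))))
    keep = np.argsort(-conf)[:n_keep]
    return correct[keep].mean()

baseline_conf = base_proba[te].max(axis=1)  # max-softmax baseline score
for frac in (1.0, 0.75, 0.5):
    print(frac,
          filtered_accuracy(conf, correct_test, frac),
          filtered_accuracy(baseline_conf, correct_test, frac))
```

The loop prints, for each retained fraction, the base model's accuracy on the samples kept by the meta-model score and by the max-softmax score. On toy data like this the two may not differ much; the sketch only shows how the pieces connect.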

Keywords: confidence scoring, meta-model, linear classifier probes

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/confidence-scoring-using-whitebox-meta-models/code)

Withdrawal: Confirmed
