MC Layer Normalization for calibrated uncertainty in Deep Learning

Thomas Frick; Diego Antognini; Ioana Giurgiu; Benjamin F Grewe; Cristiano Malossi; Rong J.B. Zhu; Mattia Rigotti

Published: 03 Mar 2024, Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: Efficiently estimating the uncertainty of neural network predictions has become an increasingly important challenge as machine learning models are adopted for high-stakes industrial applications where shifts in data distribution may occur. Calibrated prediction uncertainty is therefore crucial for determining when to trust a model's outputs and when to discard them as implausible. We propose a novel deep learning module, MC Layer Normalization, that acts as a drop-in replacement for Layer Normalization blocks and endows a neural network with uncertainty estimation capabilities. Our method is motivated by an approximate Bayesian perspective, but it is simple to deploy and adds no significant computational overhead, thanks to an efficient one-shot approximation of Monte Carlo integration at prediction time. To evaluate the effectiveness of our module, we conduct experiments in two distinct settings. First, we investigate its potential to replace existing methods such as MC-Dropout and Prediction-Time Batch Normalization. Second, we explore its suitability for use cases where such conventional modules are unsuitable or sub-optimal (as is the case with modules based on Batch Normalization, which is incompatible with, for instance, transformers). We empirically demonstrate the competitiveness of our module in terms of prediction accuracy and uncertainty calibration on established out-of-distribution image classification benchmarks, and we show its flexibility by applying it to tasks and architectures where previous methods are unsuitable.

Submission Length: Regular submission (no more than 12 pages of main content)

Code: https://github.com/IBM/mc-layernorm

Assigned Action Editor: ~Seungjin_Choi1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 1906
