Hamiltonian Monte Carlo on ReLU Neural Networks is Inefficient

Vu C. Dinh; Lam Si Tung Ho; Cuong V. Nguyen

Published: 25 Sept 2024, Last Modified: 06 Nov 2024. Venue: NeurIPS 2024 poster. License: CC BY 4.0

Keywords: Hamiltonian Monte Carlo, efficiency, ReLU, optimal acceptance probability

TL;DR: We show that due to the non-differentiability of activation functions in the ReLU family, leapfrog HMC for networks with these activation functions has a large local error rate, making the method inefficient.

Abstract: We analyze the error rates of the Hamiltonian Monte Carlo algorithm with leapfrog integrator for Bayesian neural network inference. We show that due to the non-differentiability of activation functions in the ReLU family, leapfrog HMC for networks with these activation functions has a large local error rate of $\Omega(\epsilon)$ rather than the classical error rate of $\mathcal{O}(\epsilon^3)$. This leads to a higher rejection rate of the proposals, making the method inefficient. We then verify our theoretical findings through empirical simulations as well as experiments on a real-world dataset that highlight the inefficiency of HMC inference on ReLU-based neural networks compared to analytical networks.
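The contrast between the two error rates can be illustrated with a minimal one-dimensional sketch. It is not the authors' experiment: the toy potentials (`relu(q) + q^2/2` and its softplus counterpart), the unit-mass kinetic energy, the starting point, and all function names are assumptions chosen for illustration only. The sketch takes a single leapfrog step that crosses the kink at `q = 0` and estimates the order of the energy error by halving the step size.

```python
import math


def relu_potential_grad(q):
    """Toy kinked potential U(q) = relu(q) + q^2/2 and its (sub)gradient.

    Hypothetical stand-in for a ReLU-network posterior: the gradient
    jumps by 1 at q = 0.
    """
    u = max(q, 0.0) + 0.5 * q * q
    g = (1.0 if q > 0.0 else 0.0) + q
    return u, g


def softplus_potential_grad(q):
    """Smooth (analytic) counterpart: U(q) = softplus(q) + q^2/2."""
    u = math.log1p(math.exp(q)) + 0.5 * q * q
    g = 1.0 / (1.0 + math.exp(-q)) + q
    return u, g


def leapfrog_energy_error(potential_grad, q, p, eps):
    """Absolute change in H = p^2/2 + U(q) after one leapfrog step."""
    u0, g0 = potential_grad(q)
    h0 = 0.5 * p * p + u0
    p_half = p - 0.5 * eps * g0          # half step for momentum
    q_new = q + eps * p_half             # full step for position
    u1, g1 = potential_grad(q_new)
    p_new = p_half - 0.5 * eps * g1      # second half step for momentum
    h1 = 0.5 * p_new * p_new + u1
    return abs(h1 - h0)


def empirical_order(potential_grad, eps=0.1, offset=0.25):
    """Estimate k in error ~ eps^k by comparing step sizes eps and eps/2.

    The start q0 = -offset * eps (assumed, with p0 = 1) is chosen so that
    the step crosses q = 0 at both step sizes.
    """
    coarse = leapfrog_energy_error(potential_grad, -offset * eps, 1.0, eps)
    fine = leapfrog_energy_error(potential_grad, -offset * eps / 2, 1.0, eps / 2)
    return math.log2(coarse / fine)


if __name__ == "__main__":
    print("kinked (ReLU) order:", empirical_order(relu_potential_grad))
    print("smooth (softplus) order:", empirical_order(softplus_potential_grad))
```

Under these assumptions the estimated order should come out close to 1 for the kinked potential and close to 3 for the smooth one. Since the Metropolis acceptance probability is `min(1, exp(-ΔH))`, a larger energy error per step translates into more rejected proposals.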

Primary Area: Probabilistic methods (for example: variational inference, Gaussian processes)

Submission Number: 13875
