- Abstract: We propose Noisy Information Bottlenecks (NIB) to limit mutual information between learned parameters and the data through noise injection. We show why this benefits generalization and mitigates overfitting in both supervised and unsupervised learning, even for arbitrarily complex architectures. We reinterpret methods including the Variational Autoencoder, beta-VAE, network weight uncertainty, and a variant of dropout combined with weight decay as special cases of our approach, explaining and quantifying their regularizing properties and vulnerabilities within an information-theoretic framework.
- Keywords: information theory, deep learning, generalization, information bottleneck, variational inference, approximate inference
- TL;DR: We limit mutual information between parameters and data using noise to improve generalization in deep models.
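The core mechanism can be sketched concretely. The following is a minimal illustration (our own construction, not the paper's implementation): learned parameters are released through an additive Gaussian noise channel, and for such a channel the mutual information between the clean and noisy parameters is bounded per dimension by 0.5 * log(1 + signal_var / noise_var). The function names `noisy_channel` and `gaussian_channel_capacity` are hypothetical labels for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_channel(theta, noise_std, rng):
    """Release parameters through an additive Gaussian noise channel,
    limiting how much information about the data they can carry."""
    return theta + rng.normal(0.0, noise_std, size=theta.shape)

def gaussian_channel_capacity(signal_var, noise_var):
    """Per-dimension upper bound (in nats) on I(theta; theta_noisy)
    when the parameter signal is Gaussian with variance signal_var."""
    return 0.5 * np.log1p(signal_var / noise_var)

# Stand-in for learned weights: unit-variance Gaussian parameters.
theta = rng.normal(0.0, 1.0, size=1000)
theta_tilde = noisy_channel(theta, noise_std=2.0, rng=rng)

# Raising the noise variance tightens the information bottleneck.
for noise_var in (0.25, 1.0, 4.0):
    cap = gaussian_channel_capacity(signal_var=1.0, noise_var=noise_var)
    print(f"noise_var={noise_var}: bound = {cap:.3f} nats/dim")
```

The monotone decrease of the bound with noise variance is what makes the noise scale a knob on the bottleneck: more noise means the released parameters can memorize less about the training data, at the cost of fidelity.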