1-Path-Norm Regularization of Deep Neural Networks

Published: 03 Jul 2023, Last Modified: 03 Jul 2023 · LXAI @ ICML 2023 Regular Deadline · Oral
Keywords: nonconvex optimization, path norm, neural networks, deep learning, robustness, generalization
TL;DR: we develop a new regularization method for deep neural networks that improves accuracy and robustness compared to weight decay
Abstract: The so-called path-norm measure is considered one of the best indicators of good generalization in neural networks. This paper introduces a proximal gradient framework for training deep neural networks via 1-path-norm regularization, which is applicable to general deep architectures. We address the resulting nonconvex nonsmooth optimization model by transforming the intractable induced proximal operator into an equivalent differentiable one. We compare automatic differentiation (backpropagation) algorithms with the proximal gradient framework in numerical experiments on FashionMNIST and CIFAR10. We show that 1-path-norm regularization is a better choice than weight decay for fully connected architectures, and that it improves robustness to the presence of noisy labels. In this latter setting, the proximal gradient methods have an advantage over automatic differentiation.
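For concreteness, below is a minimal sketch of how the 1-path-norm of a fully connected ReLU network can be computed and used as a differentiable penalty with automatic differentiation, i.e. the backpropagation baseline mentioned in the abstract, not the paper's proximal-gradient algorithm. It assumes the standard expression of the 1-path-norm as the sum over all input-output paths of products of absolute weights; the helper `one_path_norm`, the layer sizes, and the regularization strength `lam` are illustrative assumptions, and bias paths are omitted for brevity.

```python
import torch
import torch.nn as nn

def one_path_norm(linear_layers):
    """1-path-norm of a fully connected network (biases ignored).

    For weight matrices W1, ..., WL this equals 1^T |WL| ... |W1| 1,
    the sum over all input-output paths of the product of the
    absolute values of the weights along the path.
    """
    # Start from a vector of ones over the input dimension and
    # propagate it through the absolute weight matrices.
    v = torch.ones(linear_layers[0].in_features)
    for layer in linear_layers:
        v = layer.weight.abs() @ v
    return v.sum()

# Hypothetical usage: add the penalty to the task loss and backpropagate.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
linear_layers = [m for m in model if isinstance(m, nn.Linear)]
lam = 1e-4  # regularization strength (assumed, not from the paper)

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y) + lam * one_path_norm(linear_layers)
loss.backward()
```

Because the penalty is computed with ordinary tensor operations, standard automatic differentiation handles it directly; the paper's contribution is instead a proximal gradient treatment of the same regularizer, which handles its nonsmoothness without differentiating through the absolute values.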
Submission Type: Non-Archival
Submission Number: 13