Nonparametrically Learning Activation Functions in Deep Neural NetsDownload PDF

06 Jun 2023 (modified: 21 Jul 2022)ICLR 2017 Invite to WorkshopReaders: Everyone
TL;DR: A new class of nonparametric activation functions for deep learning with theoretical guarantees for generalization error.
Abstract: We provide a principled framework for nonparametrically learning activation functions in deep neural networks. Currently, state-of-the-art deep networks treat choice of activation function as a hyper-parameter before training. By allowing activation functions to be estimated as part of the training procedure, we expand the class of functions that each node in the network can learn. We also provide a theoretical justification for our choice of nonparametric activation functions and demonstrate that networks with our nonparametric activation functions generalize well. To demonstrate the power of our novel techniques, we test them on image recognition datasets and achieve up to a 15% relative increase in test performance compared to the baseline.
11 Replies