SinP[N]: A Fast Convergence Activation Function for Convolutional Neural Networks

Published: 01 Jan 2018 · Last Modified: 14 Nov 2024 · UCC Companion 2018 · CC BY-SA 4.0
Abstract: Convolutional Neural Networks (CNNs) are currently the most advanced machine learning architecture for visual data classification. The choice of activation function has a significant impact on the performance of a training task. To overcome the vanishing gradient problem, we propose a new activation function for classification systems. The activation function exploits a property of periodic functions: the derivative of a periodic function is itself periodic. Furthermore, a linear term is added to prevent the derivative from ever becoming zero. We verify this novel activation function through empirical analysis, comparing it against existing activation functions. Experimental results show that our activation function, SinP[N](x) = sin(x) + Nx, leads to very fast convergence even without a normalization layer. As a result, this new activation function significantly enhances training accuracy and can be easily deployed in systems built upon the standard CNN architecture.
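A minimal sketch of how such an activation could be implemented as a PyTorch module, assuming N is a fixed scalar hyperparameter (the class name SinPN, the default N = 2.0, and the example CNN block are illustrative assumptions; the paper's implementation details are not given in the abstract):

```python
import torch
import torch.nn as nn

class SinPN(nn.Module):
    """SinP[N] activation: f(x) = sin(x) + N*x.

    The derivative f'(x) = cos(x) + N is bounded below by N - 1,
    so for N > 1 it stays strictly positive and the gradient
    never vanishes. (The paper's choice of N is an assumption here.)
    """
    def __init__(self, N: float = 2.0):
        super().__init__()
        self.N = N

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sin(x) + self.N * x

# Example: drop-in replacement for ReLU in a small CNN block.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    SinPN(N=2.0),
)
y = block(torch.randn(1, 3, 32, 32))  # output shape: (1, 16, 32, 32)
```

Note that with N > 1 the derivative cos(x) + N never reaches zero, which is the property the abstract credits for avoiding vanishing gradients.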