Dynamic Activation Function Based on the Branching Process and Its Application in Image Classification
Abstract: The choice of activation function in deep learning is crucial to the performance of neural networks. Conventional deep learning uses the same activation function regardless of network depth, which leads to performance degradation as the model deepens. In this paper, we propose a $\operatorname{sigmoid}_{n}$ dynamic activation function that changes with the depth of the neural network. We first introduce the dual relationship between the activation function and the probability generating function (PGF) from the perspective of the branching process, and explain why the performance of models with different activation functions degrades as the network deepens. We then use the law of large numbers for the supercritical branching process to optimize the PGF and, through the duality between the PGF and the activation function, derive the $\operatorname{sigmoid}_{n}$ dynamic activation function. Finally, to better extract the spatial context of images, we add a convolution channel to the $\operatorname{sigmoid}_{n}$ dynamic activation function and propose a two-dimensional $\operatorname{Fsigmoid}_{n}$ dynamic activation function. Experiments on the CIFAR-10 and CIFAR-100 datasets verify the superiority of the proposed $\operatorname{sigmoid}_{n}$ activation function.
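To make the construction concrete, the sketch below illustrates one way such a depth-indexed activation could be wired up in PyTorch. The names `SigmoidN` and `FSigmoidN`, the offspring mean `m`, the rescaling form, and the way the convolution channel is combined are all illustrative assumptions: the abstract does not give the exact formula of $\operatorname{sigmoid}_{n}$, only that it depends on the depth $n$ (the standard supercritical law of large numbers normalizes the generation size as $W_n = Z_n / m^n$) and that $\operatorname{Fsigmoid}_{n}$ adds a convolution channel for spatial context.

```python
import torch
import torch.nn as nn


class SigmoidN(nn.Module):
    # Hypothetical sketch of a depth-indexed activation. The paper derives
    # sigmoid_n from the PGF of a supercritical branching process; here we
    # only illustrate the interface: the activation takes the layer depth n
    # and an offspring mean m, echoing the normalization W_n = Z_n / m^n.
    def __init__(self, n: int, m: float = 2.0):
        super().__init__()
        assert m > 1.0, "supercritical case requires offspring mean m > 1"
        self.n = n
        self.m = m

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumed form (illustrative only): a sigmoid whose argument is
        # renormalized by m**n so the response shape tracks the depth.
        return torch.sigmoid(x / self.m ** self.n)


class FSigmoidN(nn.Module):
    # Sketch of the two-dimensional variant: per the abstract, a convolution
    # channel supplies spatial context. A depthwise 3x3 convolution and the
    # additive combination below are assumptions, not the paper's definition.
    def __init__(self, channels: int, n: int, m: float = 2.0):
        super().__init__()
        self.act = SigmoidN(n, m)
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=1, groups=channels, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Fuse the per-pixel input with its spatial-context branch, then
        # apply the depth-indexed activation.
        return self.act(x + self.bn(self.spatial(x)))


if __name__ == "__main__":
    x = torch.randn(2, 16, 32, 32)               # a CIFAR-sized feature map
    print(FSigmoidN(channels=16, n=3)(x).shape)  # torch.Size([2, 16, 32, 32])
```

In this sketch the depth index `n` is fixed per layer at construction time, so deeper layers can be given activations with a different effective shape than shallow ones, which is the property the abstract attributes to the dynamic activation.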