Optimizing Performance of Feedforward and Convolutional Neural Networks through Dynamic Activation Functions
Abstract: Recent developments in deep learning training algorithms have led to significant breakthroughs across diverse domains, including speech, text, image, and video processing. While research on deeper network architectures, notably exemplified by ResNet's 152-layer structures, has yielded remarkable outcomes, shallow Convolutional Neural Networks (CNNs) remain comparatively under-explored. Activation functions, which introduce the non-linearity essential to neural networks, have also driven substantial advances. In this paper, we study hidden-layer activations, focusing in particular on complex piece-wise linear activation functions. Our comprehensive experiments demonstrate that these piece-wise linear activations outperform traditional Rectified Linear Units across various architectures. We propose a novel Adaptive Activation algorithm, AdAct, which exhibits promising performance improvements in diverse CNN and multilayer perceptron configurations, presenting compelling results in support of its usage.
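To make the general idea concrete, a trainable piece-wise linear activation can be realized as a set of learnable output values at fixed hinge points, with linear interpolation between them. The PyTorch sketch below is only an illustration of this family of activations, not the authors' AdAct implementation; the hinge locations, ReLU-based initialization, and interpolation scheme are assumptions made for the example.

```python
import torch
import torch.nn as nn


class PiecewiseLinearActivation(nn.Module):
    """Illustrative trainable piece-wise linear activation.

    The output value at each of `num_hinges` fixed input locations is a
    learnable parameter; outputs are linearly interpolated between hinges.
    """

    def __init__(self, num_hinges: int = 8, input_range=(-3.0, 3.0)):
        super().__init__()
        lo, hi = input_range
        # Fixed, evenly spaced hinge locations on the input axis.
        self.register_buffer("hinges", torch.linspace(lo, hi, num_hinges))
        # Learnable activation values, initialized to ReLU at the hinges.
        self.values = nn.Parameter(torch.clamp(self.hinges.clone(), min=0.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Clamp inputs to the hinge range so every input falls in a segment.
        x_clamped = x.clamp(self.hinges[0], self.hinges[-1])
        # Index of the hinge segment containing each input.
        idx = torch.bucketize(x_clamped, self.hinges[1:-1])
        x0, x1 = self.hinges[idx], self.hinges[idx + 1]
        y0, y1 = self.values[idx], self.values[idx + 1]
        # Linear interpolation between the two neighbouring hinge values.
        t = (x_clamped - x0) / (x1 - x0)
        return y0 + t * (y1 - y0)
```

Such a module could serve as a drop-in replacement for `nn.ReLU` in an MLP or CNN, with the hinge values learned jointly with the network weights by backpropagation.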
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We found that sub-section 4.4.1 (Shallow CNN results) under section 4.4 (CNN Results) was missing due to a mistake while compiling the LaTeX file. This subsection has now been added and updated in the revised document.
We apologize for any inconvenience to the reviewers and action editor.
Assigned Action Editor: ~Nicolas_THOME2
Submission Number: 1544