Abstract: Activation functions (AFs) are crucial components of neural networks in deep learning. Piecewise linear functions (PLFs) are widely employed as AFs thanks to their computational efficiency and simplicity. However, PLFs are not everywhere differentiable, which can cause problems during training. The analytical expressions of AFs built from pure PLFs can be smoothed via the mollified square root function (MSRF) method, inspired by the SquarePlus approximation of ReLU. In this paper, we propose a proposition that expresses AFs as the maximum or minimum of two PLFs and transforms the result into a smoothed function via the MSRF method. Based on the MSRF, we systematically modify well-known AFs composed of two, three, or four PLFs into regularized ones, including the ReLU, LReLU, vReLU, Step, Bipolar, BReLU (Bounded ReLU), Htanh (Hard Tanh), Pan (Frying pan function), STF (Soft Thresholding Formula), HTF (Hard Thresholding Formula), SReLU (S-shaped ReLU), MReLU (Mexican hat type ReLU), and TSF (Trapezoid-shaped function) functions. Additionally, owing to the equivalence of the SquarePlus and SoftPlus functions, some classic compound AFs, such as the ELU, Swish, Mish, SoftSign, Logish, and DLU functions, can also be expressed via the MSRF method. The derivatives of the mollified versions demonstrate their smoothness properties. The proposed method extends easily to AFs composed of multiple PLFs, which will be investigated in depth in future work.

Keywords: Activation functions, piecewise linear functions, smoothness, mollified square root functions.

Mathematics Subject Classification: Primary: 65D15.
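As an illustrative sketch of the MSRF idea (the smoothing parameter $b>0$ below is our notation, not necessarily the paper's), the maximum or minimum of two PLFs can be written with an absolute value and then mollified by replacing $|z|$ with $\sqrt{z^{2}+b}$; applied to $\mathrm{ReLU}(x)=\max(x,0)$, this recovers the commonly used SquarePlus form:
\begin{equation}
\max(f,g)=\frac{f+g+|f-g|}{2},\qquad
\min(f,g)=\frac{f+g-|f-g|}{2},\qquad
|z|\approx\sqrt{z^{2}+b},
\end{equation}
\begin{equation}
\mathrm{ReLU}(x)=\max(x,0)\;\approx\;\frac{x+\sqrt{x^{2}+b}}{2}=\mathrm{SquarePlus}(x;b).
\end{equation}
For $b>0$ the mollified expression is $C^{\infty}$, and letting $b\to 0^{+}$ recovers the original piecewise linear function.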