Logarithmic Linear Units (LogLUs): A Novel Activation Function for Training Deep Neural Networks

16 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference · Desk Rejected Submission
Keywords: Activation Function, Gradient Descent, Model Generalization, Neural Networks, Deep Learning.
TL;DR: A new activation function designed for training deep neural networks, with a strong emphasis on model generalization.
Abstract: The Logarithmic Linear Unit (LogLU) introduces a novel approach to activation functions in deep neural networks. By incorporating a logarithmic term into its mathematical formulation, LogLU improves the training process, leading to faster learning and improved accuracy across diverse datasets, including numerical, image, and time-series data. Like the Rectified Linear Unit (ReLU), Leaky ReLU, and the Exponential Linear Unit (ELU), LogLU effectively tackles the vanishing gradient problem, and it also mitigates the dead neuron issue that plagues ReLU. LogLU can produce negative values, driving the mean unit activation closer to zero. This behaviour is inspired by stochastic gradient descent, which approaches the minimum rapidly under a high learning rate and then takes smaller steps as it nears the minimum. In our experiments, LogLU not only accelerates learning but also yields more generalized models than other activation functions. Its primary goal in building deep neural networks is high generalization, ensuring that training and test accuracies closely align. We evaluated LogLU's performance on three diverse datasets and obtained the following accuracies: (i) Breast Cancer, a numerical dataset (0.91); (ii) MNIST, an image dataset (0.95); (iii) Jena Climate, a time-series dataset (0.99). The results demonstrate that LogLU outperforms the compared activation functions in terms of learning characteristics. It represents a significant advancement in deep learning, offering researchers and practitioners a powerful tool to enhance neural network performance and generalization.
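The abstract does not state LogLU's closed form. As an illustration only, the sketch below assumes a piecewise definition that matches the described properties: identity for non-negative inputs and a logarithmic branch for negative inputs that yields negative outputs with a non-zero gradient. The function names `loglu` and `loglu_grad` and the exact negative branch are assumptions and may differ from the authors' actual equation.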
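```python
import numpy as np

def loglu(x):
    """Illustrative LogLU-style activation (assumed form, not the authors' exact equation).

    Identity for x >= 0; -log(1 - x) for x < 0, which is negative and pushes the
    mean unit activation toward zero while keeping a non-zero gradient (no dead neurons).
    """
    x = np.asarray(x, dtype=float)
    neg = np.minimum(x, 0.0)  # clamp so the log branch never sees an invalid argument
    return np.where(x >= 0.0, x, -np.log1p(-neg))

def loglu_grad(x):
    """Derivative of the assumed form: 1 for x >= 0, 1 / (1 - x) for x < 0."""
    x = np.asarray(x, dtype=float)
    neg = np.minimum(x, 0.0)
    return np.where(x >= 0.0, 1.0, 1.0 / (1.0 - neg))

if __name__ == "__main__":
    xs = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
    print(loglu(xs))       # negative inputs map to small negative outputs
    print(loglu_grad(xs))  # gradient stays positive everywhere
```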
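Under this assumed form the function is continuous at zero with gradient 1, and the negative branch's gradient 1/(1 - x) never vanishes, which is consistent with the abstract's claims about negative activations and avoiding dead neurons; the paper's actual formulation may differ.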
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 643