Keywords: rescience c, machine learning, deep learning, python, pytorch
TL;DR: Does ReLU'(0) impact neural network training? We reproduce the original study and run additional analyses of the numerical influence on training stability.
Abstract: Neural networks have become ubiquitous in machine learning, and new problems and trends arise as the trade-off between theory, computational tools, and real-world problems becomes narrower and more complex. We revisit the influence of ReLU'(0) on backpropagation, since it has become common to use lower floating-point precision on GPUs so that more tasks can run in parallel and training and inference become more efficient. Contrary to what theory suggests, the original authors showed that at 16- and 32-bit precision the value chosen for ReLU'(0) may influence the result. In this work we extend some of the experiments to see how the training and test losses are affected in simple and more complex models.
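The effect summarized above can be illustrated with a minimal sketch (not the authors' code; `relu_grad` and `g0` are hypothetical names): at lower precision, small pre-activations round to exactly zero, so the arbitrary value assigned to ReLU'(0) actually enters the backward pass, whereas at higher precision the same input stays positive and the choice is invisible.

```python
import numpy as np

def relu_grad(x, g0=0.0):
    # Derivative of ReLU; g0 is the (in theory arbitrary) value used at x == 0.
    return np.where(x > 0, 1.0, np.where(x < 0, 0.0, g0))

x = 1e-8                      # tiny positive pre-activation
x32 = np.float32(x)           # representable in fp32: stays positive
x16 = np.float16(x)           # below fp16's smallest subnormal: rounds to 0

# In fp32 the gradient is 1 regardless of g0.
print(relu_grad(x32, g0=0.0), relu_grad(x32, g0=1.0))  # 1.0 1.0
# In fp16 the input collapsed to 0, so g0 decides the gradient.
print(relu_grad(x16, g0=0.0), relu_grad(x16, g0=1.0))  # 0.0 1.0
```

This is only a toy illustration of the rounding mechanism; the paper studies the cumulative effect of this choice over full training runs.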
Paper Url: https://proceedings.neurips.cc/paper/2021/file/043ab21fc5a1607b381ac3896176dac6-Paper.pdf
Paper Review Url: https://openreview.net/forum?id=urrcVI-_jRm
Paper Venue: Other venue (not in list)
Venue Name: NeurIPS 2021
Confirmation: The report PDF is generated from the provided camera-ready Google Colab script; the report metadata is verified from the camera-ready Google Colab script; the report contains correct author information; the report contains a link to the code and SWH metadata; the report follows the ReScience LaTeX style guides as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration); the report contains the Reproducibility Summary on the first page; the LaTeX .zip file is verified from the camera-ready Google Colab script.
Journal: ReScience Volume 9 Issue 2 Article 17