Difference back propagation with inverse sigmoid function

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Machine Learning, AI, Algorithm, Back Propagation
TL;DR: We propose a new back propagation algorithm that computes the backward updates from the difference of the activation function's values instead of its derivative.
Abstract: Since neural networks were first proposed, derivative-based back propagation has been the default training algorithm. However, the derivative of a non-linear function is only a first-order approximation of the difference between its function values, so propagating the difference directly, rather than the derivative, can be more precise. While back propagation has long been the workhorse of neural network training, it has become one of the bottlenecks in modern large-scale deep learning models, and with the explosion of big data and model size, even a tiny change to the algorithm can make a large difference. Here we propose a new back propagation algorithm that uses the inverse sigmoid function to calculate the difference instead of the derivative, and we verify its effectiveness on basic examples.
Primary Area: optimization
Submission Number: 15419
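The abstract does not spell out the update rule, so the sketch below shows only one plausible reading: the backward pass recovers the exact pre-activation change through the inverse sigmoid (logit) rather than through a first-order derivative estimate. Everything here, including the names `derivative_step` and `difference_step` and the interpretation of `delta_y` as a desired output change, is an illustrative assumption, not the authors' stated method.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def inverse_sigmoid(y, eps=1e-7):
    """Logit function; clip to keep the log arguments positive."""
    y = np.clip(y, eps, 1.0 - eps)
    return np.log(y / (1.0 - y))

def derivative_step(x, delta_y):
    # Standard first-order backprop estimate:
    # delta_y ~= sigma'(x) * delta_x, with sigma'(x) = y * (1 - y),
    # so delta_x ~= delta_y / sigma'(x).
    y = sigmoid(x)
    return delta_y / (y * (1.0 - y))

def difference_step(x, delta_y):
    # Difference-based alternative (assumed interpretation):
    # the exact pre-activation change that realizes delta_y is
    # delta_x = sigma^{-1}(y + delta_y) - sigma^{-1}(y), and sigma^{-1}(y) = x.
    y = sigmoid(x)
    return inverse_sigmoid(y + delta_y) - x

x = 2.0          # pre-activation of a single sigmoid unit
delta_y = -0.05  # desired change in the unit's output
print(derivative_step(x, delta_y))   # ~ -0.476 (first-order approximation)
print(difference_step(x, delta_y))   # ~ -0.409 (exact finite difference)
```

For small output changes near the sigmoid's linear region the two steps nearly agree; the gap grows in the saturated regions, where the derivative is close to zero and the first-order estimate is least reliable.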