Separate Adjustment of Linear and Nonlinear Parameters in Neural Network Training

Published: 09 Mar 2025 · Last Modified: 14 Mar 2025 · MathAI 2025 Oral · CC BY 4.0
Keywords: vector-matrix analysis, backpropagation, neural networks
TL;DR: Adjusting the parameters of a neuron's nonlinear transformation separately from its connection weights improves training efficiency and approximation accuracy in neural networks.
Abstract:

The paper examines the limitations of the error backpropagation (BPE) method in neural network training, in particular its tendency to converge to suboptimal local minima. Traditional backpropagation-based training often suffers from inefficiencies in high-dimensional, complex optimization landscapes, which limits its effectiveness in deep learning applications. Based on vector-matrix analysis, the paper proposes a modified formal neuron model in which the output activity is expressed as the sum of the linear activation and its nonlinear transformation (e.g., ReLU or SoftPlus) with adjustable parameters. Unlike conventional models, which rely solely on weight optimization, the proposed approach tunes the parameters of the nonlinear transformation independently of the input connection weights, allowing more efficient exploration of the loss landscape and reducing the likelihood of convergence to local minima far from the globally optimal solution. This separation significantly improves training speed and, in particular, approximation accuracy. The model was evaluated on function approximation tasks of varying complexity in two- and three-dimensional spaces. The results demonstrate a 3–10 times reduction in training time and up to three orders of magnitude improvement in accuracy, especially for the SoftPlus activation. These findings suggest that the proposed neuron model could benefit deep learning applications requiring high precision and efficient training, such as medical imaging and autonomous systems, and they highlight the potential of vector-matrix analysis for improving neural network training methods and for further development of specialized optimization techniques.
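
The abstract describes a neuron whose output is the linear activation plus a parameterized nonlinear transform of it, with the nonlinearity parameters adjusted separately from the connection weights. The following is a minimal sketch of that idea, not the authors' implementation: the layer name, the per-unit parameters alpha and beta, the alternating two-optimizer schedule, and the toy 2-D target function are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModifiedNeuronLayer(nn.Module):
    """Output = linear activation + parameterized nonlinear transform of it
    (sketch of the modified neuron model described in the abstract)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)    # connection weights
        # Tunable per-unit parameters of the nonlinear transformation
        # (assumed form), adjusted independently of the connection weights.
        self.alpha = nn.Parameter(torch.ones(out_features))   # scale of nonlinear term
        self.beta = nn.Parameter(torch.ones(out_features))    # SoftPlus sharpness

    def forward(self, x):
        s = self.linear(x)                                     # linear activation
        return s + self.alpha * F.softplus(self.beta * s)     # add nonlinear term

# Separate adjustment: connection weights and nonlinearity parameters get
# their own optimizers, so they can be updated on independent schedules.
model = nn.Sequential(ModifiedNeuronLayer(2, 32), nn.Linear(32, 1))
weight_params = [p for n, p in model.named_parameters()
                 if "alpha" not in n and "beta" not in n]
nonlin_params = [p for n, p in model.named_parameters()
                 if "alpha" in n or "beta" in n]
opt_w = torch.optim.Adam(weight_params, lr=1e-3)
opt_n = torch.optim.Adam(nonlin_params, lr=1e-2)

# Toy 2-D function approximation task (illustrative target).
x = torch.rand(256, 2)
y = torch.sin(3 * x[:, :1]) * torch.cos(2 * x[:, 1:])

for step in range(1000):
    loss = F.mse_loss(model(x), y)
    opt_w.zero_grad(); opt_n.zero_grad()
    loss.backward()
    # Alternate updates: weights on even steps, nonlinearity parameters on odd steps.
    (opt_w if step % 2 == 0 else opt_n).step()
```

The two parameter groups could equally be updated jointly with different learning rates; the alternating schedule above is just one way to realize "separate adjustment" under the stated assumptions.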

Submission Number: 53