Approximated Orthogonal Projection Unit: Stabilizing Regression Network Training Using Natural Gradient

ShaoQi Wang; Chunjie Yang; Siwei Lou

Published: 25 Sept 2024; Last Modified: 06 Nov 2024; Venue: NeurIPS 2024 poster; License: CC BY-NC-SA 4.0

Keywords: Neural networks, network's structure design, minimum variance estimation, online learning, training stability, natural gradient, soft sensor

TL;DR: A novel network structure that efficiently approximates minimum variance estimation and computes the natural gradient, achieving stable convergence and superior performance in industrial settings.

Abstract: Neural networks (NN) are extensively studied in cutting-edge soft sensor models due to their feature extraction and function approximation capabilities. Current research into network-based methods primarily focuses on models' offline accuracy. In industrial soft sensor contexts, however, online optimization stability and interpretability are prioritized, followed by accuracy. This requires a clearer understanding of the network's training process. To bridge this gap, we propose a novel NN named the Approximated Orthogonal Projection Unit (AOPU), which has a solid mathematical basis and presents superior training stability. AOPU truncates the gradient backpropagation at dual parameters, optimizes the updates of the trackable parameters, and enhances the robustness of training. We further prove that AOPU attains minimum variance estimation in NN, wherein the truncated gradient approximates the natural gradient. Empirical results on two chemical process datasets clearly show that AOPU outperforms other models in achieving stable convergence, marking a significant advancement in the soft sensor field.
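The link the abstract draws between the natural gradient and minimum variance estimation can be illustrated on a much simpler model. The sketch below is not AOPU and is not taken from the paper. It assumes a plain linear regression with Gaussian noise, for which the Fisher information matrix is proportional to X^T X. Under that assumption, a single natural-gradient step with unit step size lands on the least-squares estimate, which is the minimum variance unbiased estimator in this setting. All names (`X`, `w_true`, `mse_gradient`, `fisher`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

# Synthetic linear regression data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)


def mse_gradient(w):
    """Gradient of the loss 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    return X.T @ (X @ w - y) / n


# Fisher information of the Gaussian-noise linear model, averaged over samples
# and taken with unit noise variance (a scaling assumption for this sketch).
fisher = X.T @ X / n

w0 = np.zeros(d)

# One natural-gradient step: precondition the gradient with the inverse Fisher matrix.
w_nat = w0 - np.linalg.solve(fisher, mse_gradient(w0))

# One ordinary gradient step with a small learning rate, for comparison.
w_gd = w0 - 0.1 * mse_gradient(w0)

# Closed-form least-squares estimate.
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

print("natural-gradient step matches least squares:", np.allclose(w_nat, w_ls))
print("plain gradient step matches least squares:", np.allclose(w_gd, w_ls))
```

In this toy case the preconditioned step removes the dependence on the conditioning of X, which is one intuition for why a natural-gradient-like update may give more stable convergence than a plain gradient update.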

Primary Area: Deep learning architectures

Submission Number: 8079
