Newton Residual Learning

Grigorios Chrysos; Jiankang Deng; Yannis Panagakis; Stefanos Zafeiriou

Newton Residual Learning

Grigorios Chrysos, Jiankang Deng, Yannis Panagakis, Stefanos Zafeiriou

25 Sept 2019 (modified: 05 May 2023)ICLR 2020 Conference Withdrawn SubmissionReaders: Everyone

TL;DR: We demonstrate how residual blocks can be viewed as Gauss-Newton steps; we propose a new residual block that exploits second order information.

Abstract: A plethora of computer vision tasks, such as optical flow and image alignment, can be formulated as non-linear optimization problems. Before the resurgence of deep learning, the dominant family for solving such optimization problems was numerical optimization, e.g, Gauss-Newton (GN). More recently, several attempts were made to formulate learnable GN steps as cascade regression architectures. In this paper, we investigate recent machine learning architectures, such as deep neural networks with residual connections, under the above perspective. To this end, we first demonstrate how residual blocks (when considered as discretization of ODEs) can be viewed as GN steps. Then, we go a step further and propose a new residual block, that is reminiscent of Newton's method in numerical optimization and exhibits faster convergence. We thoroughly evaluate the proposed Newton-ResNet by conducting experiments on image and speech classification and image generation, using 4 datasets. All the experiments demonstrate that Newton-ResNet requires less parameters to achieve the same performance with the original ResNet.

Keywords: Residual learning, Resnet, Newton

Original Pdf: pdf

6 Replies

Loading