S-System, Geometry, Learning, and Optimization: A Theory of Neural Networks

27 Sept 2018 (modified: 05 May 2023) · ICLR 2019 Conference Blind Submission
Abstract: We present a formal measure-theoretic theory of neural networks (NNs) built on {\it probability coupling theory}. In particular, we present an algorithmic framework, the Hierarchical Measure Group and Approximate System (HMGAS), nicknamed S-System, of which NNs are special cases. Among many other results, the framework enables us to prove that 1) NNs implement the {\it renormalization group (RG)} via information geometry, which identifies the large-scale property being renormalized as the dual Bregman divergence and completes the analogy between NNs and RG; and 2) under a set of {\it realistic} boundedness and diversity conditions, for {\it large-size nonlinear deep} NNs with a class of losses, including the hinge loss, all local minima are global minima with zero loss, a result proved using random matrix theory.
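For context, the {\it dual Bregman divergence} invoked above has a standard meaning in information geometry; the following is the textbook definition as background, not the paper's S-System-specific construction. For a strictly convex, differentiable generator $F$, the Bregman divergence is
\[
  D_F(p, q) \;=\; F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle,
\]
and the dual divergence is the Bregman divergence of the convex conjugate $F^*(y) = \sup_x \{\langle x, y \rangle - F(x)\}$, which swaps the arguments under the Legendre transform $y = \nabla F(x)$:
\[
  D_{F^*}\!\big(\nabla F(q),\, \nabla F(p)\big) \;=\; D_F(p, q).
\]
For example, taking $F$ to be the negative Shannon entropy recovers the KL divergence as a special case.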
Keywords: neural network theory, probability measure theory, probability coupling theory, S-System, optimization, random matrix, renormalization group, information geometry, coarse graining, hierarchy, activation function, symmetry
TL;DR: We present a formal measure-theoretic theory of neural networks (NNs) that quantitatively shows that NNs renormalize semantic differences, and that under practical conditions large-size deep nonlinear NNs can optimize their objective functions to zero loss.