A theoretical framework for deep and locally connected ReLU network

Yuandong Tian

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Understanding the theoretical properties of deep and locally connected nonlinear networks, such as deep convolutional neural networks (DCNNs), remains a hard problem despite their empirical success. In this paper, we propose a novel theoretical framework for such networks with ReLU nonlinearity. The framework bridges the data distribution with gradient descent rules, favors disentangled representations, and is compatible with common regularization techniques such as Batch Norm, following a novel discovery of Batch Norm's projection nature. The framework is built upon a teacher-student setting, by projecting the student's forward/backward pass onto the teacher's computational graph. We do not impose unrealistic assumptions (e.g., Gaussian inputs or independence of activations). Our framework could help facilitate theoretical analysis of many practical issues, e.g., disentangled representations in deep networks.

Keywords: theoretical analysis, deep network, optimization, disentangled representation

TL;DR: This paper presents a theoretical framework that explicitly models the data distribution for deep and locally connected ReLU networks.
