Revealing the Structure of Deep Neural Networks via Convex Duality

28 Sept 2020 (modified: 05 May 2023)
ICLR 2021 Conference Blind Submission
Readers: Everyone
Keywords: Convex optimization, non-convex optimization, deep learning, convex duality, regularization, ReLU activation, linear networks
Abstract: We study regularized deep neural networks (DNNs) and introduce a convex analytic framework to characterize the structure of the hidden layers. We show that a set of optimal hidden layer weights for a norm regularized DNN training problem can be explicitly found as the extreme points of a convex set. For the special case of deep linear networks with $K$ outputs, we prove that each optimal weight matrix is rank-$K$ and aligns with the previous layers via duality. More importantly, we apply the same characterization to deep ReLU networks with whitened data and prove the same weight alignment holds. As a corollary, we prove that norm regularized deep ReLU networks yield spline interpolation for one-dimensional datasets which was previously known only for two-layer networks. Furthermore, we provide closed-form solutions for the optimal layer weights when data is rank-one or whitened. We then verify our theory via numerical experiments.
One-sentence Summary: We study norm regularized deep neural networks and develop a convex duality framework through which a set of optimal solutions to the training problem can be explicitly and analytically characterized.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=nTu-0M2JQ
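
The abstract claims that, for norm regularized deep linear networks with $K$ outputs, every optimal weight matrix is rank-$K$ and aligns with the adjacent layers. Below is a minimal numerical sketch of that claim, not the authors' code or experimental setup: it trains a small 3-layer linear network with weight decay by plain gradient descent and then inspects the singular values and the subspace overlap between consecutive layers. All layer sizes, the step size, and the alignment metric are illustrative assumptions.

```python
# Sketch: check (approximate) rank-K and layer alignment for a weight-decay
# regularized deep linear network trained by gradient descent.
# Hyperparameters and architecture are assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, d, h, K = 200, 10, 8, 2            # samples, input dim, hidden width, outputs
X = rng.standard_normal((n, d))
Y = X @ rng.standard_normal((d, K))    # planted linear targets

Ws = [rng.standard_normal((d, h)) * 0.3,
      rng.standard_normal((h, h)) * 0.3,
      rng.standard_normal((h, K)) * 0.3]
beta, lr = 1e-3, 0.05                  # weight decay and step size (assumed)

def forward(Ws):
    # Returns all intermediate activations of the linear network X W1 W2 W3.
    acts = [X]
    for W in Ws:
        acts.append(acts[-1] @ W)
    return acts

for _ in range(20000):
    acts = forward(Ws)
    G = (acts[-1] - Y) / n             # scaled residual of the squared loss
    grads = []
    # Backpropagate through the linear layers.
    for l in reversed(range(len(Ws))):
        grads.append(acts[l].T @ G + beta * Ws[l])
        G = G @ Ws[l].T
    for W, g in zip(Ws, reversed(grads)):
        W -= lr * g

for l, W in enumerate(Ws, 1):
    s = np.linalg.svd(W, compute_uv=False)
    print(f"layer {l} singular values: {np.round(s, 3)}")   # roughly K nonzero

# Alignment check: the top-K right singular vectors of W_l should span the
# same subspace as the top-K left singular vectors of W_{l+1}.
for l in range(len(Ws) - 1):
    _, _, Vt = np.linalg.svd(Ws[l])
    U, _, _ = np.linalg.svd(Ws[l + 1])
    overlap = np.linalg.norm(Vt[:K] @ U[:, :K])              # sqrt(K) if aligned
    print(f"layers {l+1}->{l+2} subspace overlap: {overlap:.3f} "
          f"(sqrt(K) = {np.sqrt(K):.3f})")
```

If the trained network matches the structure described in the abstract, each layer should show roughly K dominant singular values and the reported overlaps should be close to sqrt(K); this is only a sanity check on a toy instance, not a reproduction of the paper's experiments.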
11 Replies
