TL;DR: We present a theory to study *how* deep neural networks (with even super-constantly many layers) can perform hierarchical feature learning on tasks that are not known to be efficiently solvable by non-hierarchical methods (such as kernel methods).
Abstract: (this is a theory paper)
Deep learning is also known as hierarchical learning, where the learner $\textit{learns}$ to represent a complex target function by decomposing it into a sequence of simpler functions to reduce sample and time complexity. This paper formally analyzes how multi-layer neural networks can perform such hierarchical learning $\textit{efficiently}$ and $\textit{automatically}$ by applying stochastic gradient descent (SGD) or its variants.
On the conceptual side, we present a characterization of how certain deep (i.e., super-constantly many layers) neural networks can still be trained sample- and time-efficiently on hierarchical learning tasks, when no known existing algorithm (including layer-wise training, kernel methods, etc.) is efficient. We establish a new principle called ``backward feature correction'', where \emph{the errors in the lower-level features can be automatically corrected when training together with the higher-level layers}. We believe this is key to how deep learning performs deep (hierarchical) learning, as opposed to layer-wise learning or simulating some known non-hierarchical method.
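To make the contrast concrete, here is a minimal PyTorch sketch (not from the paper) comparing greedy layer-wise training, where lower layers are frozen after being fit once, against joint end-to-end SGD, where gradients from the higher layers keep adjusting the lower-level features. The toy hierarchical target, network architecture, and all hyperparameters below are illustrative assumptions, not the setup analyzed in the paper.

```python
# Minimal sketch (illustrative assumptions only): layer-wise training vs.
# joint end-to-end SGD on a toy hierarchical target y = g(h(x)).
import torch
import torch.nn as nn

torch.manual_seed(0)

d = 20
X = torch.randn(4096, d)
h_true = torch.tanh(X @ torch.randn(d, d) / d ** 0.5)         # "lower-level" features
y = (h_true @ torch.randn(d, 1) / d ** 0.5).squeeze(-1) ** 2   # "higher-level" composition


def make_net():
    return nn.Sequential(nn.Linear(d, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 1))


def train(net, params, steps=2000, lr=0.05):
    # Plain SGD on the square loss, updating only the given parameters.
    opt = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(net(X).squeeze(-1), y)
        loss.backward()
        opt.step()
    return loss.item()


# (a) Layer-wise: fit the first layer, then freeze it and train only the
#     upper layers; errors made in the lower-level features are never revisited.
layerwise = make_net()
train(layerwise, list(layerwise[0].parameters()))
loss_lw = train(layerwise, list(layerwise[2:].parameters()))

# (b) Joint SGD: all layers are trained together, so gradients from the upper
#     layers can keep correcting the lower-level features ("backward feature
#     correction" in spirit, not the paper's formal construction).
joint = make_net()
loss_joint = train(joint, list(joint.parameters()))

print(f"layer-wise final loss: {loss_lw:.4f} | joint SGD final loss: {loss_joint:.4f}")
```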
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Theory (eg, control theory, learning theory, algorithmic game theory)
Supplementary Material: zip