Are All Layers Created Equal?

Chiyuan Zhang, Samy Bengio, Yoram Singer

May 17, 2019, ICML 2019 Workshop Deep Phenomena (Blind Submission)
  • Abstract: Understanding learning with deep architectures has been a major research objective in recent years, with notable theoretical progress. A main focal point of those studies is the success of excessively large networks. We empirically study the layer-wise functional structure of overparameterized deep models and provide evidence for the heterogeneous characteristics of layers. To do so, we introduce the notion of (post-training) re-initialization and re-randomization robustness. We show that layers can be categorized as either "robust" or "critical". In contrast to critical layers, resetting the robust layers to their initial values has no negative consequence, and in many cases they barely change throughout training. Our study provides evidence that flatness or robustness analysis of the model parameters needs to respect the network architecture.
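The two robustness probes named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the parameter layout (a dict mapping layer names to flat lists of floats), the helper names `reinitialize_layer`, `rerandomize_layer`, and `layer_robustness`, and the toy `evaluate` function are all assumptions made for illustration. Re-initialization is read here as resetting one layer to its saved initial values, and re-randomization as overwriting one layer with fresh random draws, with every other layer kept at its trained values.

```python
import copy
import random


def reinitialize_layer(trained, initial, layer):
    """Return a copy of `trained` with `layer` reset to its saved initial values."""
    params = copy.deepcopy(trained)
    params[layer] = copy.deepcopy(initial[layer])
    return params


def rerandomize_layer(trained, layer, sampler, seed=0):
    """Return a copy of `trained` with `layer` overwritten by fresh random draws.

    `sampler` is a hypothetical callable taking a random.Random instance and
    returning one float; it stands in for the initialization distribution.
    """
    rng = random.Random(seed)
    params = copy.deepcopy(trained)
    params[layer] = [sampler(rng) for _ in trained[layer]]
    return params


def layer_robustness(trained, initial, evaluate):
    """Change in error when each layer, one at a time, is reset to initialization."""
    baseline = evaluate(trained)
    return {
        layer: evaluate(reinitialize_layer(trained, initial, layer)) - baseline
        for layer in trained
    }


# Toy stand-in for a trained network (assumption): the error depends only on
# "fc1", so "fc1" plays the role of a critical layer and "fc2" a robust one.
initial = {"fc1": [0.0], "fc2": [0.5]}
trained = {"fc1": [1.0], "fc2": [0.5001]}


def evaluate(params):
    """Hypothetical test-error function for the toy model."""
    return abs(params["fc1"][0] - 1.0)


deltas = layer_robustness(trained, initial, evaluate)
randomized = rerandomize_layer(trained, "fc2", lambda rng: rng.gauss(0.0, 1.0))
```

In this toy setup, resetting `fc1` raises the error while resetting `fc2` leaves it unchanged, which mirrors the critical versus robust distinction. With a real model, `evaluate` would compute test error and the layer values would be the network's weight tensors.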