On the Importance of Full Rank Initializations in Deep Neural Networks

Vaibhav B Sinha*; Vishwak Srinivasan*; Vineeth N B

17 May 2019 (modified: 05 May 2023). Submitted to ICML Deep Phenomena 2019. Readers: Everyone

Abstract: Several methods have been proposed over the last few years for initializing the weights of neural networks, either to converge to better solutions or to reduce the time taken to converge. Separately, recent work has connected the full-rank nature of weight matrices with the optimality of the final converged solution. In this work, we study the connection between popular initialization methods and the conditions necessary at an optimal solution, using deep linear networks with the squared loss. Through this connection, we attempt to provide a new explanation of why these initialization methods work well in practice.
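As a rough illustration of the full-rank property the abstract refers to, the sketch below samples weight matrices for a small deep linear network and checks their rank numerically. It is not the authors' experiment. The abstract does not name the initialization methods, so Glorot uniform, He normal, and orthogonal initialization are assumed here as stand-ins for "popular initialization methods"; the layer sizes, the helper names, and the constant initialization used as a rank-deficient contrast are likewise hypothetical.

```python
import numpy as np


def glorot_uniform(fan_in, fan_out, rng):
    # Glorot/Xavier uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))


def he_normal(fan_in, fan_out, rng):
    # He normal: N(0, 2 / fan_in).
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))


def orthogonal(fan_in, fan_out, rng):
    # Orthonormal columns (or rows) from the reduced QR of a Gaussian matrix.
    a = rng.normal(size=(max(fan_in, fan_out), min(fan_in, fan_out)))
    q, _ = np.linalg.qr(a)  # q has shape (max, min)
    return q if fan_out >= fan_in else q.T


def constant(fan_in, fan_out, rng):
    # Contrast case (assumption for illustration): every entry equal, so rank is 1.
    return np.full((fan_out, fan_in), 0.1)


def is_full_rank(w):
    # Full rank means rank equals the smaller of the two dimensions.
    return np.linalg.matrix_rank(w) == min(w.shape)


rng = np.random.default_rng(0)
layer_sizes = [64, 32, 32, 10]  # hypothetical widths of a deep linear network

results = {}
for init in (glorot_uniform, he_normal, orthogonal, constant):
    weights = [
        init(n_in, n_out, rng)
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]
    results[init.__name__] = all(is_full_rank(w) for w in weights)

for name, full_rank in results.items():
    print(f"{name}: all layers full rank = {full_rank}")
```

Matrices drawn from a continuous distribution are full rank with probability one, which is why the three random schemes pass this check while the constant initialization does not.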
