Keywords: Random Matrix Theory, Critical and Robust Layers, Data-Free Methods
TL;DR: We explore whether data-free metrics are reparametrisation-invariant under the critical and robust layer phenomena and find that they lack predictive capacity in this setting.
Abstract: Data-free methods for analysing and understanding the layers of neural networks have offered many metrics for quantifying notions of ``strong'' versus ``weak'' layers, with the promise of increased interpretability. We examine the robustness and predictive power of data-free metrics under randomised control conditions across a wide range of models, datasets and architectures. Contrary to some of the literature, we find strong evidence \emph{against} the efficacy of data-free methods. We show that they are not reparametrisation-invariant even for \emph{robust} layers, that is, layers that can be reparametrised by re-initialisation or re-randomisation without affecting the accuracy of the model. Moreover, we show that data-free metrics cannot be used for the arguably simpler tasks of (i) distinguishing between robust layers and critical layers, i.e.\ layers that cannot be reparametrised without significantly degrading the accuracy of the model, or (ii) predicting whether there will be a performance difference between re-initialisation and re-randomisation. Thus, we argue that to understand neural networks, and in particular the difference between ``strong'' versus ``weak'' layers, we must adopt mechanistic and functional approaches, contrary to the traditional Random Matrix Theory perspective.
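A minimal sketch (not the authors' implementation) of the experimental setup the abstract describes, assuming a PyTorch model: a toy data-free metric computed from a layer's weights alone, plus layer re-initialisation and re-randomisation. The metric choice (stable rank), the Gaussian standard deviation, and all function names are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn

def data_free_metric(layer: nn.Module) -> float:
    """A toy data-free metric: the stable rank of the weight matrix,
    computed purely from its singular values (no data involved).
    Any weight-only quantity (e.g. an RMT-style spectral statistic)
    could be substituted here."""
    w = layer.weight.detach().flatten(1)   # (out_features, everything_else)
    s = torch.linalg.svdvals(w)
    return float(s.pow(2).sum() / s.max().pow(2))

def reinitialise_layer(model: nn.Module, layer_name: str, seed: int = 0) -> nn.Module:
    """Re-initialisation: re-draw one layer's parameters from its
    original initialisation scheme, leaving the rest of the model intact."""
    model = copy.deepcopy(model)
    layer = dict(model.named_modules())[layer_name]
    torch.manual_seed(seed)
    if hasattr(layer, "reset_parameters"):
        layer.reset_parameters()
    return model

def rerandomise_layer(model: nn.Module, layer_name: str,
                      std: float = 0.02, seed: int = 0) -> nn.Module:
    """Re-randomisation: replace one layer's weights with i.i.d. Gaussian
    noise, i.e. a reparametrisation unrelated to the initialiser."""
    model = copy.deepcopy(model)
    layer = dict(model.named_modules())[layer_name]
    torch.manual_seed(seed)
    with torch.no_grad():
        layer.weight.normal_(mean=0.0, std=std)
        if getattr(layer, "bias", None) is not None:
            layer.bias.zero_()
    return model
```

Under this setup, a layer would be called robust if validation accuracy is essentially unchanged after either intervention, and critical if accuracy degrades significantly; the paper's question is whether a data-free score such as `data_free_metric` can predict that distinction, or the gap between the two interventions, without such evaluation data.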
Primary Area: learning theory
Submission Number: 18989