Keywords: Loss landscapes, Mechanisms, Mode Connectivity
Abstract: With the rise of pretrained models, fine-tuning has become of central importance in deep learning. However, unlike retraining from scratch, fine-tuning can fail to qualitatively change the behavior of a pretrained network. For instance, we find in practice that naive fine-tuning does not eliminate a model's sensitivity to spurious features. To understand and address this limitation, we study the geometry of neural network loss landscapes through the lens of mode-connectivity. Our work addresses two questions about mode-connectivity: 1) Are models trained on different data distributions mode-connected? 2) Can we fine-tune a pretrained model to switch modes? We define a notion of mechanistic mode-connectivity, and find that only models that already share the same invariances (which we call "mechanistically similar") are mechanistically mode-connected. We hypothesize that this property explains the inability of naive fine-tuning methods to induce invariance to spurious features. Based on our analysis, we propose and validate a method of "mechanistic fine-tuning" called connectivity-based fine-tuning (CBFT).
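To make the mode-connectivity question concrete, one common probe (not the authors' method; a minimal illustrative sketch using a linear model and `numpy`) evaluates the loss along the straight line between two trained parameter vectors: a low "barrier" along the path is evidence that the two modes are connected.

```python
# Hypothetical sketch: probing (linear) mode connectivity between two
# trained solutions by scanning the loss along their interpolation path.
import numpy as np

def loss(w, X, y):
    # Mean squared error of a linear model; stands in for a network's loss.
    return float(np.mean((X @ w - y) ** 2))

def path_losses(w_a, w_b, X, y, steps=11):
    # Evaluate the loss at evenly spaced points w(t) = (1 - t) * w_a + t * w_b.
    ts = np.linspace(0.0, 1.0, steps)
    return [loss((1 - t) * w_a + t * w_b, X, y) for t in ts]

def barrier(losses):
    # Barrier height: highest loss on the path minus the larger endpoint loss.
    # Near zero (or negative) means the path stays in a low-loss region.
    return max(losses) - max(losses[0], losses[-1])
```

For a convex loss such as MSE, the loss restricted to the segment is convex in t, so the barrier is never positive; for deep networks the barrier can be large, which is what mode-connectivity analyses measure.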
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: General Machine Learning (ie none of the above)
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2211.08422/code)