Understanding Permutation Based Model Merging with Feature Visualizations

Published: 10 Oct 2024, Last Modified: 04 Nov 2024
Venue: UniReps
License: CC BY 4.0
Track: Extended Abstract Track
Keywords: Deep learning, Linear mode connectivity, Convolutional neural network, Feature Visualization, Model Merging
TL;DR: We study the inner mechanisms behind linear mode connectivity in model merging through feature visualization methods.
Abstract: Linear mode connectivity (LMC) has become a topic of great interest in recent years. It has been empirically demonstrated that popular deep learning models trained from different initializations exhibit linear mode connectivity up to permutation. Based on this, several approaches for finding a permutation of a model's features or weights have been proposed, leading to several popular methods for model merging. These methods enable the simple averaging of two models to create a new high-performance model. However, beyond accuracy, the properties of these merged models and their relationship to the representations of the models they derive from are poorly understood. In this work, we study the inner mechanisms behind LMC in model merging through the lens of classic feature visualization methods. Focusing on convolutional neural networks (CNNs), we make several observations that shed light on the underlying mechanisms of model merging by permute-and-average.
Submission Number: 36
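
The permute-and-average idea described in the abstract can be illustrated with a minimal sketch on a toy two-layer ReLU network. This is not the procedure used in this work: the function names (`forward`, `permute_hidden`, `best_permutation`, `average`) are hypothetical, the alignment criterion (squared distance between first-layer weights) is an assumption chosen for simplicity, and the brute-force search over permutations is only feasible for very small hidden layers. Here model B is constructed as a hidden-unit shuffle of model A, so aligning B to A before averaging recovers A, whereas averaging without alignment generally would not.

```python
import itertools
import random


def forward(W1, W2, x):
    """Two-layer ReLU network y = W2 * relu(W1 * x), using plain lists."""
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]


def permute_hidden(W1, W2, perm):
    """Reorder hidden units: rows of W1 and the matching columns of W2."""
    W1p = [W1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, W2p


def best_permutation(W1a, W1b):
    """Brute-force the ordering of B's hidden units that is closest to A's.

    Assumption: closeness is the squared distance between first-layer rows.
    """
    def cost(perm):
        return sum(
            (a - b) ** 2
            for i, j in enumerate(perm)
            for a, b in zip(W1a[i], W1b[j])
        )
    return min(itertools.permutations(range(len(W1a))), key=cost)


def average(Wa, Wb):
    """Element-wise mean of two weight matrices of equal shape."""
    return [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(Wa, Wb)]


rng = random.Random(0)
# Model A: 3 inputs, 4 hidden units, 2 outputs.
W1a = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
W2a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(2)]

# Model B: the same function as A, with its hidden units shuffled.
shuffle = [2, 0, 3, 1]
W1b, W2b = permute_hidden(W1a, W2a, shuffle)

# Align B to A, then average the aligned weights.
perm = best_permutation(W1a, W1b)
W1b_aligned, W2b_aligned = permute_hidden(W1b, W2b, perm)
W1m, W2m = average(W1a, W1b_aligned), average(W2a, W2b_aligned)

# Averaging without alignment, for comparison.
W1n, W2n = average(W1a, W1b), average(W2a, W2b)

x = [0.5, -1.0, 2.0]
print("model A:         ", forward(W1a, W2a, x))
print("permute + average:", forward(W1m, W2m, x))
print("naive average:    ", forward(W1n, W2n, x))
```

Practical methods replace the brute-force search with an assignment solver and often match on activations rather than weights; the sketch only shows why a permutation step precedes the averaging.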