Beyond Parameter Averaging in Model Aggregation

Pol G. Recasens; Jordi Torres; Josep Lluis Berral; Søren Hauberg; Pablo Moreno-Muñoz

Published: 28 Oct 2023, Last Modified: 16 Dec 2023

Venue: FL@FM-NeurIPS’23 Poster

Student Author Indication: Yes

Keywords: self-supervised learning, Fisher merging, model aggregation

Abstract: The success of foundation models is strongly linked to scale, which has reinforced interest in federated learning. Despite the prohibitive cost of training a large language model (LLM), little attention has been paid to reusing pre-trained models in collaborative training settings. Self-supervision has also played an important role in this success, but its emphasis has been primarily on data. This paper leverages Bayesian principles to bring self-supervision into the model aggregation toolbox. It introduces self-supervised Fisher merging, a framework that merges models in parameter space without revisiting data, opening a new door in model reusability. Experimental results establish the foundation of our method on tractable linear models and highlight its potential for aggregating neural networks.
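
To make the contrast in the title concrete, the sketch below compares plain parameter averaging with a generic diagonal Fisher-weighted merge. It is an illustration under stated assumptions, not the paper's self-supervised method: the function names `average_merge` and `fisher_merge` are hypothetical, and the parameter vectors and Fisher diagonals are made-up toy values rather than estimates from any model or data.

```python
import numpy as np

def average_merge(thetas):
    """Plain parameter averaging: elementwise mean of the parameter vectors."""
    return np.mean(np.stack(thetas), axis=0)

def fisher_merge(thetas, fishers, eps=1e-8):
    """Diagonal Fisher-weighted merge (generic form, assumed here).

    Each parameter is a weighted average across models, where the weight is
    that model's diagonal Fisher information for the parameter.
    """
    thetas = np.stack(thetas)
    fishers = np.stack(fishers)
    # eps guards against division by zero when all Fisher values vanish.
    return (fishers * thetas).sum(axis=0) / (fishers.sum(axis=0) + eps)

# Toy values (hypothetical): each model is confident about a different parameter.
theta_a = np.array([1.0, 0.0])
theta_b = np.array([0.0, 1.0])
fisher_a = np.array([9.0, 1.0])
fisher_b = np.array([1.0, 9.0])

avg = average_merge([theta_a, theta_b])
merged = fisher_merge([theta_a, theta_b], [fisher_a, fisher_b])

print("average:", avg)      # both parameters pulled to the midpoint
print("fisher :", merged)   # each parameter stays close to the confident model
```

With these toy inputs, averaging places both parameters at the midpoint, while the Fisher-weighted merge keeps each parameter near the value held by the model with the larger Fisher weight for it.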

Submission Number: 24
