Improving Model Merging with Natural Niches

Published: 10 Oct 2024, Last Modified: 24 Oct 2024 | UniReps | CC BY 4.0
Track: Proceedings Track
Keywords: model merging, evolution, diversity, niching, LLM
TL;DR: Uses model merging as the crossover operator in an evolutionary process to evolve image classifiers from scratch and combine pre-trained LLMs.
Abstract: Model merging is a powerful technique for combining the specialized knowledge of multiple machine learning models into a single unified model. However, current methods require manually partitioning the model parameters into a fixed number of groups to be merged, which constrains the exploration of potential combinations and limits performance. To address these limitations, we propose an evolutionary algorithm with three key features: (1) dynamic adjustment of merging boundaries to progressively explore a broader range of parameter combinations; (2) a diversity preservation mechanism inspired by nature, which maintains a population of diverse, high-performing models that are particularly effective for merging; and (3) a heuristic-based mate selection strategy to identify the most promising pairs of models for merging. Our experimental results show, for the first time, that model merging can be used to evolve models from scratch. Specifically, we evolve MNIST classifiers from scratch using our method and achieve performance comparable to CMA-ES at lower computational cost. Additionally, we use our method to merge specialised language models and obtain state-of-the-art performance. Our code is available at
Submission Number: 29
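
The abstract's first feature, merging across a boundary that is not fixed in advance, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `merge_at_boundary` and `random_merge`, the flat-list representation of parameters, and the specific mixing rule (a weight `mix` before the boundary and `1 - mix` after it) are all assumptions chosen for illustration. The diversity preservation and mate selection features are not covered here.

```python
import random


def merge_at_boundary(parent_a, parent_b, boundary, mix=0.5):
    """Hypothetical crossover: blend two flat parameter lists around a boundary.

    Parameters before `boundary` use weight `mix` on parent_a;
    parameters from `boundary` onward use weight `1 - mix` on parent_a.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of parameters")
    if not 0 <= boundary <= len(parent_a):
        raise ValueError("boundary out of range")
    child = []
    for i, (a, b) in enumerate(zip(parent_a, parent_b)):
        w = mix if i < boundary else 1.0 - mix
        child.append(w * a + (1.0 - w) * b)
    return child


def random_merge(parent_a, parent_b, rng):
    """Sample the boundary and mixing weight instead of fixing them by hand."""
    boundary = rng.randint(0, len(parent_a))  # inclusive on both ends
    mix = rng.random()
    return merge_at_boundary(parent_a, parent_b, boundary, mix), boundary, mix


if __name__ == "__main__":
    rng = random.Random(0)
    a = [1.0, 1.0, 1.0, 1.0]
    b = [0.0, 0.0, 0.0, 0.0]
    child, boundary, mix = random_merge(a, b, rng)
    print(boundary, mix, child)
```

Because the boundary is sampled per merge rather than set by a manual partition, repeated merges can reach parameter combinations that a fixed grouping would exclude, which is the limitation the abstract describes.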