Improving Model Merging with Natural Niches

João Abrantes; Robert Tjarko Lange; Yujin Tang

Published: 10 Oct 2024, Last Modified: 19 Nov 2024. AFM 2024 Poster. License: CC BY 4.0

Keywords: model merging, evolution, diversity, niching, LLM

TL;DR: Uses model merging as the crossover operator in an evolutionary process to evolve image classifiers from scratch and combine pre-trained LLMs.

Abstract: Model merging is a powerful technique for combining the specialized knowledge of multiple machine learning models into a single unified model. However, current methods require manually partitioning the model parameters into a fixed number of groups to be merged, which constrains the exploration of potential combinations and limits performance. To address these limitations, we propose an evolutionary algorithm with three key features: (1) dynamic adjustment of merging boundaries to progressively explore a broader range of parameter combinations; (2) a diversity preservation mechanism inspired by nature, which maintains a population of diverse, high-performing models that are particularly effective for merging; and (3) a heuristic-based mate selection strategy to identify the most promising pairs of models for merging. Our experimental results show, for the first time, that model merging can be used to evolve models from scratch. Specifically, we evolve MNIST classifiers from scratch using our method and achieve performance comparable to CMA-ES while being computationally cheaper. Additionally, we use our method to merge specialized language models and obtain state-of-the-art performance. Our code is available at https://github.com/AnonScientist/natural_niches.
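
As a rough illustration of what "merging as a crossover operator with an adjustable boundary" could look like, the sketch below merges two flat parameter lists at a randomly sampled split point, using a different mixing weight on each side. This is an assumption-laden toy, not the authors' implementation: the function name `merge_at_boundary`, the single split point, and the uniform sampling of the weights are all hypothetical choices made for the example. The linked repository is the reference for the actual method.

```python
import random


def merge_at_boundary(parent_a, parent_b, rng):
    """Toy crossover: blend two equal-length parameter lists around a split point.

    Hypothetical sketch only. The split point and both mixing weights are
    resampled on every call, so the merge boundary is not fixed in advance.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of parameters")
    n = len(parent_a)
    if n < 2:
        raise ValueError("need at least two parameters to place a boundary")

    split = rng.randrange(1, n)      # boundary index in [1, n - 1]
    w_left = rng.random()            # weight on parent_a left of the boundary
    w_right = rng.random()           # weight on parent_a right of the boundary

    child = []
    for i, (a, b) in enumerate(zip(parent_a, parent_b)):
        w = w_left if i < split else w_right
        child.append(w * a + (1.0 - w) * b)  # convex combination of the parents
    return child, split


if __name__ == "__main__":
    rng = random.Random(0)
    child, split = merge_at_boundary([0.0] * 6, [1.0] * 6, rng)
    print(split, child)
```

Because each child parameter is a convex combination of the two parents, it always lies between the corresponding parent values. A full evolutionary loop would additionally need the diversity preservation and mate selection steps described in the abstract, which this sketch does not attempt.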

Submission Number: 19
