Keywords: model merging, evolution, diversity, niching, LLM
TL;DR: Uses model merging as the crossover operator in an evolutionary process to evolve image classifiers from scratch and combine pre-trained LLMs.
Abstract: Model merging is a powerful technique to combine specialized knowledge of multiple machine learning models into a single unified model.
However, current methods require manually partitioning the model parameters into a fixed number of groups to be merged, which constrains the exploration of potential combinations and limits performance.
To address these limitations, we propose an evolutionary algorithm with three key features:
(1) dynamic adjustment of merging boundaries to progressively explore a broader range of parameter combinations;
(2) a diversity preservation mechanism inspired by nature, which maintains a population of diverse, high-performing models that are particularly effective for merging;
and (3) a heuristic-based \textit{mate selection} strategy to identify the most promising pairs of models for merging. Our experimental results show, for the first time, that model merging can be used to evolve models from \textit{scratch}.
Specifically, we evolve MNIST classifiers from scratch using our method, achieving performance comparable to CMA-ES at a lower computational cost.
Additionally, we use our method to merge specialized language models and obtain state-of-the-art performance.
Our code is available at https://github.com/AnonScientist/natural_niches.
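As a rough illustration of the ideas in the abstract, the sketch below shows one way merging with an adjustable boundary and a complementarity-based mate selection heuristic could look. It is not taken from the authors' repository: the function names `merge_at_boundary` and `select_mate`, the flat-list parameter representation, the single split point, and the per-example score table are all assumptions made for this example.

```python
def merge_at_boundary(parent_a, parent_b, boundary, weight):
    """Interpolate two flat parameter lists (hypothetical representation).

    Parameters before `boundary` use `weight` for parent_a; parameters from
    `boundary` onward use `1 - weight`. Because `boundary` is an argument
    rather than a fixed partition, an outer search loop can vary it.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of parameters")
    child = []
    for i, (a, b) in enumerate(zip(parent_a, parent_b)):
        w = weight if i < boundary else 1.0 - weight
        child.append(w * a + (1.0 - w) * b)
    return child


def select_mate(scores, first):
    """Pick the partner that adds the most where `first` is weak.

    scores[m][j] is model m's score on example j (assumed layout). This
    heuristic is an illustrative stand-in, not the paper's exact rule.
    """
    def complement(m):
        # Sum only the positive gains of candidate m over the first parent.
        return sum(max(0.0, sm - sf) for sm, sf in zip(scores[m], scores[first]))

    candidates = [m for m in range(len(scores)) if m != first]
    return max(candidates, key=complement)


# Usage: choose a complementary mate, then merge at a chosen boundary.
population = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
scores = [[1, 0, 0], [1, 0, 0], [0, 1, 1]]
mate = select_mate(scores, first=0)
child = merge_at_boundary(population[0], population[mate], boundary=2, weight=1.0)
```

With `weight=1.0` the merge reduces to a one-point crossover at `boundary`; intermediate weights blend the two parents on each side of the split. Preferring a mate that scores well where the first parent scores poorly is one simple way to favor pairs whose strengths are complementary.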
Submission Number: 19