EvoGM: Learning to Merge LLMs via Evolutionary Generative Optimization

Published: 30 Apr 2026, Last Modified: 24 Jun 2026, ICML 2026 regular, CC BY 4.0
TL;DR: We replace hand-crafted evolutionary operators in model merging with a learnable paired generative model that predicts merging coefficients, yielding better and more efficient merges.
Abstract: Evolutionary model merging provides a powerful framework for the automated, training-free composition of LLMs through parameter-space search. However, existing methods predominantly rely on stochastic, hand-crafted operators that overlook the underlying performance landscape of the coefficient space. We propose Evolutionary Generative Merging (EvoGM), a framework that transcends manual heuristics by employing learnable generative modeling to optimize merging coefficients. Specifically, EvoGM features a dual-generator architecture with cycle-consistent learning to adaptively sample and refine promising merging candidates. By constructing winner-loser pairs from historical search trajectories, our framework effectively captures high-performance parameter distributions and maximizes data efficiency. This generative process is seamlessly integrated into a multi-round evolutionary pipeline, where elite merged models iteratively serve as new expert foundations. Extensive experiments across diverse benchmarks demonstrate that EvoGM significantly outperforms state-of-the-art baselines, exhibiting robust performance on both seen and unseen tasks. Code and data are available at https://github.com/JiangTao97/evogm.
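Illustrative sketch (not taken from the paper or its repository): the abstract mentions two ingredients that are easy to picture in code, namely merging coefficients applied in parameter space and winner-loser pairs built from the search history. The Python below assumes a plain linear merge (a coefficient-weighted sum of expert parameters) and a simple rank-based pairing rule; both are assumptions made for illustration, and the names `merge_params` and `winner_loser_pairs` are hypothetical. The generative model that EvoGM trains on such pairs is not shown.

```python
from typing import Dict, List, Tuple

Coeffs = List[float]
Params = Dict[str, List[float]]  # parameter name -> flattened weights


def merge_params(experts: List[Params], coeffs: Coeffs) -> Params:
    """Assumed linear merge: coefficient-weighted sum of expert parameters."""
    if len(experts) != len(coeffs):
        raise ValueError("need exactly one coefficient per expert")
    merged: Params = {}
    for name in experts[0]:
        size = len(experts[0][name])
        merged[name] = [
            sum(c * expert[name][i] for c, expert in zip(coeffs, experts))
            for i in range(size)
        ]
    return merged


def winner_loser_pairs(
    history: List[Tuple[Coeffs, float]],
) -> List[Tuple[Coeffs, Coeffs]]:
    """Pair coefficient vectors from past evaluations as (winner, loser).

    Assumed rule: rank by score (higher is better), then pair the k-th
    entry of the better half with the k-th entry of the worse half.
    With an odd number of entries, the lowest-ranked one is left unpaired.
    """
    ranked = sorted(history, key=lambda item: item[1], reverse=True)
    half = len(ranked) // 2
    winners = ranked[:half]
    losers = ranked[half : 2 * half]
    return [(w[0], l[0]) for w, l in zip(winners, losers)]
```

For example, merging two toy experts `{"w": [1.0, 2.0]}` and `{"w": [3.0, 4.0]}` with coefficients `[0.5, 0.5]` gives `{"w": [2.0, 3.0]}`. In a search loop of this kind, each evaluated coefficient vector and its score would be appended to `history`, and the resulting pairs would serve as training data for whatever model proposes the next candidates.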
Lay Summary: Artificial intelligence models are often adapted for different jobs, such as answering knowledge questions, solving math problems, or following instructions. Training a new large model for every job is expensive, so researchers often try to combine several specialized models into one. However, deciding how to combine them is usually done by trial and error, which can waste time and miss better combinations. We introduce EvoGM, a method that learns from previous combination attempts. Instead of making random changes, it compares stronger and weaker attempts, uses that history to suggest more promising new combinations, and repeats the process over several rounds. In this way, the search becomes more guided over time. Across a range of language model tasks, EvoGM produced combined models that performed better than existing merging methods, including on tasks not directly used during the search. This suggests that smarter search methods can help build more capable and efficient AI systems without retraining them from scratch.
Originally Submitted Supplementary Material: zip
Link To Code: https://github.com/JiangTao97/evogm
Primary Area: Deep Learning->Large Language Models
Keywords: model merging, evolutionary algorithm, evolutionary generative optimization
Originally Submitted PDF: pdf
Submission Number: 1634