Meta-Merging by Checkpoint Nowcasting

Albert Manuel Orozco Camacho; Boris Knyazev; Eugene Belilovsky; Guy Wolf

Published: 10 Jun 2026, Last Modified: 17 Jun 2026

Venue: LXAI @ ICML 2026

License: CC BY 4.0

Keywords: model merging, weight-space learning, nowcasting, transfer learning

TL;DR: We repurpose a weight-space nowcasting architecture as a zero-shot model merging operator.

Abstract: Model merging, the direct combination of parameters from independently fine-tuned networks, offers a way to compose task-specific capabilities without retraining or ensemble inference. Existing merge methods are often built from hand-crafted arithmetic or sparsification heuristics, leaving open whether general learned weight-space operators can be repurposed for merging directly. We study this question with NiNo, a pre-trained checkpoint-nowcasting meta-network originally designed to predict near-future training states from short checkpoint histories. We show that pre-trained NiNo can be reused as a data-free pairwise meta-merge operator for independently fine-tuned models. On an 8-task CLIP ViT-B/16 benchmark, NiNo is competitive with strong arithmetic baselines and consistently lands in the same functional region as weight averaging, Task Arithmetic, and TIES. In a Qwen3 language extension, NiNo also achieves the best HumanEval result among the compared merge methods, although extending meta-merge beyond pairs remains an open challenge. These results position learned checkpoint nowcasting as a practical starting point for data-free model merging and motivate future weight-space learners trained explicitly for merging.
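For readers unfamiliar with the arithmetic baselines named in the abstract, the following is a minimal sketch of pairwise weight averaging and Task Arithmetic in their commonly used formulations. It is not taken from the paper and does not show NiNo or TIES. The `StateDict` container, the function names, and the scaling coefficient `lam` are assumptions made for illustration; a real pipeline would operate on framework tensors rather than Python lists.

```python
from typing import Dict, List

# Hypothetical container: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def weight_average(a: StateDict, b: StateDict) -> StateDict:
    """Element-wise mean of two fine-tuned checkpoints with identical keys."""
    return {k: [(x + y) / 2.0 for x, y in zip(a[k], b[k])] for k in a}


def task_arithmetic(base: StateDict, a: StateDict, b: StateDict,
                    lam: float = 1.0) -> StateDict:
    """Add the scaled sum of task vectors (fine-tuned minus base) to the base."""
    merged: StateDict = {}
    for k in base:
        merged[k] = [
            p0 + lam * ((pa - p0) + (pb - p0))
            for p0, pa, pb in zip(base[k], a[k], b[k])
        ]
    return merged


# Toy checkpoints (illustrative values only, not experimental data).
base = {"w": [0.0, 1.0]}
ft_a = {"w": [1.0, 1.0]}
ft_b = {"w": [0.0, 3.0]}

avg = weight_average(ft_a, ft_b)
ta = task_arithmetic(base, ft_a, ft_b, lam=0.5)
print(avg, ta)
```

With two models and `lam=0.5`, Task Arithmetic reduces to plain averaging, which is why both calls give the same result on this toy input. Both operators are data-free in the sense used above: they need only the parameters, not any training or validation examples.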

Submission Category: Full Paper

Overaged Verification: Yes

Latin American Hispanic Heritage: Yes

ICML Proceedings Status: No

Submission Number: 32
