Keywords: model merging, weight-space learning, nowcasting, transfer learning
TL;DR: We repurpose a weight-space nowcasting architecture as a zero-shot model merging operator.
Abstract: Model merging, the direct combination of parameters from independently fine-tuned networks, offers a way to compose task-specific capabilities without retraining or ensemble inference. Existing merge methods are often built from hand-crafted arithmetic or sparsification heuristics, leaving open whether general learned weight-space operators can be repurposed for merging directly. We study this question with NiNo, a pre-trained checkpoint-nowcasting meta-network originally designed to predict near-future training states from short checkpoint histories. We show that pre-trained NiNo can be reused as a data-free pairwise meta-merge operator for independently fine-tuned models. On an 8-task CLIP ViT-B/16 benchmark, NiNo is competitive with strong arithmetic baselines and consistently lands in the same functional region as weight averaging, Task Arithmetic, and TIES. Moreover, in a Qwen3 language extension, NiNo performs best on HumanEval among the compared merge methods, while extending meta-merge beyond pairs remains an open challenge. These results position learned checkpoint nowcasting as a practical starting point for data-free model merging and motivate future weight-space learners trained for merging explicitly.
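For readers unfamiliar with the arithmetic baselines named in the abstract, the following is a minimal illustrative sketch of weight averaging and Task Arithmetic on toy parameters. It is not the submission's code and does not show NiNo or TIES; the `StateDict` type, the function names, the `scale` parameter, and the toy values are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy representation: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def weight_average(models: List[StateDict]) -> StateDict:
    """Uniformly average parameters that share the same keys and lengths."""
    n = len(models)
    return {
        key: [sum(vals) / n for vals in zip(*(m[key] for m in models))]
        for key in models[0]
    }


def task_arithmetic(base: StateDict, finetuned: List[StateDict],
                    scale: float = 1.0) -> StateDict:
    """Add the scaled sum of task vectors (finetuned - base) to the base."""
    merged: StateDict = {}
    for key, base_vals in base.items():
        merged[key] = [
            b + scale * sum(ft[key][i] - b for ft in finetuned)
            for i, b in enumerate(base_vals)
        ]
    return merged


# Toy example: one shared base and two "fine-tuned" models (made-up values).
base = {"w": [0.0, 1.0]}
model_a = {"w": [1.0, 1.0]}
model_b = {"w": [0.0, 3.0]}

avg = weight_average([model_a, model_b])
ta = task_arithmetic(base, [model_a, model_b], scale=0.5)
print(avg, ta)
```

In this toy setting, Task Arithmetic with two models and `scale=0.5` coincides with plain weight averaging, since base + 0.5((a - base) + (b - base)) = 0.5(a + b).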
Submission Category: Full Paper
Overaged Verification: Yes
Latin American Hispanic Heritage: Yes
ICML Proceedings Status: No
Submission Number: 32