Localising Failure between Representation and Readout: A Fresh-Head Probe for Parameter-Space Model Merging

TMLR Paper8964 Authors

15 May 2026 (modified: 22 May 2026) | Under review for TMLR | CC BY 4.0

Abstract: Parameter-space merging has become increasingly sophisticated about how independently fine-tuned models should be combined, but less explicit about what the resulting post-merge score is allowed to diagnose. The native end-to-end accuracy of a merged model is a valid verdict on the system delivered by the merge; it is not, by itself, evidence that failure entered through the merged representation rather than through the readout the merge also delivered. We formalise this distinction with a fresh-head probe: the merged backbone is held fixed, the union-label readout is re-estimated under matched supervision, and the native–fresh-head gap identifies the readout-recoverable component of the native shortfall. In a controlled CIFAR-100 diagnostic regime with Task Arithmetic and TIES-Merging, this component constitutes a substantial share of the native shortfall across the tested regimes, including structured Ward decompositions and random class-to-task partitions on both ViT-B/16 and ResNet-50. A separate geometry control shows that centroid-routed modular composition has a complementary boundary: it outperforms naive ensembling under structured Ward geometry on both backbones, but its advantage disappears or reverses under random class partitions. These results show that model-merging evaluation needs not only better merge operators, but a stricter evidential contract: post-merge scores should be read as delivered-model verdicts, not self-localising diagnoses of representation failure.
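The fresh-head probe described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the merged backbone's features have already been extracted as arrays, it stands in a closed-form ridge regression onto one-hot targets for whatever readout training the paper uses, and every name in it (`fit_fresh_head`, `fresh_head_probe`, `native_head`, the `ridge` value) is hypothetical. The only idea taken from the abstract is the comparison itself: keep the features fixed, re-estimate the readout, and report the gap between fresh-head and native accuracy.

```python
import numpy as np


def add_bias(feats):
    """Append a constant column so a head can carry a bias term."""
    return np.hstack([feats, np.ones((feats.shape[0], 1))])


def accuracy(logits, labels):
    """Top-1 accuracy of argmax predictions."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))


def fit_fresh_head(feats, labels, num_classes, ridge=1e-3):
    """Re-estimate a linear readout on frozen features.

    Assumption: ridge regression onto one-hot targets is used here only
    because it has a closed form; the paper may train its head differently.
    """
    X = add_bias(feats)
    Y = np.eye(num_classes)[labels]
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)  # shape: (dim + 1, num_classes)


def fresh_head_probe(train_feats, train_labels, test_feats, test_labels,
                     native_head, num_classes):
    """Compare the delivered (native) head with a freshly fitted one.

    The features come from the same frozen backbone in both cases, so any
    accuracy recovered by the fresh head is attributable to the readout.
    """
    fresh_head = fit_fresh_head(train_feats, train_labels, num_classes)
    X_test = add_bias(test_feats)
    native_acc = accuracy(X_test @ native_head, test_labels)
    fresh_acc = accuracy(X_test @ fresh_head, test_labels)
    return {
        "native_acc": native_acc,
        "fresh_acc": fresh_acc,
        "gap": fresh_acc - native_acc,  # readout-recoverable component
    }
```

In this sketch `native_head` is a `(dim + 1, num_classes)` weight matrix whose last row is the bias. A large positive `gap` would suggest that much of the native shortfall sits in the readout rather than in the representation, under the assumptions stated above.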

Submission Type: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Piyush_Rai1

Submission Number: 8964
