Iterative Magnitude Pruning Reduces Weight-Space Coupling

Published: 24 May 2026, Last Modified: 28 May 2026
Venue: ICML 2026 Workshop WSS Poster
License: CC BY 4.0
Keywords: Lottery Ticket Hypothesis; Iterative Magnitude Pruning; Fisher Information Matrix; Weight-Space Geometry; Natural Gradient
TL;DR: Iterative magnitude pruning selects sparse subnetworks with less coupled, more coordinate-separable Fisher geometry than size-matched unpruned models.
Abstract: Neural networks contain many redundant and approximately equivalent parameterizations, yet it remains unclear how pruning affects this weight-space structure. We study iterative magnitude pruning through Fisher geometry and introduce a per-dimension log-determinant ratio that measures parameter coupling. Across several vision architectures and datasets, pruning consistently reduces this coupling over broad sparsity ranges. Controlled comparisons with smaller unpruned networks suggest that the effect is not explained by parameter count alone. We further show that as Fisher coupling decreases, Adam updates become more aligned with natural-gradient updates, whereas SGD updates do not. These results suggest that winning tickets are not only sparse and trainable, but also occupy locally simpler, more coordinate-separable regions of weight space.
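The abstract does not spell out the per-dimension log-determinant ratio, so the sketch below only illustrates one plausible form, assumed here rather than taken from the paper: (log det diag(F) - log det F) / d for a d x d Fisher matrix F. By Hadamard's inequality this quantity is non-negative and equals zero exactly when F is diagonal, that is, when parameters are uncoupled. The function names, the damping term, and the simplified "Adam-like" step (no momentum or bias correction) are all hypothetical choices made for illustration.

```python
import numpy as np

def empirical_fisher(per_sample_grads, damping=1e-6):
    """Empirical Fisher from per-sample gradients of shape (n, d).
    The damping term is an assumption, added to keep the matrix positive definite."""
    g = np.asarray(per_sample_grads, dtype=float)
    n, d = g.shape
    return g.T @ g / n + damping * np.eye(d)

def coupling_per_dim(fisher):
    """Assumed coupling measure: (log det diag(F) - log det F) / d.
    Zero for a diagonal F, positive when off-diagonal terms couple parameters."""
    d = fisher.shape[0]
    sign, logdet_full = np.linalg.slogdet(fisher)
    if sign <= 0:
        raise ValueError("Fisher matrix must be positive definite")
    logdet_diag = np.sum(np.log(np.diag(fisher)))
    return float((logdet_diag - logdet_full) / d)

def natural_gradient_step(grad, fisher):
    """Natural-gradient direction F^{-1} g."""
    return np.linalg.solve(fisher, grad)

def adam_like_step(grad, fisher, eps=1e-8):
    """Simplified Adam-style direction g / (sqrt(diag(F)) + eps).
    Ignores momentum and bias correction; an illustrative stand-in only."""
    return grad / (np.sqrt(np.diag(fisher)) + eps)

def cosine(u, v):
    """Cosine similarity between two update directions."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(size=(512, 8)) @ rng.normal(size=(8, 8))  # correlated toy gradients
    F = empirical_fisher(grads)
    g = grads.mean(axis=0)
    print("coupling per dimension:", coupling_per_dim(F))
    print("Adam-like vs natural-gradient cosine:",
          cosine(adam_like_step(g, F), natural_gradient_step(g, F)))
```

On toy data like this, the coupling value and the cosine can be tracked side by side across pruning rounds; for real networks the full Fisher is usually too large to form, so a block-wise or sampled approximation would be needed.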
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 45