Keywords: Circuit analysis, Interpretability tooling and software, Automated interpretability, Other
Other Keywords: Modular neural networks, Maximum entropy principle, Iterative magnitude pruning
TL;DR: We present a technique for extracting class-specific subnetworks that behave as reusable functional modules and can be combined by simply summing their weights.
Abstract: Neural networks implicitly learn class-specific functional modules. In this work, we ask: Can such modules be isolated and recombined? We introduce a method for training sparse networks that accurately classify only a designated subset of classes while remaining deliberately uncertain on all others, functioning as class-specific subnetworks. A novel KL-divergence-based loss, combined with an iterative magnitude pruning procedure, encourages confident predictions when the true class belongs to the assigned set, and uniform outputs otherwise. Across multiple datasets (MNIST, Fashion MNIST, tabular data) and architectures (shallow and deep MLPs, CNNs), we show that these subnetworks achieve high accuracy on their target classes with minimal leakage to others. When combined via weight summation, these specialized subnetworks act as functional modules of a composite model that often recovers generalist performance. We experimentally confirm that the resulting modules are mode-connected, which justifies summing their weights. Our approach offers a new pathway toward building modular, composable deep networks with interpretable functional structure.
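The abstract describes an objective that rewards confident predictions when the true class lies in the assigned subset and uniform outputs otherwise, with composition by weight summation. Below is a minimal, hypothetical PyTorch-style sketch of that idea; the function names (`class_subset_loss`, `combine_subnetworks`), the specific term weighting, and the omitted iterative magnitude pruning schedule are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def class_subset_loss(logits, targets, assigned_classes):
    """Sketch of a KL-based objective: confident (NLL) predictions when the true
    class is in the assigned subset, and outputs pushed toward the uniform
    distribution otherwise. Weighting between terms is an assumption."""
    log_probs = F.log_softmax(logits, dim=-1)
    num_classes = logits.shape[-1]

    # Mask of examples whose true class belongs to this subnetwork's assigned set.
    in_subset = torch.isin(targets, assigned_classes)

    # Confidence term: standard negative log-likelihood on assigned-class examples.
    nll = F.nll_loss(log_probs, targets, reduction="none")

    # Uncertainty term: KL(uniform || predicted) on all other examples.
    uniform = torch.full_like(log_probs, 1.0 / num_classes)
    kl_to_uniform = F.kl_div(log_probs, uniform, reduction="none").sum(dim=-1)

    return torch.where(in_subset, nll, kl_to_uniform).mean()

def combine_subnetworks(state_dicts):
    """Combine specialized (sparse) subnetworks by summing their weights,
    as described for the composite model; assumes identical architectures."""
    return {key: sum(sd[key] for sd in state_dicts) for key in state_dicts[0]}
```

A composite model could then be built by loading `combine_subnetworks([module_a.state_dict(), module_b.state_dict()])` into a fresh instance of the shared architecture; the mode-connectedness result mentioned in the abstract is what motivates treating this summation as meaningful.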
Submission Number: 266