Keywords: Maximum Entropy Principle; Iterative Magnitude Pruning; Circuits; Modular Neural Networks
TL;DR: We present a technique for extracting class-specific subnetworks that behave as reusable functional modules and can be combined by simply summing their weights.
Abstract: Neural networks implicitly learn class-specific functional modules. In this work, we ask: Can such modules be isolated and recombined? We introduce a method for training sparse networks that accurately classify only a designated subset of classes while remaining deliberately uncertain on all others, functioning as class-specific subnetworks. A novel KL-divergence-based loss trains only the functional module for the assigned set, and an iterative magnitude pruning procedure removes irrelevant weights. Across multiple datasets (MNIST, FMNIST, tabular data, CIFAR-10) and architectures (MLPs, CNNs, ResNet, VGG), we show that these subnetworks achieve high accuracy on their target classes with minimal leakage to others. When combined via weight summation or logit averaging, these specialized subnetworks act as functional modules of a composite model that often recovers generalist performance. For simpler models and datasets, we experimentally confirm that the resulting modules are mode-connected, which justifies summing their weights. Our approach offers a new pathway toward building modular, composable deep networks with interpretable functional structure.
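To make the two mechanisms in the abstract concrete, here is a minimal PyTorch sketch of one plausible reading: a KL-divergence-based loss that applies cross-entropy on the assigned classes and pushes predictions toward the uniform (maximum-entropy) distribution on all other classes, plus composition of pruned subnetworks by summing their weights. The function names (`module_loss`, `combine_by_weight_summation`), the chosen direction of the KL term, and the assumption that all subnetworks share one architecture with pruned weights set to zero are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def module_loss(logits, targets, assigned_classes):
    """Sketch of a KL-based objective for a class-specific subnetwork:
    cross-entropy on samples from the assigned classes, and a KL term that
    drives predictions toward the uniform distribution on all other samples."""
    num_classes = logits.size(1)
    assigned = torch.zeros(num_classes, dtype=torch.bool, device=logits.device)
    assigned[assigned_classes] = True
    in_module = assigned[targets]  # mask: samples whose label is in the assigned set

    log_probs = F.log_softmax(logits, dim=1)
    loss = logits.new_tensor(0.0)
    if in_module.any():
        # Standard classification loss on the designated subset of classes.
        loss = loss + F.nll_loss(log_probs[in_module], targets[in_module])
    if (~in_module).any():
        # Deliberate uncertainty: match the uniform distribution on everything else.
        uniform = torch.full_like(log_probs[~in_module], 1.0 / num_classes)
        loss = loss + F.kl_div(log_probs[~in_module], uniform, reduction="batchmean")
    return loss

def combine_by_weight_summation(subnetworks):
    """Compose pruned subnetworks by summing their parameters.
    Assumes identical architectures with pruned weights equal to zero;
    buffers such as BatchNorm statistics would need separate handling."""
    combined = {k: v.clone() for k, v in subnetworks[0].state_dict().items()}
    for net in subnetworks[1:]:
        for k, v in net.state_dict().items():
            combined[k] = combined[k] + v
    return combined
```

A composite model would then be built by loading the summed `state_dict` into a fresh instance of the shared architecture; logit averaging, the alternative mentioned in the abstract, would instead average the outputs of the individual subnetworks at inference time.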
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 18940