When and how are modular networks better?

Published: 03 Jan 2024 · Last Modified: 16 Jan 2026 · OpenReview Archive Direct Upload · CC BY 4.0
Abstract: Many real-world learning tasks have an underlying hierarchical modular structure, composed of smaller sub-functions. Traditional neural networks (NNs), however, often ignore this structure, leading to inefficiencies in learning and generalization. Leveraging known structural information can enhance performance by aligning the network architecture with the task’s inherent modularity. In this work, we investigate how modular NNs can outperform traditional dense networks by systematically varying the degree of structural knowledge incorporated. We compare architectures ranging from monolithic dense NNs, which assume no prior knowledge, to hierarchically modular NNs with shared modules, which leverage sparsity, modularity, and module reusability. Our experiments demonstrate that incorporating structural knowledge, particularly through module reuse and fixed connectivity, significantly improves learning efficiency and generalization. Hierarchically modular NNs excel in data-scarce scenarios by promoting functional specialization within the modules and reducing redundancy. These findings suggest that task-specific architectural biases can lead to more efficient, interpretable, and effective learning systems.
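To make the contrast between the compared architectures concrete, the following is a minimal sketch (not the authors' released code) of a monolithic dense network versus a hierarchically modular network that reuses a single shared sub-module over a fixed, sparse connectivity pattern. All layer sizes, class names, and the particular input split are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: dense baseline vs. modular network with a reused module.
import torch
import torch.nn as nn


class DenseNet(nn.Module):
    """Monolithic baseline: no structural prior, one fully connected stack."""

    def __init__(self, in_dim=4, hidden=64, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class ModularNet(nn.Module):
    """Hierarchically modular network: one shared sub-module is applied to each
    input pair (fixed, sparse connectivity), and a small head combines the
    sub-module outputs. Weight sharing realizes module reuse."""

    def __init__(self, pair_dim=2, module_hidden=16, module_out=4, out_dim=1):
        super().__init__()
        # A single module, reused for every input pair.
        self.shared_module = nn.Sequential(
            nn.Linear(pair_dim, module_hidden), nn.ReLU(),
            nn.Linear(module_hidden, module_out),
        )
        self.head = nn.Linear(2 * module_out, out_dim)

    def forward(self, x):
        # Fixed connectivity: split 4 input features into two pairs,
        # process each pair with the same sub-function, then combine.
        a = self.shared_module(x[:, :2])
        b = self.shared_module(x[:, 2:])
        return self.head(torch.cat([a, b], dim=-1))


if __name__ == "__main__":
    x = torch.randn(8, 4)
    print(DenseNet()(x).shape)    # torch.Size([8, 1])
    print(ModularNet()(x).shape)  # torch.Size([8, 1])
```

Under these assumed sizes, the modular variant has far fewer parameters than the dense baseline because the same sub-module weights are reused across input groups, which is the kind of architectural bias the abstract argues improves data efficiency and specialization.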