When and how are modular networks better?

Shreyas Malakarjun Patil; Cameron Ethan Taylor; Constantine Dovrolis

26 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0

Keywords: Neural networks, hierarchical modularity, sparsity, generalization, training efficiency

TL;DR: This paper investigates how varying degrees of knowledge about a task's hierarchical modular structure can be used to improve NN generalization and training efficiency.

Abstract: Many real-world learning tasks have an underlying hierarchical modular structure, composed of smaller sub-functions. Traditional neural networks (NNs), however, often ignore this structure, leading to inefficiencies in learning and generalization. Leveraging known structural information can enhance performance by aligning the network architecture with the task’s inherent modularity. In this work, we investigate how modular NNs can outperform traditional dense networks by systematically varying the degree of structural knowledge incorporated. We compare architectures ranging from monolithic dense NNs, which assume no prior knowledge, to hierarchically modular NNs with shared modules, which leverage sparsity, modularity, and module reusability. Our experiments demonstrate that incorporating structural knowledge, particularly through module reuse and fixed connectivity, significantly improves learning efficiency and generalization. Hierarchically modular NNs excel in data-scarce scenarios by promoting functional specialization within the modules and reducing redundancy. These findings suggest that task-specific architectural biases can lead to more efficient, interpretable, and effective learning systems.

Supplementary Material: pdf

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 8163
