Abstract: Many real-world learning tasks have an underlying hierarchical and modular structure, composed of smaller sub-functions. Traditional neural networks (NNs) often disregard this structure, leading to inefficiencies in learning and generalization. Prior work has demonstrated that leveraging known structural information can enhance performance by aligning NN architectures with the task’s inherent modularity. However, the extent of prior structural knowledge required to achieve these performance improvements remains unclear. In this work, we investigate how modular NNs can outperform traditional dense NNs on tasks with simple yet known modular structure by systematically varying the degree of structural knowledge incorporated. We compare architectures ranging from monolithic dense NNs, which assume no prior knowledge, to hierarchically modular NNs with shared modules that leverage sparsity, modularity, and module reusability. Our experiments demonstrate that module reuse in modular NNs significantly improves learning efficiency and generalization. Furthermore, we find that module reuse enables modular NNs to excel in data-scarce scenarios by promoting functional specialization within modules and reducing redundancy.
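To make the architectural spectrum concrete, here is a minimal sketch, assuming PyTorch; the layer sizes, chunking scheme, and class names are illustrative assumptions, not the paper's actual configurations. It contrasts a monolithic dense NN (no structural prior) with a hierarchically modular NN that reuses one shared module across input chunks.

```python
# Minimal sketch (assumption: PyTorch; sizes and chunking are illustrative only).
import torch
import torch.nn as nn


class DenseBaseline(nn.Module):
    """Monolithic dense NN: assumes no prior knowledge of task structure."""

    def __init__(self, in_dim=8, hidden=64, out_dim=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class ModularSharedNet(nn.Module):
    """Hierarchically modular NN: a single small module is reused on every
    input chunk (module reuse), and a sparse readout aggregates the results."""

    def __init__(self, in_dim=8, chunk=2, module_hidden=16, out_dim=1):
        super().__init__()
        self.chunk = chunk
        # One shared sub-function applied to each chunk of the input.
        self.shared_module = nn.Sequential(
            nn.Linear(chunk, module_hidden), nn.ReLU(),
            nn.Linear(module_hidden, 1),
        )
        self.readout = nn.Linear(in_dim // chunk, out_dim)

    def forward(self, x):
        # Split the input into chunks and apply the same module to each one.
        parts = torch.split(x, self.chunk, dim=-1)
        features = torch.cat([self.shared_module(p) for p in parts], dim=-1)
        return self.readout(features)


x = torch.randn(32, 8)
print(DenseBaseline()(x).shape, ModularSharedNet()(x).shape)  # both torch.Size([32, 1])
```

The shared module here is the analogue of a reusable sub-function: it sees every chunk during training, which is the mechanism the abstract credits with promoting functional specialization and reducing redundancy relative to the dense baseline.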