Keywords: Interpretability, deep learning, gated modular neural networks, modular neural networks, mixture of experts, model debugging, error attribution
TL;DR: Exploring the interpretability of gated modular neural networks for model debugging: what we can and cannot do.
Abstract: Monolithic deep learning models are typically neither interpretable nor easily transferable, and they require large amounts of data to train their millions of parameters. Modular neural networks (MNN), in contrast, are known to address these very issues. However, to date, research on MNN architectures has concentrated on their performance rather than their interpretability. We address this gap, focusing specifically on gated modular neural network (GMNN) architectures. Intuitively, GMNNs could be inherently more interpretable: the gate can learn an insightful problem decomposition, individual modules can learn simpler functions appropriate to that decomposition, and errors can be attributed either to the gating or to individual modules, thereby providing a gate-level or module-level diagnosis. Wouldn’t that be nice? But is this really the case? In this paper we empirically analyze what each module and gate in a GMNN learns and show that (1) GMNNs can indeed be interpretable, but (2) current GMNN architectures and training methods do not necessarily guarantee an interpretable and transferable task decomposition. Experiments are performed on a simple synthetic dataset, MNIST, and FashionMNIST to show that current architectures fail to produce good interpretable task decompositions even on simple datasets. The code for all the experiments in this paper is at: https://github.com/yamsgithub/modular_deep_learning/tree/xai_neurips_2021.
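To make the decomposition idea concrete, the following is a minimal NumPy sketch of a gated mixture of experts, the basic GMNN forward pass the abstract describes. The toy linear experts, gate, and parameter names here are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Two toy linear "experts" and a linear gate (hypothetical parameters).
W_experts = [rng.normal(size=(2, 1)), rng.normal(size=(2, 1))]
W_gate = rng.normal(size=(2, 2))

def gmnn_forward(x):
    """Gated mixture of experts: the gate produces a soft weighting
    over expert outputs, and the final prediction is their weighted sum."""
    expert_outs = np.stack([x @ W for W in W_experts], axis=1)  # (n, 2, 1)
    gate = softmax(x @ W_gate)                                  # (n, 2)
    y = (gate[..., None] * expert_outs).sum(axis=1)             # (n, 1)
    return y, gate, expert_outs

x = rng.normal(size=(4, 2))
y, gate, expert_outs = gmnn_forward(x)
```

Because the gate weights sum to one per input, an error on a given example can in principle be attributed: a wrong prediction under a confident gate points at the selected module, while a diffuse gate points at the gating itself. This is the gate-level versus module-level diagnosis the abstract refers to.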