Fairness-Aware Mixture of Experts with Interpretability Budgets

Published: 01 Jan 2023, Last Modified: 06 Aug 2024 · DS 2023 · CC BY-SA 4.0
Abstract: As artificial intelligence becomes more pervasive, explainability and the need to interpret machine learning models' behavior emerge as critical issues. Discussion is usually polarized between those who argue that interpretable models should be the rule and those who hold that the ability of non-interpretable models to capture more complex patterns warrants their use. In this paper, we argue that interpretability should not be viewed as a binary property but rather as a continuous, domain-informed notion. To this end, we leverage the well-known Mixture of Experts architecture with user-defined budgets that control the use of non-interpretable models. We extend this idea with a counterfactual fairness module to ensure the selection of consistently fair experts: FairMOE. We compare our proposal to contemporary approaches on fairness-related data sets and demonstrate that FairMOE is competitive with state-of-the-art methods on the trade-off between predictive performance and fairness, while providing competitive scalability and, most importantly, greater interpretability.
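The abstract only sketches the architecture, so a minimal, hypothetical Python illustration of the two ideas it names may help: a gate that routes inputs to a non-interpretable expert only within a user-defined interpretability budget, and a crude counterfactual-fairness check that measures how often an expert's prediction survives flipping a protected attribute. This is a sketch under assumed details (the confidence-based gate, the flip-based consistency score, and all names are illustrative), not the authors' FairMOE implementation.

```python
# Hypothetical sketch of budgeted, fairness-aware expert routing.
# Not the FairMOE implementation; gating and fairness checks are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy data; column 0 is binarized to stand in for a protected attribute.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X[:, 0] = (X[:, 0] > 0).astype(float)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two expert pools: one interpretable, one non-interpretable ("black box").
interpretable = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
black_box = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

def counterfactual_consistency(model, X, protected_col=0):
    """Fraction of rows whose prediction is unchanged when the binary
    protected attribute is flipped (a crude counterfactual-fairness proxy)."""
    X_cf = X.copy()
    X_cf[:, protected_col] = 1.0 - X_cf[:, protected_col]
    return np.mean(model.predict(X) == model.predict(X_cf))

def budgeted_moe_predict(X, budget=0.3):
    """Serve each row with the interpretable expert by default; send at most
    a `budget` fraction of rows to the black-box expert."""
    confidence = interpretable.predict_proba(X).max(axis=1)  # gate signal
    n_black_box = int(budget * len(X))
    # Spend the budget on the rows where the interpretable expert
    # is least confident.
    order = np.argsort(confidence)
    use_bb = np.zeros(len(X), dtype=bool)
    use_bb[order[:n_black_box]] = True
    preds = interpretable.predict(X)
    preds[use_bb] = black_box.predict(X[use_bb])
    return preds, use_bb.mean()

# A fairness module could screen experts on this score before routing.
for name, m in [("interpretable", interpretable), ("black_box", black_box)]:
    print(name, "counterfactual consistency:", counterfactual_consistency(m, X_te))

preds, bb_share = budgeted_moe_predict(X_te, budget=0.3)
print("accuracy:", (preds == y_te).mean(), "| black-box share:", bb_share)
```

Routing the interpretable expert's least-confident rows to the black box is one plausible way to spend the budget where the interpretable model is weakest; the paper's actual gating and fairness criteria may differ.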