Mixture of causal experts: A causal perspective to build dual-level mixture-of-experts models

Published: 01 Jan 2025 · Last Modified: 31 Jul 2025 · Expert Syst. Appl. 2025 · License: CC BY-SA 4.0
Abstract: Highlights
• MoCE: a dual-level MoE with deconfounding abilities for model generalization.
• Micro: FFN-MoE causal expert enhancement; Macro: scalable deconfounding integration.
• Evaluated on VQA2.0, e-SNLI-VE, NLVR2, and ECtHR; MoCE surpasses widely used baselines.
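For orientation, below is a minimal PyTorch sketch of a generic top-k gated FFN mixture-of-experts layer, the kind of micro-level building block the highlights refer to. The class name, hyperparameters, and routing scheme are illustrative assumptions; the paper's causal expert enhancement and macro-level deconfounding integration are not specified here and are not represented in this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFNMoELayer(nn.Module):
    """Generic top-k gated FFN mixture-of-experts layer (illustrative sketch,
    not the MoCE architecture)."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # Router: scores each token against every expert.
        self.gate = nn.Linear(d_model, n_experts)
        # Each expert is a standard two-layer FFN block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token is routed to its top-k experts,
        # and expert outputs are combined with the (softmaxed) gate weights.
        scores = F.softmax(self.gate(x), dim=-1)             # (tokens, n_experts)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)  # (tokens, k)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            idx = topk_idx[:, slot]                  # chosen expert per token
            w = topk_scores[:, slot].unsqueeze(-1)   # its gate weight
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    out[mask] += w[mask] * expert(x[mask])
        return out

# Usage: route a batch of 16 token vectors through the layer.
layer = FFNMoELayer(d_model=512, d_hidden=2048)
y = layer(torch.randn(16, 512))
```

Dense per-expert looping as above is simple but slow; production MoE implementations dispatch tokens to experts in batches, which this sketch omits for clarity.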