Keywords: single-cell multi-omics, multi-modal learning, multi-task learning
Abstract: Recent advances in measuring high-dimensional modalities, such as protein levels and DNA accessibility, at the single-cell level have created a need for frameworks that can handle multi-omics data while addressing multiple tasks. Despite these advances, most existing work remains limited to either a single-modality or a single-task perspective. A few recent studies have ventured into multi-omics and multi-task learning, but we identify an ① Optimization Conflict issue, which leads to suboptimal results when additional modalities are integrated in the single-cell domain. Furthermore, there is a ② Costly Interpretability challenge, as current approaches largely rely on expensive post-hoc methods such as SHAP. Motivated by these challenges, we introduce scMoE, a novel framework that applies Sparse Mixture-of-Experts (SMoE) to the single-cell domain by incorporating an SMoE layer into a transformer block with a cross-attention module. By design, scMoE inherently provides mechanistic interpretability, a critical requirement for understanding the underlying mechanisms in biological data. From a post-hoc perspective, we further enhance interpretability by extending concept activation vectors (CAVs) to the single-cell domain. Extensive experiments on a simulated dataset (Dyngen) and real-world multi-omics single-cell datasets, including DBiT-seq, Patch-seq, and ATAC+gene, demonstrate the effectiveness of scMoE.
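The abstract describes a transformer block that combines cross-attention over a second modality with an SMoE feed-forward layer. Below is a minimal sketch of such a block in PyTorch, assuming top-k token routing; the module names (`SparseMoE`, `CrossAttentionSMoEBlock`), expert count, top-k value, and dimensions are illustrative assumptions, not the authors' released implementation.

```python
# Sketch only: cross-attention over a second modality followed by a
# sparse (top-k routed) mixture-of-experts feed-forward layer.
# Hyperparameters and module names are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    """Top-k gated mixture of expert MLPs (hypothetical configuration)."""

    def __init__(self, d_model: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, d_model); route each token to its top-k experts.
        logits = self.gate(x)                               # (B, T, E)
        weights, idx = logits.topk(self.top_k, dim=-1)      # (B, T, k)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                               # which slots chose expert e
            if mask.any():
                tok_mask = mask.any(dim=-1)                 # (B, T) tokens routed here
                w = (weights * mask).sum(dim=-1)[tok_mask]  # per-token gate weight
                out[tok_mask] += w.unsqueeze(-1) * expert(x[tok_mask])
        return out


class CrossAttentionSMoEBlock(nn.Module):
    """Transformer block: cross-attention over another modality, then SMoE."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.moe = SparseMoE(d_model)

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # x: tokens of one modality (e.g. gene-expression embeddings);
        # context: tokens of another modality (e.g. ATAC or protein embeddings).
        attn_out, _ = self.cross_attn(self.norm1(x), context, context)
        x = x + attn_out
        x = x + self.moe(self.norm2(x))
        return x


if __name__ == "__main__":
    rna = torch.randn(2, 16, 64)    # toy RNA-modality tokens
    atac = torch.randn(2, 32, 64)   # toy ATAC-modality tokens
    block = CrossAttentionSMoEBlock()
    print(block(rna, atac).shape)   # torch.Size([2, 16, 64])
```

Per-expert routing also exposes which experts fire for which cells, which is the kind of built-in (mechanistic) interpretability the abstract refers to.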
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 21200