Unified Multi-modal Learning for Any Modality Combinations in Alzheimer's Disease Diagnosis

Published: 01 Jan 2024, Last Modified: 14 May 2025, MICCAI (3) 2024, CC BY-SA 4.0
Abstract: Our method addresses unified multi-modal learning in a diverse and imbalanced setting, the key features that distinguish medical modalities from the extensively studied ones. Unlike existing works that assume a fixed or maximum number of modalities, our model not only manages any missing-modality scenario but also handles new modalities and unseen combinations. We argue that the key to such an any-combination model is the proper design of alignment, which should guarantee both modality invariance across diverse inputs and effective modeling of complementarities within the unified metric space. Instead of exact cross-modal alignment, we propose to decouple these two functions into representation-level and task-level alignment, which we empirically show are both indispensable for this task. Moreover, we introduce a tunable modality-agnostic Transformer to unify the representation learning process, which significantly reduces modality-specific parameters and enhances the scalability of our model. Experiments show that the proposed method enables a single model to handle all possible combinations of the six seen modalities and two new modalities in Alzheimer's Disease diagnosis, with superior performance on longer combinations.
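The abstract itself contains no code, but the described design can be illustrated with a minimal PyTorch sketch: lightweight per-modality tokenizers feeding one shared, modality-agnostic Transformer, trained with two decoupled objectives (representation-level and task-level alignment). Everything here is hypothetical, not the authors' implementation: the names AnyCombinationModel and alignment_losses, the cosine-similarity choice for representation alignment, and the per-modality cross-entropy choice for task alignment are all assumptions for illustration.

```python
# Hypothetical sketch of the any-combination design described in the abstract;
# not the authors' code. Assumed: tabular per-modality features, cosine-based
# representation alignment, per-modality cross-entropy as task alignment.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnyCombinationModel(nn.Module):
    def __init__(self, modality_dims: dict, d_model=256, n_layers=4,
                 n_heads=8, n_classes=3):
        super().__init__()
        # Per-modality linear tokenizers: the only modality-specific parameters,
        # so adding a new modality only adds one small projection layer.
        self.tokenizers = nn.ModuleDict(
            {name: nn.Linear(dim, d_model) for name, dim in modality_dims.items()}
        )
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # One shared encoder processes whatever modalities are present.
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, inputs: dict):
        # inputs: {modality_name: (batch, feat_dim)}; any subset is allowed.
        tokens = torch.stack(
            [self.tokenizers[m](x) for m, x in inputs.items()], dim=1
        )                                            # (batch, n_present, d_model)
        z = self.encoder(tokens)                     # unified metric space
        fused_logits = self.classifier(z.mean(dim=1))
        per_mod_logits = self.classifier(z)          # (batch, n_present, n_classes)
        return z, fused_logits, per_mod_logits

def alignment_losses(z, per_mod_logits, labels):
    # Representation-level alignment (one plausible instantiation): pull each
    # subject's embeddings together across modalities, without forcing the
    # exact cross-modal match the abstract argues against.
    n = z.size(1)
    rep = sum(
        1 - F.cosine_similarity(z[:, i], z[:, j]).mean()
        for i in range(n) for j in range(i + 1, n)
    )
    # Task-level alignment: every modality's own prediction should match the
    # label, keeping complementary cues usable in the shared space.
    task = sum(F.cross_entropy(per_mod_logits[:, i], labels) for i in range(n))
    return rep, task

# Example: a subject with only two of the three known modalities available.
model = AnyCombinationModel({"mri": 128, "pet": 128, "csf": 32})
z, fused, per_mod = model({"mri": torch.randn(4, 128), "pet": torch.randn(4, 128)})
rep, task = alignment_losses(z, per_mod, torch.randint(0, 3, (4,)))
```

Under these assumptions, a batch with only MRI and PET yields two tokens while a batch with all six modalities yields six; the shared encoder, classifier, and both losses are unchanged, which is what lets one model cover every combination.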