Adaptive Mixture of Disentangled Experts for Dynamic Graphs under Distribution Shifts

ICLR 2026 Conference Submission982 Authors

Published: 26 Jan 2026, Last Modified: 26 Jan 2026, Venue: ICLR 2026, License: CC BY 4.0
Keywords: Dynamic Graph Neural Network; Out of Distribution Generalization; Mixture of Experts
TL;DR: We propose a novel adaptive mixture-of-experts framework that dynamically routes disentangled architecture experts to evolving distribution shifts on dynamic graphs.
Abstract: Dynamic graph representation learning under distribution shifts has drawn increasing attention in the research community, given its wide applicability in real-world scenarios. Existing methods typically employ a fixed-architecture design to extract invariant patterns. However, distribution shifts in dynamic graphs may themselves evolve over time, leading to suboptimal performance of fixed-architecture designs. To address this issue, we propose what is, to the best of our knowledge, the first adaptive-architecture design for handling distribution shifts that evolve over time. The proposed design introduces an adaptive mixture of architecture experts to capture invariant patterns under evolving distribution shifts, which poses three challenges: 1) how to detect and characterize evolving distribution shifts to inform architectural decisions; 2) how to dynamically route different expert architectures to handle varying distribution characteristics; 3) how to ensure that the adaptive mixture of experts effectively discovers invariant patterns. To address these challenges, we propose the Adaptive Mixture of Disentangled Experts (AdaMix) model, which adaptively routes architecture experts to varying distribution shifts and jointly learns spatio-temporal invariant patterns. Specifically, we propose a spatio-temporal distribution detector that infers evolving distribution shifts by jointly leveraging historical and current information. Building upon this, we develop a prototype-guided mixture of disentangled experts that adaptively routes experts with disentangled factors to different distribution shifts. Finally, we design a distribution-aware intervention mechanism that discovers invariant patterns based on the nodes' expert selections. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed AdaMix model significantly outperforms state-of-the-art baselines.
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 982
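
Illustrative sketch (not from the submission): the abstract does not specify how prototype-guided routing is computed, so the following is only a minimal, generic example of the idea under stated assumptions. It assumes each expert is paired with one prototype vector, that a node's detected shift is summarized as an embedding, and that routing weights are a temperature-scaled softmax over cosine similarities. The names `route`, `mix_experts`, `shift_embedding`, and `temperature` are hypothetical and do not come from AdaMix.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    na = math.sqrt(dot(a, a))
    nb = math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)

def route(shift_embedding, prototypes, temperature=0.1):
    # Assumption: one prototype per expert; weights come from a softmax
    # over temperature-scaled cosine similarities to the prototypes.
    sims = [cosine(shift_embedding, p) / temperature for p in prototypes]
    return softmax(sims)

def mix_experts(node_features, experts, weights):
    # Weighted sum of expert outputs (soft mixture of experts).
    outputs = [expert(node_features) for expert in experts]
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(dim)]

# Toy usage: two hypothetical experts, two prototypes.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
experts = [lambda x: [2.0 * v for v in x],   # stand-in for expert 0
           lambda x: [-v for v in x]]        # stand-in for expert 1
shift_embedding = [0.9, 0.1]                 # stand-in for a detector output
weights = route(shift_embedding, prototypes)
mixed = mix_experts([1.0, 2.0], experts, weights)
```

In this toy setup the shift embedding lies close to the first prototype, so the first expert receives most of the routing weight. A lower `temperature` makes the routing closer to a hard selection, while a higher one spreads weight more evenly across experts.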