Keywords: Out-of-Distribution Generalization, Distributional Shift, Subpopulation Shift, Domain Generalization, Distributionally Robust Optimization
TL;DR: We introduce Multi-Expert DRO (MEDRO), a novel distributionally robust optimization framework with multiple experts explicitly modeling cross-environment risks to improve out-of-distribution generalization against complex distributional shifts.
Abstract: Distribution shifts between training and test data undermine the reliability of deep neural networks, challenging real-world applications across domains and subpopulations. While distributionally robust optimization (DRO) methods like GroupDRO aim to improve robustness by optimizing worst-case performance over predefined groups, their use of a single global classifier can be restrictive when facing substantial inter-environment variability. We propose Multi-Expert Distributionally Robust Optimization (MEDRO), a novel extension of GroupDRO designed to address such complex shifts. MEDRO employs a shared feature extractor with $m$ environment-specific expert classifier heads, and introduces a min-max objective over all $m^{2}$ expert-environment pairings, explicitly modeling cross-environment risks. This expanded uncertainty set captures fine-grained distributional variations that a single classifier might overlook. Empirical evaluations on a range of standard distribution shift benchmarks demonstrate that MEDRO often achieves more robust predictive performance than existing methods. Furthermore, MEDRO offers practical inference strategies, such as ensembling or gating mechanisms, for typical scenarios where environment labels are unavailable at test time. Our findings suggest that MEDRO is a promising step toward resilient and generalizable machine learning under real-world distribution shifts.
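The min-max objective described in the abstract can be sketched minimally: with an $m \times m$ matrix of empirical risks (expert head $i$ evaluated on environment $j$), the inner maximization selects the worst expert-environment pairing, and a simple logit average gives the ensembling strategy for label-free inference. This is an illustrative sketch only; `medro_worst_case_risk` and `ensemble_logits` are hypothetical names, not the authors' implementation, and the full method trains the shared feature extractor and heads against this worst case.

```python
def medro_worst_case_risk(risk_matrix):
    """Return the worst-case risk over all m^2 expert-environment
    pairings, plus the (expert, environment) index attaining it.

    risk_matrix[i][j] is the empirical risk of expert head i on
    environment j. The training loop would minimize this maximum
    (or a smoothed, GroupDRO-style reweighted version of it).
    """
    worst, idx = max(
        ((risk_matrix[i][j], (i, j))
         for i in range(len(risk_matrix))
         for j in range(len(risk_matrix[i]))),
        key=lambda t: t[0],
    )
    return worst, idx


def ensemble_logits(expert_logits):
    """Average logits across the m expert heads: a simple test-time
    ensembling strategy when environment labels are unavailable."""
    m = len(expert_logits)
    num_classes = len(expert_logits[0])
    return [sum(head[c] for head in expert_logits) / m
            for c in range(num_classes)]


# Toy example: m = 3 experts evaluated on 3 environments.
R = [[0.2, 0.9, 0.4],
     [0.5, 0.3, 0.6],
     [0.7, 0.4, 0.25]]
worst, (i, j) = medro_worst_case_risk(R)  # expert 0 on environment 1
avg = ensemble_logits([[1.0, 3.0], [2.0, 1.0], [0.0, 2.0]])
```

A gating mechanism, the other inference strategy mentioned, would replace the uniform average with learned per-input weights over the $m$ heads.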
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 21178