Structure-informed Risk Minimization for Robust Ensemble Learning

Published: 01 May 2025, Last Modified: 18 Jun 2025, Venue: ICML 2025 poster, License: CC BY 4.0
TL;DR: We leverage a distributional graph to learn robust ensemble weights for out-of-distribution generalization.
Abstract: Ensemble learning is a powerful approach for improving generalization under distribution shifts, but its effectiveness heavily depends on how individual models are combined. Existing methods often optimize ensemble weights based on validation data, which may not represent unseen test distributions, leading to suboptimal performance in out-of-distribution (OoD) settings. Inspired by Distributionally Robust Optimization (DRO), we propose Structure-informed Risk Minimization (SRM), a principled framework that learns robust ensemble weights without access to test data. Unlike standard DRO, which defines uncertainty sets based on divergence metrics alone, SRM incorporates structural information of training distributions, ensuring that the uncertainty set aligns with plausible real-world shifts. This approach mitigates the over-pessimism of traditional worst-case optimization while maintaining robustness. We introduce a computationally efficient optimization algorithm with theoretical guarantees and demonstrate that SRM achieves superior OoD generalization compared to existing ensemble combination strategies across diverse benchmarks. Code is available at: https://github.com/deep-real/SRM.
Lay Summary: Machine learning models often fail when deployed in environments different from their training conditions. A common solution is ensemble learning: combining predictions from multiple models, much like getting several doctors' opinions. However, determining how much weight to give each model's prediction is challenging, as current methods rely on validation data that may not reflect real deployment conditions. We introduce Structure-informed Risk Minimization (SRM), which learns robust ensemble weights by preparing for realistic worst-case scenarios rather than relying on potentially mismatched validation data. Unlike traditional approaches that consider any possible worst case (overly pessimistic), SRM uses knowledge about how different data sources relate to each other to focus on plausible scenarios.
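To make the worst-case idea concrete, the following is a minimal sketch of the standard DRO-style baseline that the abstract contrasts SRM with. It is not the SRM algorithm. It assumes a hypothetical matrix `domain_losses[e, m]` holding the average loss of model `m` on training domain `e`, approximates the ensemble loss on each domain as linear in the weights (for a convex loss this upper-bounds the loss of the weighted prediction), and uses the full simplex over training domains as the uncertainty set, with no structural information. The function names, step size, and toy numbers are illustrative assumptions.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def robust_ensemble_weights(domain_losses, steps=20000, lr=0.01):
    """Approximately solve min_w max_q sum_e q_e * (domain_losses @ w)_e.

    w: ensemble weights over models (on the simplex).
    q: adversarial weights over training domains (on the simplex).
    Assumption: per-domain ensemble loss is linear in w.
    """
    losses = np.asarray(domain_losses, dtype=float)
    n_domains, n_models = losses.shape
    w = np.full(n_models, 1.0 / n_models)
    q = np.full(n_domains, 1.0 / n_domains)
    w_sum = np.zeros(n_models)
    for _ in range(steps):
        w_sum += w                       # accumulate iterates for averaging
        grad_w = q @ losses              # gradient of the objective w.r.t. w
        per_domain = losses @ w          # current loss on each domain
        w = project_simplex(w - lr * grad_w)   # projected descent step on w
        q = q * np.exp(lr * per_domain)        # exponentiated ascent step on q
        q /= q.sum()
    # Averaged iterates converge for this bilinear min-max problem.
    return w_sum / steps


# Toy example (made-up numbers): 2 training domains, 2 models.
toy_losses = np.array([[0.2, 0.8],
                       [0.6, 0.3]])
w = robust_ensemble_weights(toy_losses)
worst_case = (toy_losses @ w).max()
print("weights:", w, "worst-case domain loss:", worst_case)
```

On this toy input the returned weights should have a lower worst-case domain loss than uniform averaging or either single model. Per the abstract, SRM differs by restricting the uncertainty set using structural information about the training distributions, which this sketch does not attempt to reproduce; see the linked repository for the authors' implementation.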
Link To Code: https://github.com/deep-real/SRM
Primary Area: Deep Learning->Robustness
Keywords: Out-of-distribution Generalization
Submission Number: 814