Global Sharpness-Aware Minimization Is Suboptimal in Domain Generalization: Towards Individual Sharpness-Aware Minimization

ICLR 2026 Conference Submission 16125 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Domain generalization, domain shift, sharpness-aware minimization
TL;DR: While SAM seeks flat minima for domain generalization, it may converge to fake flat minima by ignoring sharpness in individual domains. To address this, we propose DGSAM, a gradual and efficient domain-wise sharpness minimization method.
Abstract: Domain generalization (DG) aims to learn models that perform well on unseen target domains by training on multiple source domains. Sharpness-Aware Minimization (SAM), known for finding flat minima that improve generalization, has therefore been widely adopted in DG. However, we argue that the prevailing approach of applying SAM to the aggregated loss over all source domains is fundamentally suboptimal. This ``global sharpness'' objective can be deceptive, leading to convergence to fake flat minima where the total loss surface is flat but the underlying individual domain landscapes remain sharp. To establish a more principled objective, we analyze a worst-case risk formulation that reflects the true nature of DG. Our analysis reveals that individual sharpness provides a valid upper bound on this risk, while global sharpness does not, making individual sharpness the more theoretically grounded target for robust domain generalization. Motivated by this, we propose \textit{Decreased-overhead Gradual SAM (DGSAM)}, which applies gradual, domain-wise perturbations to effectively control individual sharpness in a computationally efficient manner. Extensive experiments demonstrate that DGSAM not only improves average accuracy but also reduces performance variance across domains, while incurring less computational overhead than SAM.
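The global-vs-individual distinction in the abstract can be illustrated with a minimal 1-D sketch. This is not the paper's DGSAM procedure: the quadratic per-domain losses, the domain centers, and the step sizes below are all illustrative assumptions. The point is only the mechanical difference between perturbing once with the aggregated gradient (global sharpness) and perturbing separately inside each domain's loss (individual sharpness).

```python
# Toy 1-D contrast between global SAM and a domain-wise variant.
# Hypothetical per-domain losses L_d(w) = 0.5 * (w - c_d)^2, so each
# domain's gradient is simply (w - c_d).

CENTERS = [1.0, -1.0, 3.0]  # made-up domain optima


def sign(x):
    """Sign of a scalar: -1, 0, or 1."""
    return (x > 0) - (x < 0)


def domain_grad(w, c):
    """Gradient of the toy domain loss 0.5 * (w - c)^2."""
    return w - c


def global_sam_step(w, rho=0.05, lr=0.1):
    """SAM on the aggregated loss: one shared adversarial perturbation."""
    g = sum(domain_grad(w, c) for c in CENTERS) / len(CENTERS)
    eps = rho * sign(g)  # normalized ascent direction (1-D case)
    g_adv = sum(domain_grad(w + eps, c) for c in CENTERS) / len(CENTERS)
    return w - lr * g_adv


def domainwise_sam_step(w, rho=0.05, lr=0.1):
    """Individual-sharpness variant: each domain gets its own perturbation."""
    g_adv = 0.0
    for c in CENTERS:
        eps = rho * sign(domain_grad(w, c))  # per-domain ascent direction
        g_adv += domain_grad(w + eps, c)     # gradient at that domain's perturbed point
    return w - lr * g_adv / len(CENTERS)
```

In the global step the single perturbation can cancel across domains with opposing gradients, which is exactly how an aggregated landscape can look flat while individual domains stay sharp; the domain-wise step probes each domain's landscape on its own.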
Supplementary Material: zip
Primary Area: optimization
Submission Number: 16125