Keywords: Single Domain Generalization, Modality-agnostic Augmentation
TL;DR: We propose a stochastic latent noise perturbation module with MMD constraints that achieves semantics-preserving augmentation and improves single domain generalization across vision and speech.
Abstract: Single domain generalization (SDG) is challenging because models trained on a single domain often suffer from out-of-distribution (OOD) shifts at inference time. Existing augmentation techniques often sacrifice semantic consistency for diversity or vice versa, and are largely confined to vision tasks. We propose a Stochastic Latent Noise Perturbation Module (SLNP) that automatically computes multiple MMD thresholds from the source domain's intra- and inter-class statistics, and then maximizes the magnitude of the injected noise under these adaptive bounds. This unified objective generates diverse yet semantically faithful samples and operates independently of the downstream training loop, requiring neither adversarial training nor auxiliary loss terms. In addition, SLNP complements normalization methods, yielding synergistic improvements when the two are combined. Furthermore, our method is modality-agnostic and applicable to any distribution-based data. Experiments on image benchmarks demonstrate that our approach integrates easily into existing pipelines and improves state-of-the-art SDG baselines, and additional results on speech data show its applicability beyond the vision domain.
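The core idea in the abstract, perturbing latent features as strongly as possible while an MMD bound derived from source-domain statistics keeps the augmented samples semantically close, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`mmd2_rbf`, `perturb_under_mmd`), the RBF kernel, the biased MMD estimator, and the shrink-until-feasible search are all assumptions made for the sketch.

```python
import numpy as np

def mmd2_rbf(x, y, gamma=1.0):
    """Biased squared MMD between sample sets x and y with an RBF kernel."""
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def perturb_under_mmd(z, threshold, sigma=1.0, shrink=0.5, max_tries=20, rng=None):
    """Add Gaussian noise to latents z, starting large and shrinking its scale
    until MMD^2(z, z + noise) falls below the given threshold (a crude stand-in
    for 'maximize noise under an adaptive MMD bound')."""
    rng = rng or np.random.default_rng(0)
    noise = rng.normal(scale=sigma, size=z.shape)
    for _ in range(max_tries):
        if mmd2_rbf(z, z + noise) <= threshold:
            return z + noise
        noise *= shrink  # reduce perturbation until the bound is satisfied
    return z  # fall back to the unperturbed latents

# Toy usage: derive the threshold from intra-class spread of the source latents.
rng = np.random.default_rng(0)
z = rng.normal(size=(32, 8))              # stand-in source-domain latents
threshold = mmd2_rbf(z[:16], z[16:])      # intra-class MMD as an adaptive bound
z_aug = perturb_under_mmd(z, threshold, sigma=2.0, rng=rng)
```

The sketch uses a single intra-class threshold for brevity; the abstract's method computes multiple thresholds (intra- and inter-class) and solves a maximization rather than a shrink-and-test search.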
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 7246