A Language Anchor-Guided Method for Robust Noisy Domain Generalization

TMLR Paper5027 Authors

04 Jun 2025 (modified: 08 Nov 2025) | Decision pending for TMLR | Readers: Everyone | CC BY 4.0

Abstract: Real-world machine learning applications are often hindered by two critical challenges: distribution shift and label noise. Networks tend to overfit to redundant, uninformative features in the training distribution, which undermines their ability to generalize to the target domain's distribution. Noisy data exacerbates this issue by inducing additional overfitting to the noise, so that existing domain generalization methods fail to distinguish invariant features from spurious ones. To address these challenges, we propose Anchor Alignment and Adaptive Weighting (A3W), a novel sample-reweighting algorithm guided by natural language processing (NLP) anchors that seeks to extract representative features. In particular, A3W uses semantic representations derived from natural language models as a source of domain-invariant prior knowledge. We also introduce a weighted loss function that dynamically adjusts the contribution of each sample based on its distance to the corresponding NLP anchor, thereby improving the model’s resilience to noisy labels. Extensive experiments on benchmark datasets demonstrate that A3W outperforms state-of-the-art domain generalization methods, yielding significant improvements in both accuracy and robustness across various datasets and noise levels.
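The distance-based reweighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual A3W formulas, so everything below is an assumption made for illustration only: the cosine distance, the batch-level softmax with a `temperature` parameter, the helper names `cosine_distance`, `anchor_weights`, and `weighted_cross_entropy`, and the toy two-dimensional anchors standing in for language-model embeddings of class names.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def anchor_weights(features, labels, anchors, temperature=0.5):
    """Hypothetical weighting rule (not taken from the paper):
    a softmax over the batch of -distance / temperature, so samples whose
    features lie far from the anchor of their given label get less weight."""
    dists = [cosine_distance(f, anchors[y]) for f, y in zip(features, labels)]
    scores = [math.exp(-d / temperature) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


def weighted_cross_entropy(probs, labels, weights):
    """Weighted sum of per-sample negative log-likelihoods."""
    return sum(-w * math.log(p[y]) for p, y, w in zip(probs, labels, weights))


# Toy stand-ins for language-model embeddings of two class names (assumed).
anchors = {0: [1.0, 0.0], 1: [0.0, 1.0]}

# Image features; the third sample resembles class 1 but is labeled 0,
# mimicking a noisy label.
features = [[0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
labels = [0, 1, 0]

# Classifier output probabilities for each sample (made-up values).
probs = [[0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]

weights = anchor_weights(features, labels, anchors)
loss = weighted_cross_entropy(probs, labels, weights)
```

In this toy batch the third sample sits far from the anchor of its assigned label, so it receives a smaller weight than the two consistent samples and contributes less to the loss. A lower `temperature` would sharpen that contrast; how the paper actually sets or schedules the weights is not stated in the abstract.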

Submission Length: Long submission (more than 12 pages of main content)

Changes Since Last Submission: We are grateful to the reviewer for the insightful feedback. The manuscript has been revised in line with the suggestions, and the modifications are indicated in blue text in the updated submission.

Assigned Action Editor: ~Wenbing_Huang1

Submission Number: 5027
