Denoising Pretrained Black-box Models via Amplitude-Guided Phase Realignment

TMLR Paper5861 Authors

10 Sept 2025 (modified: 20 Sept 2025) · Under review for TMLR · CC BY 4.0
Abstract: Pre-trained models tend to inherit noisy label information from their training datasets, internalising it as biased knowledge. While learning with label noise has been explored, existing approaches rarely address the biased knowledge that noisy labels embed in pre-trained representations. Moreover, existing denoising methods invariably rely on modifying training datasets or models to improve downstream task performance. However, both pre-trained models and their training datasets are scaling up significantly and becoming increasingly inaccessible, making such modifications ever less feasible. In this paper, we propose a black-box biased knowledge mitigation method called "Lorem", which leverages feature frequency amplitudes to guide phase correction on pre-trained representations, without access to training data or model parameters. We first present empirical evidence that, across different noise levels, the phase components of pre-trained representations are more sensitive to noisy labels than the amplitude components, while discriminative information for classification is primarily encoded in the amplitude. Moreover, we find that the impact of noisy labels on amplitude is global, leading to a gradual loss of discriminative information; corrective strategies must therefore adapt across the entire frequency spectrum rather than target only the high-frequency components. Motivated by these observations, we design a method that leverages the amplitude residual to realign phase, thereby removing biased knowledge from pre-trained representations. Experiments on a variety of popular pre-trained vision and language models suggest that, even with a simple linear classifier, our method can enhance downstream performance across a range of in-domain and out-of-domain tasks.
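The abstract describes decomposing pre-trained features into frequency amplitude and phase, then using the amplitude residual to drive a spectrum-wide phase correction. The paper's exact correction rule is not given here, so the sketch below is a hypothetical illustration: `amplitude_guided_phase_realignment`, its reference spectrum `z_ref`, and the bounded `tanh` adjustment are all assumptions, showing only the general shape of the idea (FFT split into amplitude/phase, a residual computed over all frequencies, phase realigned, signal reconstructed).

```python
import numpy as np

def amplitude_guided_phase_realignment(z, z_ref, alpha=0.5):
    """Hypothetical sketch of amplitude-guided phase realignment.

    z     : 1-D feature vector from a black-box pre-trained model.
    z_ref : a reference feature vector whose amplitude spectrum is
            treated as less corrupted (how the paper obtains this
            reference is not stated in the abstract).
    alpha : step size for the phase adjustment (assumed parameter).
    """
    Z = np.fft.fft(z)
    amp, phase = np.abs(Z), np.angle(Z)

    # Amplitude residual over the ENTIRE spectrum, reflecting the
    # observation that noise affects amplitude globally, not just
    # in the high frequencies.
    residual = np.abs(np.fft.fft(z_ref)) - amp

    # Realign phase using a bounded function of the residual;
    # amplitude itself is kept, since it carries the discriminative
    # information according to the abstract.
    phase_adjusted = phase + alpha * np.tanh(residual)

    Z_corrected = amp * np.exp(1j * phase_adjusted)
    return np.real(np.fft.ifft(Z_corrected))
```

If the reference spectrum matches the input (zero residual), the feature vector is returned unchanged, which is the sanity property any such correction should satisfy.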
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Tatsuya_Harada1
Submission Number: 5861