Adaptive Vision-Language Prompt Learners for Learning with Noisy Labels
Abstract: Training deep learning models requires large volumes of diverse data, typically labeled by humans. However, large-scale data labeling often introduces label noise, which degrades the performance of deep neural networks. Recently, models pre-trained on extensive multi-modal data have shown remarkable performance on computer vision tasks, but their use for learning with noisy labels is still in its infancy due to high computational complexity and training costs. In this work, we propose a novel approach, AVL-Prompter, to effectively leverage pre-trained vision-language (V-L) models for learning with noisy labels. The key idea of our method is the use of shared deep learnable prompts, which allow us to efficiently adapt large V-L models to the downstream task of learning with noisy labels. Our technique outperforms state-of-the-art methods on several datasets with both synthetic and real label noise. Our contribution is a novel, simple, yet highly efficient methodological path to learning with noisy labels that remains straightforward to implement.
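The abstract names shared deep learnable prompts as the key idea but gives no implementation detail. Below is a minimal PyTorch sketch of one plausible form of such prompting, assuming a frozen CLIP-like backbone whose transformer layers accept extra prompt tokens; all names, dimensions, and the text-to-vision coupling (`SharedDeepPrompts`, `text_to_vision`) are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn


class SharedDeepPrompts(nn.Module):
    """Hypothetical sketch: learnable prompt tokens injected at several
    transformer layers, shared between the text and vision branches via a
    linear projection. Dimensions mimic a CLIP ViT-B/16-style backbone."""

    def __init__(self, num_layers=12, num_prompts=4, text_dim=512, vision_dim=768):
        super().__init__()
        # One set of prompt tokens per prompted layer, learned in text space.
        self.text_prompts = nn.ParameterList(
            [nn.Parameter(torch.randn(num_prompts, text_dim) * 0.02)
             for _ in range(num_layers)]
        )
        # A shared projection maps each text-space prompt into the vision
        # encoder's space, coupling the two branches.
        self.text_to_vision = nn.Linear(text_dim, vision_dim)

    def prompts_for_layer(self, layer_idx):
        t = self.text_prompts[layer_idx]   # (num_prompts, text_dim)
        v = self.text_to_vision(t)         # (num_prompts, vision_dim)
        return t, v


def inject(tokens, prompts):
    """Prepend prompt tokens to a batch of sequence tokens."""
    batch = tokens.shape[0]
    p = prompts.unsqueeze(0).expand(batch, -1, -1)
    return torch.cat([p, tokens], dim=1)


# Usage: at each prompted layer of the frozen backbone, fresh prompt tokens
# are prepended to both modalities; only the prompt learner is trained.
learner = SharedDeepPrompts()
text_tokens = torch.randn(8, 77, 512)     # dummy text token sequence
vision_tokens = torch.randn(8, 197, 768)  # dummy patch token sequence
t_p, v_p = learner.prompts_for_layer(0)
text_in = inject(text_tokens, t_p)        # (8, 81, 512)
vision_in = inject(vision_tokens, v_p)    # (8, 201, 768)
```

Because only the prompt parameters and the small projection are optimized while the V-L backbone stays frozen, the trainable parameter count stays tiny, which is consistent with the abstract's emphasis on low training cost.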