Keywords: Generalisation; machine learning
Abstract: Large pre-trained models are increasingly important tools in machine learning. Although versatile, their knowledge is limited and often insufficient to handle domain-specific nuances. This paper introduces a novel "webly-supervised" learning approach that uses web search to enrich a pre-trained model for visual recognition. This strategy lets the model access relevant, up-to-date information as required. Our method first identifies test instances that the pre-trained model is uncertain about. We then formulate a query for Google Search to retrieve images that help resolve this uncertainty. These images serve as noisy training data for a compact classifier, with no need for additional manual labelling.
While multiple attempts at search-augmented learning have appeared in the past, this iteration of the concept benefits from recent advances in NLP and multi-modal learning. These advances let us demonstrate unique benefits in uncertainty quantification and domain-specific recognition (e.g. +15 percentage points in accuracy on the Stanford Cars and Flowers datasets). We also present extensive experiments that explore the impact of noisy retrieval and different fine-tuning strategies.
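The first two steps described in the abstract (flagging uncertain test instances, then formulating search queries for them) can be illustrated with a minimal sketch. The abstract does not state which uncertainty measure or query template is used, so the predictive-entropy criterion, the threshold value, the `"a photo of a ..."` template, and all function names below are assumptions chosen for illustration only. The retrieval and classifier-training steps are omitted.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_uncertain(prob_rows: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Indices of test instances whose predictive entropy exceeds the threshold.

    Entropy is an assumed uncertainty measure, not one confirmed by the abstract.
    """
    return [i for i, row in enumerate(prob_rows) if entropy(row) > threshold]


def build_queries(probs: Sequence[float], class_names: Sequence[str], top_k: int = 2) -> List[str]:
    """One text query per top-k candidate class (hypothetical query template)."""
    ranked = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    return [f"a photo of a {class_names[j]}" for j in ranked[:top_k]]


# Toy class probabilities, standing in for the output of a pre-trained model.
class_names = ["tulip", "rose", "daisy"]
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # ambiguous prediction, high entropy
]

uncertain = select_uncertain(predictions, threshold=0.5)
queries: Dict[int, List[str]] = {
    i: build_queries(predictions[i], class_names) for i in uncertain
}
```

In this toy example only the second instance is flagged, and queries are generated for its two most probable classes. In a full pipeline the retrieved images would then act as noisy labelled data for the compact classifier.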
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 796