Abstract: Vision-language models (VLMs) like Contrastive Language-Image Pre-training (CLIP) have been extensively adapted for few-shot classification. Most few-shot methods rely on randomly selected samples from the dataset. However, since only a few samples are used, the sample selection process can significantly impact the performance of the downstream classification task. In this work, we propose a reinforcement learning-based policy gradient technique that employs a diversity and informativeness-based reward function to optimise the sample selection process. We evaluate various sample selection techniques based on downstream classification accuracy across three benchmark datasets, where the proposed method demonstrates promising results.
External IDs: doi:10.1109/icip55913.2025.11084464