Policy gradient-based optimal subset selection for few-shot vision-language learning

Muhammad Khizer Ali, Manoranjan Paul, Anwaar Ulhaq, Muhammad Haris Khan, Quazi Mamun

Published: 01 Jan 2025, Last Modified: 27 Feb 2026
Venue: 2025 IEEE International Conference on Image Processing, ICIP 2025 - Proceedings
License: CC BY-SA 4.0

Abstract: Vision-language models (VLMs) like Contrastive Language-Image Pre-training (CLIP) have been extensively adapted for few-shot classification. Most few-shot methods rely on samples selected at random from the dataset. However, because only a few samples are used, the selection process can significantly affect the performance of the downstream classification task. In this work, we propose a reinforcement learning-based policy gradient technique that employs a diversity- and informativeness-based reward function to optimise the sample selection process. We evaluate various sample selection techniques based on downstream classification accuracy across three benchmark datasets, where the proposed method demonstrates promising results.
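The abstract does not spell out the policy or the reward, so the following is only a minimal sketch of how a policy gradient (REINFORCE) subset selector of this general kind could look, not the authors' implementation. Everything specific here is an assumption made for illustration: the policy is one learnable logit per candidate sample, subsets are drawn without replacement, diversity is taken as the mean pairwise cosine distance between feature vectors (for example, CLIP image embeddings), informativeness is taken as the mean predictive entropy, and the names `reward`, `sample_subset`, `select_subset`, and the weight `lam` are hypothetical.

```python
import numpy as np

def reward(idx, feats, probs, lam=0.5):
    """Hypothetical reward: diversity plus lam * informativeness (needs k >= 2)."""
    f = feats[idx]                      # assumed L2-normalised features
    k = len(idx)
    sim = f @ f.T                       # pairwise cosine similarity
    # Diversity: mean cosine distance over off-diagonal pairs.
    diversity = (1.0 - sim)[~np.eye(k, dtype=bool)].mean()
    # Informativeness: mean entropy of the per-sample class probabilities.
    p = probs[idx]
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1).mean()
    return diversity + lam * entropy

def sample_subset(theta, k, rng):
    """Draw k distinct indices one at a time; return them and grad of log-prob."""
    n = theta.shape[0]
    mask = np.ones(n, dtype=bool)       # True = still available
    grad = np.zeros(n)
    chosen = []
    for _ in range(k):
        logits = np.where(mask, theta, -np.inf)
        logits = logits - logits[mask].max()   # numerical stability
        p = np.exp(logits)
        p /= p.sum()                    # softmax over the remaining candidates
        i = rng.choice(n, p=p)
        grad -= p                       # d/dtheta of -log(sum of exp) term
        grad[i] += 1.0                  # d/dtheta of the chosen logit
        mask[i] = False
        chosen.append(i)
    return np.array(chosen), grad

def select_subset(feats, probs, k, steps=300, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline over per-sample logits."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(feats.shape[0])
    baseline = 0.0
    for _ in range(steps):
        idx, grad = sample_subset(theta, k, rng)
        r = reward(idx, feats, probs)
        theta += lr * (r - baseline) * grad    # policy gradient ascent step
        baseline = 0.9 * baseline + 0.1 * r
    # Final subset: the k samples with the highest learned logits.
    return np.argsort(-theta)[:k], theta

if __name__ == "__main__":
    # Synthetic stand-ins for embeddings and zero-shot class probabilities.
    rng = np.random.default_rng(1)
    feats = rng.normal(size=(50, 16))
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    probs = rng.dirichlet(np.ones(5), size=50)
    selected, _ = select_subset(feats, probs, k=4)
    print("selected indices:", selected)
```

In a few-shot setting one would presumably run such a selector per class, so that each class contributes its own k shots; the abstract does not say whether selection is per class or global, so that choice is also left open here.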

External IDs: doi:10.1109/icip55913.2025.11084464