Label Propagation for Zero-shot Classification with Vision-Language Models

Published: 01 Jan 2024, Last Modified: 07 Dec 2024, CVPR 2024, CC BY-SA 4.0
Abstract: Vision-Language Models (VLMs) have demonstrated impressive performance on zero-shot classification, i.e., classification when provided merely with a list of class names. In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data. We leverage the graph structure of the unlabeled data and introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification. We tailor LP to graphs containing both text and image features and further propose an efficient method for performing inductive inference based on a dual solution and a sparsification step. We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works. Code: https://github.com/vladan-stojnic/ZLaP
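
For readers unfamiliar with the label propagation backbone that ZLaP builds on, the sketch below illustrates the classic transductive formulation (in the style of Zhou et al.) on a joint graph of text and image features: the text class prototypes act as labeled seed nodes and labels diffuse to the unlabeled images. This is a minimal, hypothetical illustration only; the function names (`knn_graph`, `label_propagation`) and parameters (`k`, `alpha`) are ours, and it implements neither the geodesic-distance variant nor the dual-solution/sparsified inductive inference introduced in the paper. See the linked repository for the authors' actual implementation.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags, eye
from scipy.sparse.linalg import cg

def knn_graph(feats, k=5):
    """Symmetric kNN affinity graph from L2-normalized features (toy sketch)."""
    sims = feats @ feats.T                    # cosine similarities (dense, toy scale)
    np.fill_diagonal(sims, -np.inf)           # exclude self-loops
    nn = np.argsort(-sims, axis=1)[:, :k]     # k nearest neighbors per node
    n = feats.shape[0]
    rows = np.repeat(np.arange(n), k)
    vals = np.clip(sims[rows, nn.ravel()], 0.0, None)  # keep weights nonnegative
    W = csr_matrix((vals, (rows, nn.ravel())), shape=(n, n))
    return W.maximum(W.T)                     # symmetrize

def label_propagation(W, Y, alpha=0.99):
    """Solve (I - alpha * S) Z = Y, with S the symmetrically normalized adjacency."""
    d = np.asarray(W.sum(axis=1)).ravel()
    d[d == 0.0] = 1.0                         # guard against isolated nodes
    D = diags(1.0 / np.sqrt(d))
    S = (D @ W @ D).tocsr()                   # S = D^{-1/2} W D^{-1/2}
    A = eye(W.shape[0], format="csr") - alpha * S
    # A is symmetric positive definite for alpha < 1, so conjugate gradients applies.
    Z = np.column_stack([cg(A, Y[:, c])[0] for c in range(Y.shape[1])])
    return Z
```

Under the same assumptions, one would stack the C text prototypes on top of the N image features, seed the text nodes with one-hot labels, and read off predictions at the image nodes (random features stand in for CLIP-style embeddings here):

```python
rng = np.random.default_rng(0)
C, N, d = 10, 1000, 512                       # classes, images, feature dim (toy sizes)
feats = rng.normal(size=(C + N, d)).astype(np.float32)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)   # L2-normalize
Y = np.zeros((C + N, C))
Y[np.arange(C), np.arange(C)] = 1.0           # one-hot seeds on the C text nodes
Z = label_propagation(knn_graph(feats, k=5), Y)
preds = Z[C:].argmax(axis=1)                  # zero-shot predictions for the images
```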