DefNTaxS: The Inevitable Need for More Structured Description in Zero-Shot Classification

28 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: zero shot, classification, CLIP, VLM, DCLIP, WaffleCLIP, open vocabulary, pretrained
TL;DR: Sometimes with zero-shot classification, class descriptors are a nice appetiser, but class subcategories are the main course.
Abstract: Existing approaches leveraging large pretrained vision-language models (VLMs) like CLIP for zero-shot text-image classification often focus on generating fine-grained class-specific descriptors, leaving higher-order semantic relations between classes underutilised. We address this gap by proposing Defined Taxonomic Stratification (DefNTaxS), a novel and malleable framework that supplements per-class descriptors with inter-class taxonomies to enrich semantic resolution in zero-shot classification tasks. Using large language models (LLMs), DefNTaxS automatically generates subcategories that group similar classes and appends context-specific prompt elements for each dataset/subcategory, reducing inter-class competition and providing deeper semantic insight. This process is fully automated, requiring no manual modifications or further training for any of the models involved. We demonstrate that DefNTaxS yields consistent performance gains across a number of datasets often used to benchmark frameworks of this type, enhancing accuracy and semantic interpretability in zero-shot classification tasks of varying scale, granularity, and type.
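To make the prompt construction described in the abstract concrete, here is a minimal sketch of how per-class descriptors could be combined with an LLM-generated subcategory and an optional dataset-level context. The template wording, the function names (`build_prompts`, `build_all_prompts`), and the example taxonomy are illustrative assumptions; the abstract does not specify the actual templates used by DefNTaxS.

```python
from typing import Dict, List


def build_prompts(
    class_name: str,
    subcategory: str,
    descriptors: List[str],
    dataset_context: str = "",
) -> List[str]:
    """Return one text prompt per descriptor for a single class.

    The template below is a hypothetical one chosen for illustration;
    it is not taken from the paper.
    """
    context = f"{dataset_context} " if dataset_context else ""
    base = f"a {context}photo of a {class_name}, a type of {subcategory}"
    if not descriptors:
        # Fall back to the subcategory-only prompt when no descriptors exist.
        return [base + "."]
    return [f"{base}, which has {d}." for d in descriptors]


def build_all_prompts(
    taxonomy: Dict[str, List[str]],
    descriptors: Dict[str, List[str]],
    dataset_context: str = "",
) -> Dict[str, List[str]]:
    """Map every class name to its list of prompts.

    `taxonomy` maps a subcategory to the classes grouped under it, and
    `descriptors` maps a class to its descriptor phrases. Both are assumed
    to have been produced by an LLM beforehand.
    """
    prompts: Dict[str, List[str]] = {}
    for subcategory, class_names in taxonomy.items():
        for name in class_names:
            prompts[name] = build_prompts(
                name, subcategory, descriptors.get(name, []), dataset_context
            )
    return prompts


# Hypothetical LLM outputs for a toy three-class problem.
taxonomy = {"bird": ["robin", "heron"], "dog": ["beagle"]}
descriptors = {"robin": ["a red breast"], "heron": ["long legs", "a long neck"]}
prompts = build_all_prompts(taxonomy, descriptors)
```

In a full pipeline, each prompt would presumably be embedded with the VLM text encoder and the image-text similarities aggregated per class (for example by averaging, as descriptor-based methods such as DCLIP do); that step is omitted here because it depends on the specific model and library.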
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13192