Part-based bird classifiers with an explainable, editable language bottleneck

22 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission
Keywords: part-based, explainable, editable, bird classifier, language bottleneck
TL;DR: PEEB, a novel bird classifier that allows users to describe in text the 12 parts of every bird that they want to identify
Abstract: Most CLIP-based image classifiers rely heavily on having known class names in the prompt and are therefore neither explainable nor editable by humans. Here, we present PEEB, a novel bird classifier that allows users to describe in text the 12 parts of every bird that they want to identify. After the textual descriptors are defined, PEEB detects 12 parts of a bird in the image and then computes a matching score between the image and each class by summing the dot products of 12 pairs of visual and textual part embeddings. Besides being editable, our classifier achieves state-of-the-art accuracy in two different zero-shot settings and competitive performance when finetuned on target datasets.
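The scoring rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dot`, `class_score`, `predict`), the use of plain Python lists as embeddings, and the step of choosing the highest-scoring class are all assumptions made for illustration. The abstract only states that the score for a class is the sum of the dot products of 12 visual/textual part-embedding pairs; how the embeddings are produced or normalized is not specified here.

```python
from typing import Dict, Sequence

NUM_PARTS = 12  # number of bird parts stated in the abstract

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two embeddings of equal length."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def class_score(visual_parts: Sequence[Vector],
                text_parts: Sequence[Vector]) -> float:
    """Sum of dot products over the 12 (visual, textual) part pairs."""
    if len(visual_parts) != NUM_PARTS or len(text_parts) != NUM_PARTS:
        raise ValueError(f"expected {NUM_PARTS} part embeddings on each side")
    return sum(dot(v, t) for v, t in zip(visual_parts, text_parts))


def predict(visual_parts: Sequence[Vector],
            class_text_parts: Dict[str, Sequence[Vector]]) -> str:
    """Return the class with the highest score (assumed decision rule)."""
    return max(class_text_parts,
               key=lambda name: class_score(visual_parts, class_text_parts[name]))


# Toy 2-dimensional embeddings, purely hypothetical.
visual = [[1.0, 0.0]] * NUM_PARTS
classes = {
    "class_a": [[1.0, 0.0]] * NUM_PARTS,  # aligned with the visual parts
    "class_b": [[0.0, 1.0]] * NUM_PARTS,  # orthogonal to the visual parts
}
best = predict(visual, classes)
```

Under this sketch, editing a class amounts to replacing the textual part embeddings stored for that class, which changes its score without touching the visual side.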
Supplementary Material: pdf
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5768