Keywords: zero-shot learning, vision and language
TL;DR: Produce a model for a classification task described in text at inference time, without training, using equivariant hypernetworks.
Abstract: We study the problem of generating a training-free, task-dependent visual classifier from text descriptions without visual samples. This Text-to-Model (T2M) problem is closely related to zero-shot learning, but unlike previous work, a T2M model infers a classifier tailored to a task, taking into account all classes in the task. We analyze the symmetries of T2M and characterize the equivariance and invariance properties of corresponding models. In light of these properties, we design an architecture based on hypernetworks that, given a set of new class descriptions, predicts the weights of an object recognition model which classifies images from those zero-shot classes.
We demonstrate the benefits of our approach compared to zero-shot learning from text descriptions in image and point-cloud classification, using various types of text descriptions, from single words to rich descriptions.
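For concreteness, a minimal sketch of the kind of equivariant hypernetwork the abstract describes is given below. All names, dimensions, and the DeepSets-style pointwise-plus-pooled combination are illustrative assumptions, not the paper's actual architecture: the network maps a set of class-description embeddings (e.g., from a frozen text encoder) to per-class weight vectors of a linear image classifier, so that permuting the input classes permutes the predicted classifier rows correspondingly.

```python
import torch
import torch.nn as nn

class T2MHypernetwork(nn.Module):
    """Hypothetical permutation-equivariant hypernetwork sketch.

    Maps a set of class-description embeddings to per-class weight
    vectors of a linear image classifier. Permuting the input class
    descriptions permutes the predicted classifier rows accordingly.
    """
    def __init__(self, text_dim=512, image_dim=512, hidden_dim=512):
        super().__init__()
        # Per-class ("pointwise") branch plus a pooled ("task context") branch,
        # combined DeepSets-style so the map is equivariant to class order.
        self.pointwise = nn.Linear(text_dim, hidden_dim)
        self.context = nn.Linear(text_dim, hidden_dim)
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(hidden_dim, image_dim),
        )

    def forward(self, class_embs):
        # class_embs: (num_classes, text_dim) embeddings of the task's class descriptions
        pooled = class_embs.mean(dim=0, keepdim=True)          # task-level context (order-invariant)
        h = self.pointwise(class_embs) + self.context(pooled)  # order-equivariant combination
        return self.head(h)                                    # (num_classes, image_dim) classifier weights


# Usage sketch: classify image features with the generated, task-specific head.
hyper = T2MHypernetwork()
class_embs = torch.randn(5, 512)   # e.g., frozen text-encoder outputs for 5 class descriptions
image_feats = torch.randn(8, 512)  # e.g., frozen image-encoder outputs for a batch of 8 images
W = hyper(class_embs)              # (5, 512) predicted classifier weights
logits = image_feats @ W.t()       # (8, 5) scores over the described classes
```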
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning