Learning to Identify Seen, Unseen and Unknown in the Open World: A Practical Setting for Zero-Shot Learning

Published: 01 Jan 2025 · Last Modified: 07 Oct 2025 · WACV 2025 · CC BY-SA 4.0
Abstract: As vision-language models advance, addressing the Zero-Shot Learning (ZSL) problem in the open world becomes increasingly crucial. Specifically, a robust model must handle three types of samples during inference: seen classes with visual and semantic information provided in training, unseen classes with only semantic information provided in training, and unknown samples with no prior information from training. Existing methods handle either seen and unseen classes together (ZSL) or seen and unknown classes together (known as Open-Set Recognition, OSR). However, none addresses the simultaneous handling of all three, which we term Open-Set Zero-Shot Learning (OZSL). To address this problem, we propose a two-stage approach for OZSL that recognizes seen, unseen, and unknown samples. The first stage classifies samples as either seen or not, while the second stage distinguishes unseen from unknown. Furthermore, we introduce a cross-stage knowledge transfer mechanism that leverages semantic relationships between seen and unseen classes to enhance learning in the second stage. Extensive experiments demonstrate the efficacy of the proposed approach compared to naïvely combining existing ZSL and OSR methods. The code is available at https://github.com/smufang/OZSL.
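The two-stage inference described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual method: the prototype-similarity scoring, the threshold-based rejection, and all names (`ozsl_predict`, `seen_protos`, `unseen_protos`, the threshold values) are hypothetical stand-ins for whatever seen/unseen classifiers and scoring functions the paper trains.

```python
import numpy as np

def ozsl_predict(x_feat, seen_protos, unseen_protos,
                 seen_thresh=0.8, unseen_thresh=0.8):
    """Hypothetical two-stage OZSL decision rule on unit-normalized features.

    Stage 1: accept as a seen class if the feature is close enough
             to some seen-class prototype; otherwise defer to stage 2.
    Stage 2: accept as an unseen class if close enough to some
             semantic prototype of an unseen class; otherwise unknown.
    """
    # Stage 1: cosine similarity to seen-class prototypes
    seen_sims = seen_protos @ x_feat
    if seen_sims.max() >= seen_thresh:
        return ("seen", int(seen_sims.argmax()))
    # Stage 2: cosine similarity to unseen-class semantic prototypes
    unseen_sims = unseen_protos @ x_feat
    if unseen_sims.max() >= unseen_thresh:
        return ("unseen", int(unseen_sims.argmax()))
    # Neither stage fires: reject as unknown
    return ("unknown", None)

# Toy example with 3-D unit features and two seen / one unseen class
seen_protos = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0]])
unseen_protos = np.array([[0.0, 0.0, 1.0]])

print(ozsl_predict(np.array([1.0, 0.0, 0.0]), seen_protos, unseen_protos))
print(ozsl_predict(np.array([0.0, 0.0, 1.0]), seen_protos, unseen_protos))
print(ozsl_predict(np.ones(3) / np.sqrt(3.0), seen_protos, unseen_protos))
```

The cascade makes the design choice explicit: a sample only reaches the unseen-vs-unknown decision after being rejected as seen, mirroring the paper's stage ordering.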