Keywords: Open vocabulary segmentation, Evaluation
Abstract: In Open Vocabulary Semantic Segmentation (OVS), we observe a consistent drop
in model performance as the query vocabulary set expands, especially when it
includes semantically similar and ambiguous terms, such as ‘sofa’ and
‘couch’. The previous OVS evaluation protocol, however, does not account for
such ambiguity, as any mismatch between model-predicted and human-annotated
pairs is simply treated as incorrect on a pixel-wise basis. This contradicts the open
nature of OVS, in which multiple ambiguous categories may all be valid from an
open-world perspective. To address this, we study the open nature of OVS and
propose a mask-wise evaluation protocol based on matched and mismatched mask
pairs between predictions and annotations. Extensive
experiments show that the proposed mask-wise protocol provides a
more effective and reliable evaluation framework for OVS models than the
previous pixel-wise approach from an open-world perspective. Moreover, analysis
of mismatched mask pairs reveals that a large number of ambiguous categories
exist in commonly used OVS datasets. Interestingly, we find that reducing these
ambiguities during both training and inference enhances the capabilities of OVS
models. These findings and the new evaluation protocol encourage further exploration
of the open nature of OVS, as well as broader open-world challenges. Project page: https://qiming-huang.github.io/RevisitOVS/.
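The abstract does not spell out how matched and mismatched mask pairs are determined. As one minimal sketch of such a mask-wise protocol, predicted and annotated binary masks could be greedily paired by intersection-over-union (IoU): predictions that reach an IoU threshold against some unused annotation count as matched, the rest as mismatched (and are candidates for ambiguity analysis). The function name `match_masks` and the threshold value are illustrative assumptions, not the authors' actual procedure.

```python
import numpy as np

def match_masks(pred_masks, gt_masks, iou_threshold=0.5):
    """Greedily pair predicted masks with annotated masks by IoU.

    This is an illustrative sketch, not the paper's exact protocol.
    Returns (matched, mismatched): matched is a list of
    (pred_idx, gt_idx) pairs whose IoU meets the threshold;
    mismatched lists the indices of unpaired predictions.
    """
    matched, used_gt = [], set()
    for p_idx, pred in enumerate(pred_masks):
        best_iou, best_gt = 0.0, None
        for g_idx, gt in enumerate(gt_masks):
            if g_idx in used_gt:
                continue  # each annotation matches at most one prediction
            inter = np.logical_and(pred, gt).sum()
            union = np.logical_or(pred, gt).sum()
            iou = inter / union if union > 0 else 0.0
            if iou > best_iou:
                best_iou, best_gt = iou, g_idx
        if best_gt is not None and best_iou >= iou_threshold:
            matched.append((p_idx, best_gt))
            used_gt.add(best_gt)
    mismatched = [p for p in range(len(pred_masks))
                  if p not in {m[0] for m in matched}]
    return matched, mismatched
```

Scoring on mask pairs rather than raw pixels is what lets a protocol like this treat an ambiguous label (e.g., ‘sofa’ predicted where ‘couch’ is annotated) as a matched region whose category disagreement can be analyzed separately, instead of penalizing every pixel.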
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11105