Supervised Knowledge May Hurt Novel Class Discovery Performance
Event Certifications: lifelong-ml.cc/CoLLAs/2023/Journal_Track
Abstract: Novel class discovery (NCD) aims to infer novel categories in an unlabeled dataset by leveraging prior knowledge from a labeled set comprising disjoint but related classes. Most existing literature focuses on how to utilize supervised knowledge from a labeled set at the methodology level. This paper instead considers the question: Is supervised knowledge always helpful at different levels of semantic relevance? To proceed, we first establish a novel metric, called transfer leakage, to measure the semantic similarity between labeled and unlabeled datasets. To show the validity of the proposed metric, we build a large-scale benchmark on ImageNet with various degrees of semantic similarity between labeled and unlabeled datasets by leveraging its hierarchical class structure. The results on the proposed benchmark show that transfer leakage is in line with the hierarchical class structure, and that NCD performance is consistent with the semantic similarities measured by the proposed metric. Next, using transfer leakage, we conduct empirical experiments at different levels of semantic similarity and find that supervised knowledge may hurt NCD performance. Specifically, using supervised information from a low-similarity labeled set may lead to a suboptimal result compared to using purely self-supervised knowledge. These results reveal an inadequacy of the existing NCD literature, which usually assumes that supervised knowledge is beneficial. Finally, we develop a pseudo-version of transfer leakage as a practical reference for deciding whether supervised knowledge should be used in NCD. Its effectiveness is supported by our empirical studies, which show that the pseudo transfer leakage (with or without supervised knowledge) is consistent with the corresponding accuracy on various datasets.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We made several revisions in this paper and added an Acknowledgements section:
* We added more experiments in Figure 3 and Section 5.2.2 to examine the effect of labeled data in the self-supervised pretraining setting. A more detailed discussion can be found in Section 5.2.2.
* In Section 5.2.2, we provide additional experiments and details. Based on the results presented in Figure 3, two conclusions can be drawn: (i) Supervised knowledge may lead to inferior performance during both pretraining and training. (ii) The effectiveness of incorporating supervised knowledge in the training process is closely related to the degree of similarity between the labeled and unlabeled datasets.
* In Section 2, we added more discussion of related work and comparisons with our results.
* We revised the discussion in the Appendix of why supervised knowledge hurts novel class discovery performance.
* We added the computation cost of the method to the Appendix.
* We replaced the term "transfer leakage" with the more accurate and descriptive term "transfer flow".
* In Remark 1, we removed “arbitrary” and revised it as “can be extended to a more general representation estimated based on supervised or self-supervised information from a labeled dataset.”
* We added more details about the high, medium, and low similarity settings in the experiment settings of Section 5.
* Acknowledgements: We would like to express our gratitude to all the reviewers for their valuable feedback and suggestions, which significantly improved the quality of the paper. We also thank our collaborators for their support and insightful discussions during the development of this work.
----
Minor changes:
1. Slightly adjusted the figure positioning.
2. Changed the title of Section 5.1 to "Experimental Design."
Supplementary Material: zip
Assigned Action Editor: ~Vikas_Sindhwani1
Submission Number: 765