Keywords: Generalized Category Discovery, Computer Vision, Image Classification
Abstract: Generalized Category Discovery (GCD) is the task of identifying both base and novel categories in an unlabeled dataset while preserving knowledge learned from a labeled dataset. A key challenge in this setting lies in balancing the supervised learning of base categories against the unsupervised discovery of novel ones. Through empirical analysis, we observe that conventional multi-objective optimization approaches suffer from significant gradient interference between the classification objective and the representation objective, which hinders effective joint training. To alleviate this issue, we propose a simple yet effective framework named $\textbf{P}$areto-guided $\textbf{G}$radient $\textbf{I}$nterference ($\textbf{PGI}$). PGI employs a Pareto-annealing optimization approach to explore the Pareto front that balances the representation and classification objectives. Additionally, we introduce a regularization term that leverages multi-view consistency to enhance the clustering structure in the feature space, facilitating better separation of novel classes. Extensive experiments across fine-grained benchmarks show that our approach improves the discovery of novel categories while maintaining accuracy on base classes.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 7447