Abstract: Existing semi-supervised learning methods in medical imaging assume that unlabeled and labeled data share identical classes. In real-world medical scenarios, however, unlabeled datasets often contain novel categories that are absent from the labeled data. To address this problem, we propose MedGCD (Generalized Category Discovery for Medical Images), a method that identifies seen categories in labeled data and clusters novel categories in unlabeled data. Specifically, MedGCD introduces a dual stream of strong views in a weak-to-strong framework, coupled with a confidence-aware pairwise objective for discovering novel categories. The dual-view design enhances feature extraction from unlabeled data, while the confidence-aware pairwise objective selects reliable samples, enabling effective clustering of novel categories. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed model in discovering novel categories while maintaining consistent performance on seen categories: novel-category accuracy improves by 4% to 15%, leading to an overall accuracy improvement of 2% to 8% (https://github.com/Chandan-IITI/MedGCD).
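The abstract does not give the form of the confidence-aware pairwise objective, so the following is only an illustrative sketch of one common way such an objective can be built in a weak-to-strong setup, not the MedGCD implementation. It assumes that pseudo-labels come from the weak view, that a sample is "reliable" when its weak-view confidence reaches a fixed threshold, and that each of the two strong views is trained with a binary cross-entropy over pairwise prediction similarity. The function names, the threshold of 0.9, and the loss form are all hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def pairwise_bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over all pairs i < j.

    The similarity of a pair is the inner product of its class
    probabilities; the target is 1 when the pseudo-labels match.
    """
    total, count = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            sim = sum(a * b for a, b in zip(probs[i], probs[j]))
            sim = min(max(sim, eps), 1.0 - eps)  # keep log() finite
            target = 1.0 if labels[i] == labels[j] else 0.0
            total -= target * math.log(sim) + (1.0 - target) * math.log(1.0 - sim)
            count += 1
    return total / count if count else 0.0

def confidence_aware_pairwise_loss(weak_logits, strong_a, strong_b, threshold=0.9):
    """Hypothetical objective: average pairwise loss over two strong views.

    Only samples whose weak-view confidence is >= threshold (an assumed
    value) contribute, and their weak-view argmax serves as pseudo-label.
    """
    weak_probs = [softmax(r) for r in weak_logits]
    keep = [i for i, p in enumerate(weak_probs) if max(p) >= threshold]
    if len(keep) < 2:  # no pair can be formed from fewer than two samples
        return 0.0
    labels = [weak_probs[i].index(max(weak_probs[i])) for i in keep]
    losses = []
    for strong in (strong_a, strong_b):  # the two strong-view streams
        probs = [softmax(strong[i]) for i in keep]
        losses.append(pairwise_bce(probs, labels))
    return sum(losses) / len(losses)
```

Under these assumptions, strong-view predictions that agree with the weak-view grouping yield a small loss, while low-confidence samples are simply excluded from the pairwise term. The released code at the URL above should be consulted for the actual formulation.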
External IDs: dblp:conf/miccai/DasGAYLS25