Learning Robust Representations for Medical Images via Unifying (Self-)Supervisions

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: medical image pre-training, medical image representation learning
Abstract: Pre-training a medical image encoder to provide robust, task-agnostic representations is highly valuable: it deepens the understanding of medical images and is important for many data-scarce analysis tasks. Current pre-training works are unable to integrate the various types of supervision, including self-supervision and external supervision such as segmentation annotations, even though these are highly valuable for medical image understanding. In this paper, we therefore take the first step toward unifying all common types of supervision into a single pre-training framework in the same scalable way. This requires the framework to be both unified and extensible, so as to accommodate diverse data, and effective, so that heterogeneous data synergistically assist unknown downstream tasks. To this end, we propose UmiF, whose principle is that once converted into token embeddings in a unified space, all diverse supervisions can be effectively exploited via contrastive learning and masked modeling in the same way. With UmiF, we pre-train on 1.66M samples from 14 public datasets, significantly surpassing previous efforts in dataset scale. We obtain and release the UmiF model, which achieves state-of-the-art performance across various downstream tasks, including classification, segmentation, detection, retrieval, and VQA.
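To make the stated principle concrete, below is a minimal sketch (not the authors' released code) of how heterogeneous supervisions, once projected into a shared token space, could be trained with the same contrastive and masked-modeling objectives. All module names, dimensions, and loss choices here are hypothetical illustrations, not details taken from the paper.

```python
# Hypothetical sketch: unified token embeddings + contrastive learning + masked modeling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenProjector(nn.Module):
    """Projects one supervision modality (image patches, segmentation annotations, ...) into a shared token space."""
    def __init__(self, in_dim: int, token_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, token_dim)

    def forward(self, x):  # x: (batch, seq_len, in_dim)
        return self.proj(x)

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, temperature: float = 0.07):
    """InfoNCE contrastive loss between pooled token embeddings of two paired views/modalities."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                    # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)  # matching pairs lie on the diagonal
    return F.cross_entropy(logits, targets)

def masked_modeling(tokens: torch.Tensor, encoder: nn.Module, mask_ratio: float = 0.5):
    """Mask a fraction of tokens and regress the originals from the encoder output."""
    mask = torch.rand(tokens.shape[:2], device=tokens.device) < mask_ratio
    corrupted = tokens.masked_fill(mask.unsqueeze(-1), 0.0)
    recon = encoder(corrupted)
    return F.mse_loss(recon[mask], tokens[mask])

# Toy usage: image tokens and segmentation-annotation tokens share one encoder and objectives.
token_dim = 128
image_proj = TokenProjector(in_dim=768, token_dim=token_dim)
seg_proj = TokenProjector(in_dim=16, token_dim=token_dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=token_dim, nhead=4, batch_first=True), num_layers=2
)

image_tokens = image_proj(torch.randn(8, 196, 768))  # e.g. ViT patch features
seg_tokens = seg_proj(torch.randn(8, 196, 16))       # e.g. embedded segmentation annotations

loss = (
    info_nce(encoder(image_tokens).mean(dim=1), encoder(seg_tokens).mean(dim=1))
    + masked_modeling(image_tokens, encoder)
    + masked_modeling(seg_tokens, encoder)
)
loss.backward()
```

The point of the sketch is only that, after projection into one token space, any new supervision source can be handled by adding a projector, with no change to the two shared objectives.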
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10881
