POPAR: Patch Order Prediction and Appearance Recovery for Self-supervised Medical Image Analysis

01 Nov 2022, OpenReview Archive Direct Upload
Abstract: Vision transformer-based self-supervised learning (SSL) approaches have recently shown substantial success in learning visual representations from unannotated photographic images. However, their acceptance in medical imaging remains lukewarm due to the significant discrepancy between medical and photographic images. We therefore propose POPAR (patch order prediction and appearance recovery), a novel vision transformer-based self-supervised learning framework for chest X-ray images. POPAR leverages the benefits of vision transformers and the unique properties of medical imaging, aiming to simultaneously learn patch-wise high-level contextual features by correcting shuffled patch orders and fine-grained features by recovering patch appearance. We transfer POPAR pretrained models to diverse downstream tasks. The experimental results suggest that (1) POPAR outperforms state-of-the-art (SoTA) self-supervised models with a vision transformer backbone; (2) POPAR achieves significantly better performance than all three SoTA contrastive learning methods; and (3) POPAR also outperforms fully supervised pretrained models across architectures. In addition, our ablation study suggests that both fine-grained and global contextual features are needed to achieve better performance on medical imaging tasks. All code and models are available at GitHub.com/JLiangLab/POPAR.
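
To make the two pretext tasks concrete, below is a minimal PyTorch sketch of a POPAR-style objective: patches are shuffled and lightly perturbed, and a transformer encoder is trained with one head that predicts each patch's original position and another that recovers its clean appearance. This is not the authors' implementation (see the GitHub link above); the class name ToyPOPAR, the tiny encoder configuration, and the Gaussian-noise perturbation are illustrative assumptions, and the paper's actual appearance transformations and loss weighting may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyPOPAR(nn.Module):
    """Toy POPAR-style pretext model: shuffle (and lightly perturb) image
    patches, then (1) predict each patch's original position and
    (2) recover each patch's clean appearance."""

    def __init__(self, img_size=224, patch_size=32, dim=256, depth=4, heads=8):
        super().__init__()
        self.patch_size = patch_size
        self.num_patches = (img_size // patch_size) ** 2
        patch_dim = patch_size * patch_size  # single-channel chest X-rays
        self.embed = nn.Linear(patch_dim, dim)
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.order_head = nn.Linear(dim, self.num_patches)  # patch order prediction
        self.recon_head = nn.Linear(dim, patch_dim)          # appearance recovery

    def patchify(self, x):
        # (B, 1, H, W) -> (B, N, patch_dim), patches in row-major order
        p = self.patch_size
        b = x.shape[0]
        x = x.unfold(2, p, p).unfold(3, p, p)        # (B, 1, H/p, W/p, p, p)
        return x.contiguous().view(b, -1, p * p)

    def forward(self, x):
        patches = self.patchify(x)                   # (B, N, patch_dim)
        b, n, d = patches.shape
        # Randomly permute the patch sequence of each image.
        perm = torch.stack([torch.randperm(n) for _ in range(b)]).to(x.device)
        shuffled = torch.gather(patches, 1, perm.unsqueeze(-1).expand(-1, -1, d))
        # Simple stand-in for the paper's appearance perturbations.
        perturbed = shuffled + 0.1 * torch.randn_like(shuffled)

        feats = self.encoder(self.embed(perturbed) + self.pos)
        order_logits = self.order_head(feats)        # (B, N, N): original index per token
        recon = self.recon_head(feats)               # (B, N, patch_dim): clean patch pixels

        order_loss = F.cross_entropy(order_logits.reshape(-1, n), perm.reshape(-1))
        recon_loss = F.mse_loss(recon, shuffled)     # recover the unperturbed appearance
        return order_loss + recon_loss


if __name__ == "__main__":
    model = ToyPOPAR()
    loss = model(torch.randn(2, 1, 224, 224))        # two fake single-channel chest X-rays
    loss.backward()
    print(float(loss))
```

The intuition behind pairing the two heads on a shared encoder: undoing the shuffle forces the model to reason about global context across patches, while recovering pixel appearance forces it to retain fine-grained detail, which matches the abstract's ablation finding that both kinds of features help on medical imaging tasks.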
