WSI-SAM: Multi-resolution Segment Anything Model (SAM) for histopathology whole-slide images

Hong Liu; Haosen Yang; Paul J. van Diest; Josien P.W. Pluim; Mitko Veta

Published: 16 Jul 2024, Last Modified: 27 Aug 2024, COMPAYL 2024, CC BY 4.0

Keywords: Foundation Models, Computational pathology, Whole-slide images

Abstract: The Segment Anything Model (SAM) marks a significant advancement in segmentation models, offering robust zero-shot abilities and dynamic prompting. However, existing medical SAMs are not suited to the multi-scale nature of whole-slide images (WSIs), which restricts their effectiveness. To resolve this drawback, we present WSI-SAM, which enhances SAM with precise object segmentation capabilities for histopathology images using multi-resolution patches, while preserving its efficient, prompt-driven design and zero-shot abilities. To fully exploit pretrained knowledge while minimizing training overhead, we keep SAM frozen, introducing only minimal extra parameters and computational overhead. In particular, we introduce a High-Resolution (HR) token, a Low-Resolution (LR) token, and a dual mask decoder. This decoder combines the original SAM mask decoder with a lightweight fusion module that integrates features at multiple scales. Instead of predicting masks independently, we integrate the HR and LR tokens at an intermediate layer to jointly learn features of the same object across multiple resolutions. Experiments show that WSI-SAM outperforms the state-of-the-art SAM and its variants. In particular, our model outperforms SAM by 4.1 and 2.5 percentage points on a ductal carcinoma in situ (DCIS) segmentation task and a breast cancer metastasis segmentation task (CAMELYON16 dataset), respectively. The code will be available at https://github.com/HongLiuuuuu/WSI-SAM.
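As a rough illustration of letting the HR and LR tokens interact before each one predicts a mask, the toy sketch below mixes two token vectors and then scores per-pixel features against each mixed token. It is not the WSI-SAM implementation: the function names, the linear mixing rule, and the `alpha` parameter are all assumptions made for this example, and the actual fusion module is learned rather than a fixed average.

```python
# Toy sketch only. All names and the mixing rule are hypothetical and are
# not taken from the WSI-SAM code base.

def mix_tokens(hr_token, lr_token, alpha=0.5):
    """Let the two tokens exchange information (assumed linear mixing)."""
    hr_mixed = [(1 - alpha) * h + alpha * l for h, l in zip(hr_token, lr_token)]
    lr_mixed = [(1 - alpha) * l + alpha * h for h, l in zip(hr_token, lr_token)]
    return hr_mixed, lr_mixed

def mask_logits(token, pixel_features):
    """Score every pixel by the dot product of its feature vector and the token."""
    return [[sum(t * f for t, f in zip(token, feat)) for feat in row]
            for row in pixel_features]

def dual_decode(hr_token, lr_token, hr_features, lr_features, alpha=0.5):
    """Mix the tokens first, then predict one mask per resolution."""
    hr_mixed, lr_mixed = mix_tokens(hr_token, lr_token, alpha)
    return mask_logits(hr_mixed, hr_features), mask_logits(lr_mixed, lr_features)
```

With `alpha=0` the two masks are predicted independently; any non-zero value makes each mask depend on both tokens, which is the idea the abstract describes.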

Submission Number: 5
