CLEAR-WSI: Foundation Model Empowered Whole Slide Image Retrieval

20 Nov 2025 (modified: 14 Feb 2026) | Submitted to MIDL 2026 | Readers: Everyone | License: CC BY 4.0
Keywords: Computational Pathology, Image Retrieval, Reverse Image Search, Foundation Models, Vision Transformers, Whole Slide Images, Deep Learning
TL;DR: A fully automated, patch-count–independent reverse image search engine for H&E whole-slide images that leverages ViT foundation models, AttentionMIL aggregation, and self-review label filtering to achieve state-of-the-art retrieval and rank quality.
Abstract: The rapid growth of digital pathology has produced vast repositories of hematoxylin and eosin (H&E) stained whole slide images, yet most of them remain unindexed or unlabelled, which limits their utility for computational analysis. Reverse image search provides a scalable way to organize and access these archives by retrieving visually similar images. Existing retrieval systems rely on manual configuration, which strongly affects their performance. We therefore propose CLEAR-WSI (Constant Length Embedding & Automatic Retrieval), a fully automated pathology reverse image search engine that leverages Vision Transformer foundation models for histopathology together with attention-based multiple instance learning (AttentionMIL). The AttentionMIL framework jointly identifies diagnostically relevant whole slide images and predicts slide-level diagnoses. To further improve performance, we introduce a self-reviewing classifier filtering mechanism: retrieved candidates are filtered according to their predicted labels, which in most cases outperforms class-informed filters. Across two public datasets, CAMELYON16 (lymph node metastases) and BRACS (breast cancer subtypes), our method establishes new state-of-the-art results, improving $Acc_{MV}@5$ from 77.49% to 89.92% on CAMELYON16, from 54.12% to 75.86% on BRACS level-1, and from 36.47% to 51.72% on BRACS level-2. Our general-purpose, annotation-free, dataset-agnostic search engine, which scales across diverse data sources, is openly available: https://github.com/youssefwally/CLEAR-WSI
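As a rough illustration of the retrieval step described in the abstract, the sketch below ranks indexed slides by cosine similarity to a query embedding, keeps only candidates whose predicted label matches the query's predicted label, and scores the result with a majority vote over the top k. It is a minimal sketch under stated assumptions, not the authors' implementation: the record fields (`emb`, `pred`, `label`), the function names, the fallback to the unfiltered ranking, the toy embeddings, and the reading of $Acc_{MV}@k$ as "fraction of queries whose top-k majority label equals the query's true label" are all hypothetical choices made for illustration. The linked repository is the authoritative reference.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_emb, query_pred, index, k=5):
    """Rank slides by similarity, then keep those sharing the query's predicted label.

    Assumption: each indexed slide carries a fixed-length embedding ("emb")
    and a classifier-predicted label ("pred"). If the filter removes every
    candidate, fall back to the unfiltered ranking (a hypothetical choice).
    """
    ranked = sorted(
        index,
        key=lambda s: cosine_similarity(query_emb, s["emb"]),
        reverse=True,
    )
    kept = [s for s in ranked if s["pred"] == query_pred]
    return (kept or ranked)[:k]


def majority_vote(labels):
    """Most frequent label; ties go to the label that appears first in the ranking."""
    return Counter(labels).most_common(1)[0][0]


def acc_mv_at_k(queries, index, k=5):
    """Fraction of queries whose top-k majority label equals their true label."""
    hits = 0
    for q in queries:
        top = retrieve(q["emb"], q["pred"], index, k)
        if majority_vote([s["label"] for s in top]) == q["label"]:
            hits += 1
    return hits / len(queries)


# Toy data (made up): 2-D embeddings stand in for slide-level embeddings.
index = [
    {"emb": [1.0, 0.0], "pred": "tumor", "label": "tumor"},
    {"emb": [0.9, 0.1], "pred": "tumor", "label": "tumor"},
    {"emb": [0.0, 1.0], "pred": "normal", "label": "normal"},
    {"emb": [0.1, 0.9], "pred": "normal", "label": "normal"},
    {"emb": [0.8, 0.3], "pred": "normal", "label": "normal"},
]
queries = [
    {"emb": [1.0, 0.05], "pred": "tumor", "label": "tumor"},
    {"emb": [0.05, 1.0], "pred": "normal", "label": "normal"},
]

score = acc_mv_at_k(queries, index, k=3)
```

Filtering on predicted rather than ground-truth labels means the index needs no manual annotation, which is consistent with the annotation-free framing in the abstract. A class-informed filter would instead compare against the `label` field.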
Primary Subject Area: Foundation Models
Secondary Subject Area: Application: Histopathology
Registration Requirement: Yes
Reproducibility: https://github.com/youssefwally/CLEAR-WSI
Visa & Travel: Yes
Read CFP & Author Instructions: Yes
Originality Policy: Yes
Single-blind & Not Under Review Elsewhere: Yes
LLM Policy: Yes
Submission Number: 32