Efficient Representations for Whole Slide Image Classification

TMLR Paper4331 Authors

23 Feb 2025 (modified: 27 May 2025), Rejected by TMLR, CC BY 4.0

Abstract: Digital pathology has expanded diagnostic and research capabilities by enabling the analysis of high-resolution whole slide images (WSIs). However, the gigapixel size and complexity of WSIs pose significant computational challenges. To address this, we propose a scalable and efficient pipeline for WSI classification that integrates patch-based feature extraction, clustering, and slide-level representation techniques. The pipeline first selects patches based on their pathological significance and extracts deep feature embeddings from them using a pre-trained convolutional neural network (CNN) fine-tuned on a histology dataset with noisy labels, so that the features are tailored to histopathological patterns and robust to the noise inherent in the training data. The embeddings are then grouped into semantically similar regions with K-means clustering. To represent the clusters, we experiment with two strategies: (1) summarizing each cluster by its mean, and (2) Fisher vector (FV) encoding, which models the distribution of patch embeddings within clusters using a parametric Gaussian mixture model (GMM). The resulting high-dimensional feature vector captures both local and global tissue structure and is used to classify the WSI. This approach significantly reduces computational overhead while maintaining high accuracy, as validated across multiple datasets. By combining the precision of Fisher vectors with the scalability of clustering, the framework offers an efficient and accurate solution for WSI analysis that advances the practical application of digital pathology in medical diagnostics and research.
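The two slide-level representations named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`kmeans`, `cluster_mean_representation`, `fit_diag_gmm`, `fisher_vector`), the diagonal-covariance GMM, the K-means initialization of the GMM, and the power and L2 normalization of the Fisher vector are all assumptions chosen for illustration, and the patch embeddings are assumed to be already extracted as an `(n_patches, d)` array.

```python
import numpy as np

def assign(X, centers):
    # Index of the nearest center (squared Euclidean distance) for each row of X.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return d2.argmin(1)

def kmeans(X, k, n_iter=50, seed=0):
    # Plain Lloyd's K-means; returns (centers, labels).
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return centers, assign(X, centers)

def cluster_mean_representation(X, k, seed=0):
    # Strategy 1: concatenate the k cluster means into one vector of length k * d.
    centers, _ = kmeans(X, k, seed=seed)
    return centers.reshape(-1)

def responsibilities(X, w, mu, var):
    # Posterior probability of each GMM component for each row of X, shape (n, k).
    diff2 = (X[:, None, :] - mu[None]) ** 2
    log_p = -0.5 * (diff2 / var[None] + np.log(2 * np.pi * var[None])).sum(-1)
    log_p = log_p + np.log(w)[None]
    log_p -= log_p.max(1, keepdims=True)  # subtract the row max for stability
    p = np.exp(log_p)
    return p / p.sum(1, keepdims=True)

def fit_diag_gmm(X, k, n_iter=20, seed=0, eps=1e-6):
    # EM for a diagonal-covariance GMM, with means initialized by K-means (assumption).
    X = np.asarray(X, dtype=float)
    mu, _ = kmeans(X, k, seed=seed)
    w = np.full(k, 1.0 / k)
    var = np.tile(X.var(axis=0) + eps, (k, 1))
    for _ in range(n_iter):
        gamma = responsibilities(X, w, mu, var)
        nk = gamma.sum(0) + eps
        w = nk / nk.sum()
        mu = (gamma.T @ X) / nk[:, None]
        var = np.maximum((gamma.T @ X**2) / nk[:, None] - mu**2, 0.0) + eps
    return w, mu, var

def fisher_vector(X, w, mu, var):
    # Strategy 2: gradients of the log-likelihood w.r.t. GMM means and standard
    # deviations, followed by power and L2 normalization; output length is 2 * k * d.
    X = np.asarray(X, dtype=float)
    n = len(X)
    gamma = responsibilities(X, w, mu, var)[:, :, None]
    z = (X[:, None, :] - mu[None]) / np.sqrt(var)[None]
    g_mu = (gamma * z).sum(0) / (n * np.sqrt(w)[:, None])
    g_sigma = (gamma * (z**2 - 1)).sum(0) / (n * np.sqrt(2 * w)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)
```

In this sketch the GMM would typically be fitted once on patch embeddings pooled from training slides, and each slide then encoded with `fisher_vector` using its own patches; either representation can be passed to any standard classifier. Note that the order of the K-means clusters is arbitrary, so the cluster-mean vector is only comparable across slides if the clusters are put into a consistent order, a detail the abstract does not specify.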

Submission Length: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Steffen_Schneider1

Submission Number: 4331
