Keywords: Deep Learning, Image Classification, Visual Interpretability, Weakly Supervised Object Localization, Histology Images
TL;DR: A WSOL model that improves the interpretability of image classification by introducing a pixel-wise classifier to accurately delineate regions of interest in the pixel-feature space.
Abstract: Weakly supervised object localization (WSOL) methods allow training models to classify images and localize ROIs. WSOL requires only low-cost image-class annotations, yet provides a visually interpretable classifier, which is important in histology image analysis. Standard WSOL methods rely on class activation mapping (CAM) methods to produce spatial localization maps according to a single- or two-step strategy. While both strategies have made significant progress, they still face several limitations with histology images. Single-step methods can easily result in under- or over-activation due to the limited visual saliency of ROIs in histology images and the scarcity of localization cues. They also face the well-known issue of asynchronous convergence between the classification and localization tasks. The two-step approach is sub-optimal because it is tied to a frozen classifier, limiting its localization capacity. Moreover, these methods also struggle when applied to out-of-distribution (OOD) datasets. In this paper, we introduce a multi-task WSOL approach that trains both tasks simultaneously to address the asynchronous convergence problem. In particular, localization is performed in the pixel-feature space of an image encoder that is shared with classification. This allows learning discriminative features and accurately delineating foreground/background regions to support both ROI localization and image classification. We propose PixelCAM, a cost-effective foreground/background pixel-wise classifier in the pixel-feature space that enables spatial object localization. PixelCAM is trained with a partial cross-entropy loss on pixel pseudo-labels collected from a pretrained WSOL model. Both the image and pixel-wise classifiers are trained simultaneously using standard gradient descent. In addition, our pixel classifier can easily be integrated into CNN- and transformer-based architectures without any modifications.
Our extensive experiments on GlaS and CAMELYON16 cancer datasets show that PixelCAM can significantly improve classification and localization performance when integrated with different WSOL methods. Most importantly, it provides robustness on both tasks for OOD data linked to different cancer types, with large domain shifts between training and testing image data.
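The core objective described in the abstract — a partial cross-entropy over sparsely pseudo-labeled pixels, added to the image-level classification loss on a shared feature map — can be sketched as follows. This is a minimal NumPy illustration under assumed shapes and names (feature map size, fg/bg classifier weights, the `-1` convention for unlabeled pixels), not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical shared pixel-feature map: a 4x4 grid of 8-dim features,
# used by both the image classifier and the fg/bg pixel-wise classifier.
H = W = 4
D = 8
feats = rng.normal(size=(H * W, D))      # pixel-feature space
W_pix = rng.normal(size=(D, 2)) * 0.1    # pixel-wise fg/bg classifier
W_img = rng.normal(size=(D, 2)) * 0.1    # image-level classifier

# Pseudo-labels collected from a pretrained WSOL model:
# 1 = foreground, 0 = background, -1 = unlabeled (ignored by the loss).
pseudo = np.full(H * W, -1)
pseudo[:3] = 1      # a few confident foreground pixels
pseudo[-3:] = 0     # a few confident background pixels

def partial_cross_entropy(logits, labels):
    """Cross-entropy averaged over labeled pixels only."""
    mask = labels >= 0
    p = softmax(logits[mask])
    return -np.mean(np.log(p[np.arange(mask.sum()), labels[mask]] + 1e-12))

# Pixel-wise localization loss on the sparse pseudo-labels.
loss_pix = partial_cross_entropy(feats @ W_pix, pseudo)

# Image-level classification loss on globally pooled features (label assumed).
y_img = 1
p_img = softmax(feats.mean(axis=0) @ W_img)
loss_img = -np.log(p_img[y_img] + 1e-12)

# Both terms share the encoder features, so one gradient step trains
# classification and localization simultaneously.
total_loss = loss_img + 1.0 * loss_pix
print(total_loss > 0)
```

Because unlabeled pixels are masked out, the pixel classifier learns only from the confident pseudo-labels while the shared features are still shaped by the image-level term.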
Primary Subject Area: Interpretability and Explainable AI
Secondary Subject Area: Learning with Noisy Labels and Limited Data
Paper Type: Methodological Development
Registration Requirement: Yes
Reproducibility: https://github.com/AlexisGuichemerreCode/PixelCAM
Submission Number: 154