Nonparametric Unsupervised Data Condensation for Gigapixel Histological Images

ICLR 2026 Conference Submission22506 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: unsupervised condensation, probabilistic model, histological imaging
Abstract: Histological whole-slide images (WSIs) are central to computational pathology, but they are extremely large, often several gigabytes each, which makes them infeasible to use directly in standard vision pipelines. Prior approaches reduce training cost by condensing each WSI into a fixed number of representative features (prototypes), but a fixed count overlooks the varying complexity and diversity of WSIs and leads to loss of critical information. To address this, we propose **NICER**, a probabilistic data condensation framework that decomposes each WSI into feature patterns, which capture heterogeneity, and concept prototypes, which ensure compactness. By reformulating prototype construction as a nonparametric condensation problem, NICER adapts the number of prototypes to slide complexity while preserving relevant information. Experiments on four histological datasets show that NICER outperforms prior methods, yielding performance gains of up to 90% and superior efficiency trade-offs, setting a new paradigm for histological representation learning.
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 22506
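
The abstract's central idea, letting the number of prototypes grow with slide complexity instead of fixing it in advance, can be illustrated with a toy sketch. The code below is not NICER: it substitutes DP-means, a simple nonparametric variant of k-means, as a stand-in. The helper names (`dp_means`, `make_slide`), the threshold `lam`, and the synthetic 2-D "patch features" are all assumptions made for illustration only.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dp_means(features, lam, n_iter=10):
    """DP-means clustering (a stand-in technique, not the paper's method).

    Works like k-means, except that a new prototype is opened whenever a
    feature lies farther than `lam` (in squared distance) from every
    existing prototype, so the prototype count adapts to the data.
    """
    dim = len(features[0])
    n = len(features)
    # Start from a single prototype: the global mean of all features.
    prototypes = [[sum(f[d] for f in features) / n for d in range(dim)]]
    for _ in range(n_iter):
        assignment = []
        for f in features:
            dists = [sq_dist(f, p) for p in prototypes]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] > lam:
                # No prototype is close enough: open a new one at this feature.
                prototypes.append(list(f))
                k = len(prototypes) - 1
            assignment.append(k)
        # Recompute each prototype as the mean of its members; drop empty ones.
        updated = []
        for k in range(len(prototypes)):
            members = [f for f, a in zip(features, assignment) if a == k]
            if members:
                updated.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
        prototypes = updated
    return prototypes


def make_slide(n_tissue_types, n_patches, seed):
    """Hypothetical toy 'slide': tight 2-D feature blobs, one per tissue type."""
    rng = random.Random(seed)
    features = []
    for t in range(n_tissue_types):
        for _ in range(n_patches):
            features.append([rng.gauss(10.0 * t, 0.1), rng.gauss(0.0, 0.1)])
    return features


if __name__ == "__main__":
    simple_slide = make_slide(n_tissue_types=2, n_patches=50, seed=0)
    complex_slide = make_slide(n_tissue_types=5, n_patches=50, seed=1)
    print(len(dp_means(simple_slide, lam=4.0)))
    print(len(dp_means(complex_slide, lam=4.0)))
```

With the same threshold, the more heterogeneous toy slide ends up with more prototypes than the simpler one, which is the behavior a fixed-size condensation cannot provide. How NICER itself chooses the count is described in the paper, not here.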