Training-Free Dataset Pruning for Polyp Segmentation via Community Detection in Similarity Networks

Published: 27 Mar 2025, Last Modified: 01 May 2025
Venue: MIDL 2025 Oral
License: CC BY 4.0
Keywords: Dataset pruning, Training-free, Similarity network, Community detection, Polyp segmentation
TL;DR: Training-Free Dataset Pruning via Community Detection in Similarity Networks
Abstract: Recent advances in deep learning have been driven by larger datasets and more complex models; however, this progress comes with substantial computational and annotation costs. To address these costs, we introduce PRIME, a novel, training-free dataset pruning method targeting polyp segmentation in medical imaging. PRIME constructs a similarity network among the images in the target dataset and then applies community detection to retain a much smaller yet representative subset of the original images. Unlike existing pruning methods that require model training, PRIME avoids model training entirely, significantly reducing computational demands. The pruned training set cuts data annotation costs by 56.2% and enables 2.3$\times$ faster training of polyp segmentation models, with only a 0.5% drop in Dice score. Consequently, PRIME enables efficient training, fine-tuning, and domain adaptation across medical centers, offering a cost-effective solution for deep learning in polyp segmentation.
Primary Subject Area: Segmentation
Secondary Subject Area: Application: Endoscopy
Paper Type: Both
Registration Requirement: Yes
Submission Number: 137
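
The pipeline described in the abstract (similarity network, then community detection, then a representative subset) can be illustrated with a minimal sketch. The abstract does not specify how images are embedded, which similarity measure or edge threshold is used, which community detection algorithm is applied, or how representatives are chosen, so every such choice below is an assumption made for illustration: precomputed feature vectors, cosine similarity with a fixed threshold, weighted label propagation as the community detection step, and selection of the most central members of each community. The names `build_similarity_network`, `label_propagation`, and `prune` are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_similarity_network(features, threshold):
    """Connect two images when their similarity reaches the threshold.

    Assumption: cosine similarity and a fixed threshold; the paper may
    build its network differently.
    """
    n = len(features)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine(features[i], features[j])
            if s >= threshold:
                adj[i][j] = s
                adj[j][i] = s
    return adj


def label_propagation(adj, max_iters=100):
    """Weighted label propagation, used here as a stand-in community detector.

    Each node adopts the label with the largest total edge weight among
    its neighbours; ties go to the current label, else the smallest label.
    """
    labels = {node: node for node in adj}
    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):
            if not adj[node]:
                continue  # isolated images form their own community
            weight = {}
            for nb, w in adj[node].items():
                weight[labels[nb]] = weight.get(labels[nb], 0.0) + w
            best = max(weight.values())
            candidates = [lab for lab, w in weight.items() if w == best]
            if labels[node] in candidates:
                continue
            labels[node] = min(candidates)
            changed = True
        if not changed:
            break
    return labels


def prune(features, threshold=0.9, keep_ratio=0.5):
    """Return indices of a retained subset: the most central images per community.

    Assumption: centrality is the summed similarity to community members.
    """
    adj = build_similarity_network(features, threshold)
    labels = label_propagation(adj)
    communities = {}
    for node, lab in labels.items():
        communities.setdefault(lab, []).append(node)
    kept = []
    for members in communities.values():
        mset = set(members)
        ranked = sorted(
            members,
            key=lambda i: (-sum(w for j, w in adj[i].items() if j in mset), i),
        )
        k = max(1, math.ceil(keep_ratio * len(members)))
        kept.extend(ranked[:k])
    return sorted(kept)


# Toy example: two tight groups of near-duplicate "images" plus one outlier.
features = [
    [1.0, 0.0], [0.99, 0.05], [0.98, 0.1],
    [0.0, 1.0], [0.05, 0.99], [0.1, 0.98],
    [0.7, 0.7],
]
kept = prune(features, threshold=0.95, keep_ratio=0.3)
print(kept)
```

No model is trained at any point in this sketch; it needs only feature vectors and pairwise similarities, which is the property the abstract emphasizes. Keeping at least one image per community, including singletons, is a design choice of the sketch that prevents rare appearances from being pruned away entirely.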