Training-Free Dataset Pruning for Instance Segmentation

Yalun Dai; Lingao Xiao; Ivor Tsang; Yang He

Training-Free Dataset Pruning for Instance Segmentation

Yalun Dai, Lingao Xiao, Ivor Tsang, Yang He

Published: 22 Jan 2025, Last Modified: 29 Apr 2025ICLR 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: dataset pruning, instance segmentation

TL;DR: A fast and training-free dataset pruning method for instance segmentation

Abstract: Existing dataset pruning techniques primarily focus on classification tasks, limiting their applicability to more complex and practical tasks like instance segmentation. Instance segmentation presents three key challenges: pixel-level annotations, instance area variations, and class imbalances, which significantly complicate dataset pruning efforts. Directly adapting existing classification-based pruning methods proves ineffective due to their reliance on time-consuming model training process. To address this, we propose a novel **T**raining-**F**ree **D**ataset **P**runing (**TFDP**) method for instance segmentation. Specifically, we leverage shape and class information from image annotations to design a Shape Complexity Score (SCS), refining it into a Scale-Invariant (SI-SCS) and Class-Balanced (CB-SCS) versions to address instance area variations and class imbalances, all without requiring model training. We achieve state-of-the-art results on VOC 2012, Cityscapes, and MS COCO datasets, generalizing well across CNN and Transformer architectures. Remarkably, our approach accelerates the pruning process by an average of **1349$\times$** on COCO compared to the adapted baselines.

Primary Area: applications to computer vision, audio, language, and other modalities

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 989

Loading