Holistic Consistency for Subject-level Segmentation Quality Assessment in Medical Image Segmentation
Keywords: Segmentation Quality Assessment, Unsupervised Methods, Segmentation Consistency, Trustworthy Medical AI
Abstract: A reliable and trustworthy image segmentation pipeline is central to deploying AI medical image analysis systems in clinical practice. Given a segmentation map produced by a segmentation model, the pipeline needs an automatic, accurate, and reliable method for segmentation quality assessment (SQA) when no ground truth is available. In this paper, we present a novel holistic-consistency-based method for assessing, at the subject level, the quality of segmentations produced by state-of-the-art segmentation models. Our method does not train a dedicated model on labeled samples to assess segmentation quality; instead, it systematically explores segmentation consistency in an unsupervised manner. Our approach examines the consistency of segmentation results across three major aspects: (1) consistency across sub-models; (2) consistency across models; and (3) consistency across different runs with random dropout. For a given test image, we combine the consistency scores from these three aspects into an overall consistency score that is highly correlated with the true segmentation quality score (e.g., Dice score) in terms of both linear correlation and rank correlation. Empirical results on two public datasets demonstrate that our proposed method outperforms previous unsupervised methods for subject-level SQA.
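To make the idea concrete, the following is a minimal sketch of how per-aspect consistency scores might be combined and then compared against true Dice scores. The abstract does not specify how consistency is measured or combined, so everything here is an assumption for illustration: consistency within each aspect is taken as the mean pairwise Dice overlap among binary masks, the overall score is an unweighted mean of the three aspects, and the function names (`mean_pairwise_dice`, `overall_consistency`, `pearson`, `spearman`) are hypothetical.

```python
import itertools
import numpy as np

def dice(a, b):
    """Dice overlap between two binary masks (1.0 if both are empty)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total

def mean_pairwise_dice(masks):
    """Assumed consistency measure: mean Dice over all pairs of masks."""
    pairs = itertools.combinations(masks, 2)
    return float(np.mean([dice(a, b) for a, b in pairs]))

def overall_consistency(sub_model_masks, model_masks, dropout_masks):
    """Assumed combination rule: unweighted mean of the three aspects."""
    aspects = (sub_model_masks, model_masks, dropout_masks)
    return float(np.mean([mean_pairwise_dice(m) for m in aspects]))

def pearson(x, y):
    """Linear correlation between predicted and true quality scores."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Rank correlation; this simple ranking does not average ties."""
    def rank(v):
        return np.argsort(np.argsort(v))
    return pearson(rank(x), rank(y))
```

In use, one would compute `overall_consistency` for each test subject and then report `pearson` and `spearman` between those scores and the subjects' true Dice scores on a labeled evaluation set.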
Submission Number: 9