Statistical Evaluation of Competing Models in Brain MRI Analysis

Published: 27 Apr 2024, Last Modified: 27 Apr 2024
Venue: MIDL 2024 Short Papers
License: CC BY 4.0
Keywords: SOTA, classification, segmentation, result significance, benchmarking, evaluation
Abstract: State-of-the-art models in medical image analysis are often evaluated on small datasets with a single train/test split, which may not give a reliable indication of model improvement. This practice, combined with the increasing complexity and computational cost of newer architectures, exacerbates reproducibility issues, particularly in brain MRI. This paper introduces a statistical framework for comparing models in medical imaging. We correct common misapplications of the t-test, which assumes normally distributed data, and promote the use of Two One-Sided Tests (TOST) for demonstrating non-inferiority in model optimization scenarios. Through this approach, we aim to make model performance evaluations more rigorous and to address the challenges of dataset size and test validity in medical image analysis.
Submission Number: 179
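
The abstract does not spell out the test procedure, so the following is only a minimal sketch of how a Two One-Sided Tests comparison of two models might look. It assumes paired per-fold scores (for example Dice or accuracy) and a user-chosen margin. Because the abstract cautions against relying on normality, this sketch swaps the usual t-tests for exact sign-flip permutation tests, which instead assume the paired differences are symmetric under the null hypothesis. The function names, the scores, and the margin are hypothetical and are not taken from the paper.

```python
from itertools import product


def one_sided_sign_flip_p(diffs, shift, alternative):
    """Exact sign-flip p-value for H0: mean(diffs) == shift.

    alternative="greater" tests mean(diffs) > shift,
    alternative="less" tests mean(diffs) < shift.
    Enumerates all 2**n sign patterns, so keep n small (n <= ~20).
    """
    centered = [d - shift for d in diffs]
    n = len(centered)
    observed = sum(centered) / n
    tol = 1e-12  # guards against floating-point ties
    extreme = 0
    for signs in product((1, -1), repeat=n):
        stat = sum(s * c for s, c in zip(signs, centered)) / n
        if alternative == "greater":
            extreme += stat >= observed - tol
        else:
            extreme += stat <= observed + tol
    return extreme / 2 ** n


def tost_paired(scores_new, scores_ref, margin):
    """Two one-sided tests on paired score differences (new - ref).

    p_lower small: the new model is not worse than ref by more than margin
                   (non-inferiority).
    p_upper small: the new model is not better than ref by more than margin.
    Both small (max of the two): the models are equivalent within +/- margin.
    """
    diffs = [a - b for a, b in zip(scores_new, scores_ref)]
    p_lower = one_sided_sign_flip_p(diffs, -margin, "greater")
    p_upper = one_sided_sign_flip_p(diffs, margin, "less")
    return p_lower, p_upper, max(p_lower, p_upper)


# Hypothetical per-fold scores for a reference model and a cheaper variant.
ref = [0.80, 0.82, 0.79, 0.81, 0.83, 0.80, 0.78, 0.82]
new = [r + d for r, d in zip(ref, [0.005, -0.005] * 4)]
p_lower, p_upper, p_equiv = tost_paired(new, ref, margin=0.02)
print(f"non-inferiority p={p_lower:.4f}, equivalence p={p_equiv:.4f}")
```

A non-inferiority claim needs only the lower-bound test (`p_lower`); requiring both one-sided tests to reject supports equivalence within the margin. The margin has to be fixed before looking at the results, since it determines what counts as a practically negligible difference.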