Keywords: Anomaly Segmentation, Semi-supervised Learning
TL;DR: Experiments to show that residual errors as anomaly scoring function for medical grayscale images is problematic.
Abstract: Many current state-of-the-art methods for anomaly detection in medical images rely on calculating a residual image between a potentially anomalous input image and its ("healthy") reconstruction. As the reconstruction of the unseen anomalous region should be erroneous, this yields large residuals as a score to detect anomalies in medical images. However, this assumption does not take into account residuals resulting from imperfect reconstructions of the machine learning models used. Such errors can easily overshadow residuals of interest and therefore strongly question the use of residual images as scoring function. Our work explores this fundamental problem of residual images in detail. We theoretically define the problem and thoroughly evaluate the influence of intensity and texture of anomalies against the effect of imperfect reconstructions in a series of experiments.
Registration: I acknowledge that publication of this at MIDL and in the proceedings requires at least one of the authors to register and present the work during the conference.
Authorship: I confirm that I am the author of this work and that it has not been submitted to another publication before.
Paper Type: validation/application paper
Primary Subject Area: Segmentation
Secondary Subject Area: Unsupervised Learning and Representation Learning
Confidentiality And Author Instructions: I read the call for papers and author instructions. I acknowledge that exceeding the page limit and/or altering the latex template can result in desk rejection.
Code And Data: Code: https://github.com/FeliMe/residual-score-pitfalls Data: https://www.synapse.org/#!Synapse:syn21343101/wiki/599515