\section{Discussion}

In this work, we presented an adversarial perturbation scheme that strengthens direct confidence prediction in medical image segmentation under domain shift. It is based on the idea that the alignment between predicted and actual accuracy of a segmentation model on out-of-distribution data can be improved by learning the effects of adversarial perturbations. At the same time, our training scheme widens the range of quality metric values that are observed during training, and thus facilitates the prediction of scores that are lower than those observed on the original training images.

Our approach does not require any changes to the underlying segmentation network and has negligible computational overhead during inference. Moreover, we describe an efficient training algorithm that generates the adversarial perturbations by re-using computations that are required for training the predictor anyway.

On two MRI datasets, and for both an overlap-based and a boundary-based quality score, our adversarial scheme improves on the confidence prediction baseline across all metrics. In terms of correlation and eAURC, which do not require absolute estimates, it narrows the gap to the computationally more expensive score agreement method \cite{score-agreement} and sometimes even surpasses it.

While the relative improvement from our contribution is evident, \rev{it does not yet establish a new benchmark for failure detection, and} we observe a limitation with respect to the remaining absolute errors on the PMRI dataset (Figure~\ref{fig:results}). It is unsurprising that these errors are still substantially larger than on M\&M, given that the degradations that our adversarial perturbations aim for ($\Delta_s = 0.1$) are far smaller than the real shift we observe in the data (see Figure~\ref{fig:unet-eval}). Extending our framework to include more severe perturbations is thus an obvious goal for future work. Our ablation shows that this cannot be achieved simply by scaling up $\Delta_s$, but will require a procedure that is more complex than taking a single gradient step.

\rev{In summary, we believe that our work demonstrates the potential of suitably trained direct confidence prediction, even under domain shift. Considering that inference is up to three orders of magnitude faster than with score agreement, and that we obtain absolute estimates of segmentation quality, which score agreement cannot provide, we consider further refinement of this approach to be a worthwhile goal for future research.}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "../submission"
%%% End:
