Retained Singular Values in Probabilistic Image Segmentation with Normalizing Flows and Optimal Transport

TMLR Paper240 Authors

06 Jul 2022 (modified: 17 Sept 2024) · Rejected by TMLR · CC BY 4.0
Abstract: Latent probabilistic models are a popular choice for quantifying aleatoric uncertainty in image segmentation tasks. This uncertainty is typically modeled as latent axis-aligned Normal densities. However, we find that the singular values of the modeled densities can vanish and result in a poor, inexpressive latent space. Deterministic self-supervised models have achieved state-of-the-art results by optimizing embeddings on a projected space to successfully retain the latent singular values. In this work, we extend this approach to the probabilistic setting by introducing the Conditional Sinkhorn Auto-encoder (cSAE). It is shown that with Normalizing Flows and Optimal Transport theory, we can project the latent space and improve the learned embeddings of supervised conditional probabilistic segmentation models. We show that this is due to the singular values of the learned Normal densities being better retained, thereby improving the ability to accurately model and quantify the data uncertainty.
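As a rough illustration of what "vanishing singular values" can mean for an axis-aligned Normal density, the sketch below assumes a diagonal covariance, so that its singular values are simply the per-dimension variances, and summarizes how unevenly they are spread with a Gini index. The function names and the exact Gini formula are assumptions made for this example; the paper's own evaluation may differ.

```python
def diagonal_singular_values(stddevs):
    """Singular values of a diagonal covariance matrix.

    Assumption: the latent density is axis-aligned, so the covariance is
    diag(sigma_1^2, ..., sigma_d^2) and its singular values are the variances.
    Returned in descending order, as is conventional for singular values.
    """
    return sorted((s * s for s in stddevs), reverse=True)


def gini_index(values):
    """Gini index of non-negative values (0 = evenly spread, near 1 = concentrated)."""
    xs = sorted(values)  # ascending order is required by the formula below
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum with 1-based ranks.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical latent standard deviations: one dominant axis, three nearly collapsed.
collapsed = diagonal_singular_values([1.0, 1e-3, 1e-3, 1e-3])
# Hypothetical well-spread latent space: all axes carry similar variance.
retained = diagonal_singular_values([1.0, 0.9, 1.1, 1.0])

print(gini_index(collapsed), gini_index(retained))
```

Under these assumptions, a Gini index close to its maximum of (d - 1) / d indicates that almost all variance sits on a single latent axis, while a value near 0 indicates that the singular values are retained across axes.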
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We noticed confusion arising from a lack of mathematical notation and the absence of a proper introduction. We sincerely apologize for this and have made the following general changes and additions, which hopefully address many of the points raised by the reviewers:
- updated our introduction to be more concise and straightforward
- added multiple subsections that introduce the (c)VAE and PU-Net
- described the vanishing of the singular values more explicitly
- fixed notation
- reduced blank spaces
- moved the outlier cases to the Appendix
Furthermore, we found a systematic error in the evaluation of the metrics and have fixed it. For the Empirical Wasserstein, this has only shifted the value ranges; the cSAE still performs best. The Gini index values now also better reflect what can be seen in Figures 3 and 4.
Assigned Action Editor: ~marco_cuturi2
Submission Number: 240