How To Train Your Covariance

23 Sept 2023 (modified: 11 Feb 2024), submitted to ICLR 2024
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Unsupervised Heteroscedastic Covariance Estimation, Spatial Variance, Correlation, Conditional Mean Absolute Error
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: We study the problem of _unsupervised heteroscedastic covariance estimation_, where the goal is to learn the multivariate target distribution $\mathcal{N}(y, \Sigma_y | x )$ given an observation $x$. This problem is particularly challenging as $\Sigma_{y}$ varies for different samples (heteroscedastic) and no annotation for the covariance is available (unsupervised). Typically, state-of-the-art methods predict the mean $f(x ; \theta)$ and covariance $Cov(f(x); \Theta)$ of the target distribution through two neural networks trained using the negative log-likelihood. This raises two questions: (1) Does the predicted covariance truly capture the randomness of the predicted mean? (2) In the absence of ground-truth annotation, how can we quantify the performance of covariance estimation? We address (1) by developing the __Spatial Variance__, a formulation of $Cov(f(x); \Theta)$ that captures the randomness in $ f(x ; \theta)$ by incorporating its curvature around $x$. Furthermore, we tackle (2) by introducing the _Conditional Mean Absolute Error (C-MAE)_, a metric which leverages well-known properties of the normal distribution. We verify the effectiveness of our approach through multiple experiments spanning synthetic (univariate, multivariate) and real-world datasets (UCI Regression, LSP, and MPII Human Pose Estimation). Our experiments provide evidence that our approach outperforms the state of the art across these datasets and multiple network architectures, and accurately learns the relation underlying the target random variables.
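The two ingredients the abstract names, training with the negative log-likelihood and evaluating through properties of the normal distribution, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `gaussian_nll`, `conditional_mean`, and `conditional_abs_error` are hypothetical, and the last function is only one plausible reading of a conditional error, assuming each target dimension is predicted from the remaining ones using the standard Gaussian conditioning formula $\mu_{h|o} = \mu_h + \Sigma_{ho} \Sigma_{oo}^{-1} (y_o - \mu_o)$.

```python
import numpy as np

def gaussian_nll(y, mean, cov):
    """Negative log-likelihood of one sample y under N(mean, cov)."""
    d = y.shape[0]
    chol = np.linalg.cholesky(cov)            # requires a symmetric positive definite cov
    z = np.linalg.solve(chol, y - mean)       # z = L^{-1} (y - mean)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return 0.5 * (z @ z + log_det + d * np.log(2.0 * np.pi))

def conditional_mean(mean, cov, y, observed):
    """Mean of the unobserved dimensions given the observed entries of y."""
    observed = np.asarray(observed)
    hidden = np.setdiff1d(np.arange(mean.shape[0]), observed)
    cov_ho = cov[np.ix_(hidden, observed)]
    cov_oo = cov[np.ix_(observed, observed)]
    shift = cov_ho @ np.linalg.solve(cov_oo, y[observed] - mean[observed])
    return hidden, mean[hidden] + shift

def conditional_abs_error(mean, cov, y):
    """Hypothetical conditional error: predict each dimension from all the
    others and average the absolute errors (an assumption, not the paper's
    exact C-MAE definition)."""
    d = mean.shape[0]
    errors = np.empty(d)
    for i in range(d):
        observed = np.delete(np.arange(d), i)
        _, cond = conditional_mean(mean, cov, y, observed)
        errors[i] = abs(cond[0] - y[i])
    return errors.mean()

# Toy usage with a hand-picked 2-D Gaussian (values are illustrative only).
mean = np.zeros(2)
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
y = np.array([2.0, 1.0])
nll = gaussian_nll(y, mean, cov)
score = conditional_abs_error(mean, cov, y)
```

A covariance that encodes the right correlations shifts each conditional mean toward the true value, so a lower conditional error suggests a better covariance estimate even when no ground-truth covariance is available.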
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7380