Keywords: SSL, Bayesian Inference
Abstract: Practitioners have become aware that self-supervised learning (SSL) techniques using multiple views (created through augmentation) outperform reconstruction-based methods on downstream tasks. Intuitive arguments suggest this is due to the dimensionality of the observation space. Another theoretical line of attack is through work on provable disentanglement under the assumption that the original image is recoverable from each view. We extend these arguments to the case where this assumption is dropped. To do this, we connect SSL to traditional statistical theory by casting it as a method for learning sufficient statistics. This allows us to show that, when exact recovery is not possible, SSL representations are (information-theoretically) equivalent to posterior distributions. We demonstrate, in a toy model with a known data-generating process, that even as the original data becomes corrupted by noise, the SSL representations remain correlated with the posterior distribution. We further demonstrate that the representations specifically correlate with the posterior variance, indicating that uncertainty is being encoded. We believe this viewpoint can shed new light on the question of when reconstruction methods fail, for example when likelihoods are difficult to represent but sampling is cheap and the sufficient statistics are simple.
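The paper's actual toy experiment is not shown on this page. As a rough illustration of the kind of setup the abstract describes, below is a minimal, hypothetical PyTorch sketch: a conjugate Gaussian latent-variable model whose posterior is available in closed form, two noisy views of the latent, a contrastive (SSL-style) encoder trained on the view pairs, and a check of how strongly the learned representation correlates with the analytic posterior. The encoder architecture, the distance-based InfoNCE variant, and all hyperparameters are assumptions for illustration, not the authors' construction; note also that with constant noise the posterior variance is constant, so this sketch only probes correlation with the posterior mean (a heteroscedastic noise model would be needed to probe the variance claim).

```python
# Hypothetical sketch of an SSL toy model with a known data-generating
# process; not the paper's experiment. Assumed setup: z ~ N(0, 1), each
# view is z plus independent Gaussian noise of standard deviation sigma.
import torch
import torch.nn as nn

torch.manual_seed(0)
sigma = 0.5

def sample_views(n):
    # Latent and two conditionally independent noisy views of it.
    z = torch.randn(n, 1)
    v1 = z + sigma * torch.randn(n, 1)
    v2 = z + sigma * torch.randn(n, 1)
    return z, v1, v2

def posterior_mean(view):
    # For this conjugate model, p(z | view) is Gaussian with
    # mean view / (1 + sigma^2) and variance sigma^2 / (1 + sigma^2).
    return view / (1.0 + sigma ** 2)

encoder = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def info_nce(h1, h2, temp=0.5):
    # Contrastive loss with a distance-based similarity: the matching
    # view in the batch is the positive, all other views are negatives.
    sq_dist = (h1.unsqueeze(1) - h2.unsqueeze(0)).pow(2).sum(-1)
    logits = -sq_dist / temp
    labels = torch.arange(len(h1))
    return nn.functional.cross_entropy(logits, labels)

for step in range(2000):
    _, v1, v2 = sample_views(256)
    loss = info_nce(encoder(v1), encoder(v2))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Compare the learned representation with the analytic posterior mean.
# The sign of the representation is arbitrary, so report |correlation|.
with torch.no_grad():
    _, v1, _ = sample_views(4096)
    h = encoder(v1).squeeze(-1)
    mu = posterior_mean(v1).squeeze(-1)
    corr = torch.corrcoef(torch.stack([h, mu]))[0, 1]
    print(f"|corr(representation, posterior mean)| = {corr.abs():.3f}")
```

Under these assumptions the encoder can only identify the latent up to an invertible reparameterization, which is consistent with the abstract's information-theoretic (rather than pointwise) notion of equivalence to the posterior.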
Submission Number: 107