Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing

Anonymous

Sep 29, 2021 (edited Nov 23, 2021), ICLR 2022 Conference Blind Submission, Readers: Everyone
  • Keywords: Representation learning, Computer vision, Interpretability
  • Abstract: Self-supervised visual representation learning has attracted significant research interest. While the most common way to evaluate self-supervised representations is through transfer to various downstream tasks, we instead investigate the problem of measuring their interpretability, i.e. understanding the semantics encoded in the raw representations. We formulate the latter as estimating the mutual information between the representation and a space of manually labelled concepts. To quantify this we introduce a decoding bottleneck: information must be captured by simple predictors, mapping concepts to clusters of data formed in representation space. This approach, which we call reverse linear probing, provides a single number sensitive to the semanticity of the representation. This measure is also able to detect when the representation correlates with combinations of labelled concepts (e.g. "red apple") instead of just individual attributes ("red" and "apple" separately). Finally, we also suggest that supervised classifiers can be used to automatically label large datasets with a rich space of attributes. We use these insights to evaluate a large number of self-supervised representations, ranking them by interpretability, and highlight the differences that emerge compared to the standard evaluation with linear probes.
  • One-sentence Summary: We propose quantized reverse probing as an information-theoretic measure to assess the degree to which self-supervised visual representations align with human-interpretable concepts.
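
To make the quantity described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the representation has already been quantized into cluster indices (for example by k-means, which is not shown) and that each sample carries a single concept label. It uses a plain plug-in estimate of mutual information in place of the predictor-based bottleneck the abstract describes. The function name `mutual_information` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def mutual_information(clusters, concepts):
    """Plug-in estimate of I(cluster; concept) in bits from paired samples.

    Assumption: `clusters` holds one cluster index per sample and
    `concepts` holds one concept label per sample, in the same order.
    """
    if len(clusters) != len(concepts):
        raise ValueError("clusters and concepts must have the same length")
    n = len(clusters)
    if n == 0:
        raise ValueError("at least one sample is required")

    joint = Counter(zip(clusters, concepts))   # counts of (cluster, concept) pairs
    cluster_counts = Counter(clusters)         # marginal counts per cluster
    concept_counts = Counter(concepts)         # marginal counts per concept

    mi = 0.0
    for (k, c), count in joint.items():
        # p(k, c) * log2( p(k, c) / (p(k) * p(c)) ), written with raw counts
        mi += (count / n) * math.log2(
            count * n / (cluster_counts[k] * concept_counts[c])
        )
    return mi


# Toy data (hypothetical): four samples assigned to two clusters.
clusters = [0, 0, 1, 1]
aligned = ["apple", "apple", "car", "car"]  # clusters match concepts exactly
mixed = ["apple", "car", "apple", "car"]    # clusters carry no concept information

print(mutual_information(clusters, aligned))
print(mutual_information(clusters, mixed))
```

In this toy setting the aligned labelling yields one bit of shared information and the mixed labelling yields none, which is the sense in which a higher score indicates clusters that line up with labelled concepts.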