SSLA: A Generalized Attribution Method for Interpreting Self-Supervised Learning without Downstream Task Dependency

ICLR 2025 Conference Submission 1920 Authors

19 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: Interpretability, Attribution, Self-Supervised Learning
Abstract: Self-Supervised Learning (SSL) is a crucial component of unsupervised tasks, enabling the learning of general feature representations without labeled categories. However, our understanding of SSL tasks remains limited, and it is still unclear how SSL models extract key features from raw data. Existing interpretability methods rely heavily on downstream tasks, requiring information from those tasks to explain SSL models. This reliance blurs the line between interpreting the SSL model itself and interpreting the downstream task model. Moreover, these methods often require additional samples beyond the target of interpretation, introducing extraneous information that complicates the interpretation process. In this paper, we propose three fundamental prerequisites for the interpretability of SSL tasks and design the Self-Supervised Learning Attribution (SSLA) algorithm to satisfy them. SSLA redefines the interpretability objective by introducing a feature similarity measure, which reduces the impact of the randomness inherent in SSL algorithms and yields more stable attribution results. Additionally, SSLA abstracts the interpretability process, making it independent of specific neural network architectures. To the best of our knowledge, SSLA is the first SSL interpretability method that does not rely on downstream tasks. We also design a more principled evaluation framework and establish baselines for comparative assessment. The source code for our implementation is publicly available at https://anonymous.4open.science/r/SSLA-EF85.
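To illustrate the general idea of downstream-free attribution via a feature similarity measure, the following is a minimal, hypothetical sketch; it is not the authors' SSLA algorithm. The function name `ssl_attribution`, the occlusion strategy, and the `patch` parameter are our own illustrative assumptions. The sketch scores each image patch by how much occluding it reduces the cosine similarity between the SSL encoder's features for the perturbed image and for the original image, using no labels or downstream head:

```python
import torch
import torch.nn.functional as F

def ssl_attribution(encoder, image, patch=16):
    """Hypothetical occlusion-style attribution for an SSL encoder.

    Scores each patch of `image` (C, H, W) by the drop in feature
    similarity when that patch is occluded. A higher score means the
    patch matters more to the learned representation. No downstream
    task, head, or label is involved.
    """
    encoder.eval()
    with torch.no_grad():
        ref = encoder(image.unsqueeze(0))  # (1, d) features of the original image
        _, H, W = image.shape
        heat = torch.zeros(H // patch, W // patch)
        for i in range(heat.shape[0]):
            for j in range(heat.shape[1]):
                masked = image.clone()
                masked[:, i * patch:(i + 1) * patch,
                       j * patch:(j + 1) * patch] = 0.0  # occlude one patch
                feat = encoder(masked.unsqueeze(0))
                # Attribution = similarity lost by removing this patch.
                heat[i, j] = 1.0 - F.cosine_similarity(ref, feat).item()
    return heat
```

Under these assumptions, any encoder mapping an image tensor to a feature vector (for example, a torchvision backbone with its classification head removed) could be passed as `encoder`; see the authors' repository for the actual SSLA implementation.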
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1920