SSLA: A Generalized Attribution Method for Interpreting Self-Supervised Learning without Downstream Task Dependency

ICLR 2025 Conference Submission 1920 Authors

19 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: Interpretability, Attribution, Self-Supervised Learning
Abstract: Self-Supervised Learning (SSL) is a crucial component of unsupervised tasks, enabling the learning of general feature representations without labeled categories. However, our understanding of SSL tasks remains limited, and it is still unclear how SSL models extract key features from raw data. Existing interpretability methods rely heavily on downstream tasks, requiring information from those tasks to explain SSL models. This reliance blurs the line between interpreting the SSL model itself and interpreting the downstream task model. Moreover, these methods often require additional samples beyond the target of interpretation, introducing extraneous information that complicates the interpretation process. In this paper, we propose three fundamental prerequisites for the interpretability of SSL tasks and design the Self-Supervised Learning Attribution (SSLA) algorithm to satisfy them. SSLA redefines the interpretability objective by introducing a feature similarity measure, which reduces the impact of the randomness inherent in SSL algorithms and yields more stable attribution results. Additionally, SSLA abstracts the interpretability process, making it independent of specific neural network architectures. To the best of our knowledge, SSLA is the first SSL interpretability method that does not rely on downstream tasks. We also design a more principled evaluation framework and establish baselines for comparative assessment. The source code for our implementation is publicly available at https://anonymous.4open.science/r/SSLA-EF85.
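To illustrate the general idea of downstream-free attribution via a feature similarity measure, the following is a minimal, hypothetical sketch; it is not the authors' SSLA algorithm. The function name `ssl_attribution`, the occlusion strategy, and the `patch` parameter are our own illustrative assumptions. The sketch scores each image patch by how much occluding it reduces the cosine similarity between the SSL encoder's features for the perturbed image and for the original image, using no labels or downstream head:

```python
import torch
import torch.nn.functional as F

def ssl_attribution(encoder, image, patch=16):
    """Hypothetical occlusion-style attribution for an SSL encoder.

    Scores each patch of `image` (C, H, W) by the drop in feature
    similarity when that patch is occluded. A higher score means the
    patch matters more to the learned representation. No downstream
    task, head, or label is involved.
    """
    encoder.eval()
    with torch.no_grad():
        ref = encoder(image.unsqueeze(0))  # (1, d) features of the original image
        _, H, W = image.shape
        heat = torch.zeros(H // patch, W // patch)
        for i in range(heat.shape[0]):
            for j in range(heat.shape[1]):
                masked = image.clone()
                masked[:, i * patch:(i + 1) * patch,
                       j * patch:(j + 1) * patch] = 0.0  # occlude one patch
                feat = encoder(masked.unsqueeze(0))
                # Attribution = similarity lost by removing this patch.
                heat[i, j] = 1.0 - F.cosine_similarity(ref, feat).item()
    return heat
```

Under these assumptions, any encoder mapping an image tensor to a feature vector (for example, a torchvision backbone with its classification head removed) could be passed as `encoder`; see the authors' repository for the actual SSLA implementation.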
Primary Area: interpretability and explainable AI
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1920