CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks

Tuomas Oikarinen; Tsui-Wei Weng

Published: 01 Feb 2023, Last Modified: 14 Jan 2026, ICLR 2023 notable top 25%, Readers: Everyone

Keywords: Interpretability, Explainability, Network Dissection

Abstract: In this paper, we propose CLIP-Dissect, a new technique to automatically describe the function of individual hidden neurons inside vision networks. CLIP-Dissect leverages recent advances in multimodal vision/language models to label internal neurons with open-ended concepts without the need for any labeled data or human examples. We show that CLIP-Dissect provides more accurate descriptions than existing methods for last-layer neurons, where the ground truth is available, as well as qualitatively good descriptions for hidden-layer neurons. In addition, our method is very flexible: it is model-agnostic, can easily handle new concepts, and can be extended to take advantage of better multimodal models in the future. Finally, CLIP-Dissect is computationally efficient and can label all neurons from five layers of ResNet-50 in just 4 minutes, which is more than 10$\times$ faster than existing methods. Our code is available at https://github.com/Trustworthy-ML-Lab/CLIP-dissect.
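
The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea of labeling a neuron with a concept from a multimodal embedding space. It assumes precomputed image and text embeddings (random vectors stand in for CLIP outputs here) and uses Pearson correlation as a stand-in similarity function; the function names, shapes, and the choice of correlation are illustrative assumptions, not the method or API of the released code.

```python
import numpy as np


def concept_similarity(image_emb, text_emb):
    """Cosine similarity between every probing image and every concept.

    image_emb: (n_images, d) array, text_emb: (n_concepts, d) array.
    Both are assumed to come from a multimodal model such as CLIP.
    """
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return img @ txt.T  # shape: (n_images, n_concepts)


def label_neurons(similarity, activations, concepts):
    """Pick, for each neuron, the concept whose similarity profile over the
    probing images correlates best with the neuron's activations.

    similarity: (n_images, n_concepts), activations: (n_images, n_neurons).
    Pearson correlation is an assumed stand-in scoring function.
    """
    eps = 1e-12
    s = similarity - similarity.mean(axis=0, keepdims=True)
    a = activations - activations.mean(axis=0, keepdims=True)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + eps)
    a = a / (np.linalg.norm(a, axis=0, keepdims=True) + eps)
    scores = a.T @ s  # shape: (n_neurons, n_concepts)
    best = scores.argmax(axis=1)
    return [concepts[i] for i in best], scores


# Synthetic demo: random vectors replace real CLIP embeddings.
rng = np.random.default_rng(0)
concepts = ["cat", "dog", "striped"]
text_emb = rng.normal(size=(3, 8))
image_emb = rng.normal(size=(50, 8))
P = concept_similarity(image_emb, text_emb)

# Two toy "neurons" whose activations are affine functions of one concept column.
activations = np.stack([2.0 * P[:, 1] + 3.0, 0.5 * P[:, 2] - 1.0], axis=1)
labels, scores = label_neurons(P, activations, concepts)
print(labels)
```

In practice the activations would be recorded from the network being dissected over a set of probing images, and the concept list can be any set of words, which is what makes this style of labeling open-ended.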

Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)

TL;DR: We propose an automated method for generating descriptions of the representations learned by hidden-layer neurons, leveraging the multimodal CLIP model.

Supplementary Material: zip

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/clip-dissect-automatic-description-of-neuron/code)
