[Re] On the Reproducibility of Post-Hoc Concept Bottleneck Models

TMLR Paper 2212 Authors

15 Feb 2024 (modified: 26 Apr 2024) · Decision pending for TMLR
Abstract: To obtain state-of-the-art performance, many deep artificial intelligence models sacrifice human explainability in their decision-making. One proposed solution for achieving top performance while retaining explainability is the Post-Hoc Concept Bottleneck Model (PCBM) (Yuksekgonul et al., 2023), which can convert the embeddings of any deep neural network into a set of human-interpretable concept weights. In this work, we reproduce and expand upon the findings of Yuksekgonul et al. (2023). Our results show that while most of the authors' claims and results hold, some could not be sufficiently replicated. Specifically, the claims that PCBMs preserve the performance of the original model and that they do not require a labeled concept dataset were generally reproduced, whereas the claim regarding their model-editing capabilities was not. Beyond these results, our contributions to their work include evidence that PCBMs may work for audio classification problems, verification of the interpretability of their methods, and updates to their code that supply missing implementations. The code for our implementations can be found at https://github.com/Anonymous9834257/Anonymous.
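To make the bottleneck construction concrete, below is a minimal sketch of the projection-plus-sparse-linear-head idea the abstract describes: embeddings from a frozen backbone are projected onto concept activation vectors (CAVs), and an interpretable sparse linear classifier is fit over the resulting concept scores. This is not the authors' implementation; the random embeddings, the CAV matrix, and all hyperparameters are illustrative placeholders.

```python
# Minimal PCBM-style sketch, assuming backbone embeddings are precomputed.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Placeholder data: n samples with d-dimensional embeddings,
# k concept activation vectors (CAVs), and binary labels.
n, d, k = 200, 512, 10
embeddings = rng.normal(size=(n, d))   # f(x): frozen backbone outputs
cavs = rng.normal(size=(k, d))         # rows play the role of unit-norm CAVs
cavs /= np.linalg.norm(cavs, axis=1, keepdims=True)
labels = rng.integers(0, 2, size=n)

# Bottleneck: each concept score is the dot product <f(x), c_i>
# between the embedding and a CAV c_i.
concept_scores = embeddings @ cavs.T   # shape (n, k)

# Interpretable head: a sparse (elastic-net-regularized) linear
# classifier over concept scores, so each weight attaches to a
# named concept and the decision rule can be read off directly.
head = SGDClassifier(loss="log_loss", penalty="elasticnet",
                     alpha=1e-3, l1_ratio=0.5, random_state=0)
head.fit(concept_scores, labels)
print(head.coef_)  # per-concept weights of the bottleneck classifier
```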
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
1. Discussed the significance of the reported improvements in Table 3 ("Edit Gain").
2. Defined \(\langle \cdot,\cdot \rangle\) as the dot product and made the necessary corrections in equations and notation.
3. Addressed concerns regarding equation formatting and notation consistency.
4. Clarified the intention behind representing the binary classifier as a vector \(\omega\) and corrected equation parentheses.
5. Ensured clarity on the usage of percentage points versus percent (%).
6. Linked the discussion on page 8 to the relevant tables and repeated the results of Table 1 in Table 2 for comparison.
7. Ensured that the results for the user study in Claim 3 refer to the corresponding tables.
8. Suggested the inclusion of results from "meaningful" concepts in Table 5 for comparison.
9. Added several sentences to the introduction summarizing the findings and intentions behind the random projection experiment.
10. Addressed interpretability aspects by suggesting the citation of relevant papers and further explanation of the discrepancy between CLIP and CAV concepts.
11. Added additional results utilizing a method to obtain CAVs without an explicitly annotated dataset, and adjusted the discussion to reflect these results.
12. Added an additional dataset to the audio classification results.
13. Ensured consistent terminology for the SIIM-ISIC dataset.
14. Indicated in the main text the figures and tables included in the appendix, for readability.
Assigned Action Editor: ~Sanghyuk_Chun1
Submission Number: 2212