Abstract: Feature-based methods are commonly used to explain model predictions, but these methods often implicitly assume that interpretable features are readily available. This is frequently not the case for high-dimensional data, and it can be hard even for domain experts to mathematically specify which features are important. Can we instead automatically extract groups of features that are aligned with expert knowledge? To address this gap, we present FIX (Features Interpretable to eXperts), a benchmark for measuring how well a collection of features aligns with expert knowledge. In collaboration with domain experts, we propose FIXScore, a unified expert alignment measure applicable to diverse real-world settings spanning the cosmology, psychology, and medicine domains and the vision, language, and time series data modalities. With FIXScore, we find that popular feature-based explanation methods have poor alignment with expert-specified knowledge, highlighting the need for new methods that can better identify features interpretable to experts.
Keywords: Interpretable Features, Explainability
Changes Since Last Submission: We added representative examples of features extracted by existing methods, along with commentary on how they differ from expert-aligned features (located in Appendix D).
Moreover, to give future researchers more guidance on adding new settings to the benchmark, we included a step-by-step walkthrough in the revised manuscript (located in Appendix E).
Changes Since Previous Publication: N/A
Video: https://drive.google.com/file/d/1whnGQagAQxLEPEeAESN7mLHPBArWALxe/view?usp=sharing
Code: https://github.com/BrachioLab/exlib/tree/main/fix
Assigned Action Editor: ~Hugo_Jair_Escalante1
Submission Number: 88