Language Guided Interpretable Image Recognition via Manifold Alignment

20 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission
Keywords: Explainable AI, Prototypes, Manifold Alignment
Abstract: Most work on interpretable neural networks learns semantic concepts from a single modality, such as images. Humans, however, usually learn semantic concepts from multiple modalities, and the brain encodes semantics from fused multi-modal information. Inspired by cognitive science and vision-language learning, we propose a two-stream model for learning visual semantic concepts under the guidance of natural language, where a CNN-based vision stream encodes the input image and a BERT-based language stream encodes the corresponding text description. The visual and natural language features therefore reside on different but semantically highly correlated manifolds, i.e., they follow a multi-manifold distribution. We transform the multi-manifold distribution alignment problem into updating the projection matrices by the Cayley transform on the Stiefel manifold, and we obtain better joint representations by fusing the semantically similar features from the aligned manifold. In addition, we propose a Manifold Alignment based Prototypical Part Network (MA-ProtoPNet) to learn semantic concepts from the joint representations; these concepts can capture more semantic information from multiple modalities. Experiments verify the effectiveness of the manifold alignment method, and the proposed framework can provide better interpretability and classification results.
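The abstract mentions updating projection matrices by the Cayley transform on the Stiefel manifold but gives no formulas. As a rough illustration only, the sketch below shows one generic Cayley-transform step of the kind commonly used for optimization under orthonormality constraints; it is not taken from the submission. The names `cayley_step`, `X`, `G`, and `tau` are hypothetical, and `G` stands in for the Euclidean gradient of some alignment loss that the abstract does not specify.

```python
import numpy as np


def cayley_step(X, G, tau):
    """One Cayley-transform update that keeps X on the Stiefel manifold.

    X   : (n, p) matrix with orthonormal columns (X.T @ X = I).
    G   : (n, p) Euclidean gradient of a loss at X (placeholder here).
    tau : step size.
    """
    n = X.shape[0]
    # A is skew-symmetric by construction (A.T == -A).
    A = G @ X.T - X @ G.T
    I = np.eye(n)
    # Y = (I + tau/2 A)^{-1} (I - tau/2 A) X. The Cayley transform of a
    # skew-symmetric matrix is orthogonal, so Y.T @ Y = X.T @ X = I.
    return np.linalg.solve(I + 0.5 * tau * A, (I - 0.5 * tau * A) @ X)


# Assumed toy setup: a random orthonormal projection and a random "gradient".
rng = np.random.default_rng(0)
X0, _ = np.linalg.qr(rng.standard_normal((8, 3)))
G = rng.standard_normal((8, 3))

X1 = cayley_step(X0, G, tau=0.1)
orthonormality_error = np.linalg.norm(X1.T @ X1 - np.eye(3))
```

The point of the construction is that the updated matrix stays orthonormal without an explicit re-orthogonalization step, so `orthonormality_error` should be at the level of floating-point round-off. How the submission defines its loss, gradient, and step size is not stated in the abstract.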
Supplementary Material: zip
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2416