Probabilistic Neural-Symbolic Models for Interpretable Visual Question Answering

Ramakrishna Vedantam; Stefan Lee; Marcus Rohrbach; Dhruv Batra; Devi Parikh

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: We propose a new class of probabilistic neural-symbolic models for visual question answering (VQA) that provide interpretable explanations of their decision making in the form of programs, given a small annotated set of human programs. The key idea of our approach is to learn a rich latent space that effectively propagates program annotations from known questions to novel questions. We do this by formalizing prior work on VQA, called module networks (Andreas, 2016), as discrete, structured latent variable models of the joint distribution over questions and answers given images, and we devise a procedure to train the model effectively. Our results on SHAPES (Andreas, 2016), a dataset of compositional questions, show that our model generates more interpretable programs and obtains better VQA accuracy in the low-data regime than prior work.
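
As a hedged illustration of what such a latent variable model might look like, the joint distribution over questions and answers given an image could be written by marginalizing over discrete programs, with a standard variational lower bound as a training objective. The symbols ($z$ for the program, $x$ for the question, $a$ for the answer, $i$ for the image, $q$ for an approximate posterior) and the particular factorization are assumptions made for this sketch; the abstract itself does not state them.

```latex
% Assumed factorization: program prior, question given program,
% answer given program and image.
p(x, a \mid i) = \sum_{z} p(z)\, p(x \mid z)\, p(a \mid z, i)

% Standard evidence lower bound under an assumed approximate posterior q(z | x).
\log p(x, a \mid i) \;\ge\;
  \mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z) + \log p(a \mid z, i)\big]
  - \mathrm{KL}\big(q(z \mid x) \,\|\, p(z)\big)
```

Under this reading, the small set of human program annotations would supervise $z$ directly for the annotated questions, while $z$ stays latent for all other questions.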

Keywords: Neural-symbolic models, visual question answering, reasoning, interpretability, graphical models, variational inference

TL;DR: A probabilistic neural symbolic model with a latent program space, for more interpretable question answering

Data: [SHAPES](https://paperswithcode.com/dataset/shapes-1), [Visual Question Answering](https://paperswithcode.com/dataset/visual-question-answering)
