Abstract: In this paper, we study the problem of visual reasoning in the context of textual question answering. We introduce Dynamic Spatial Memory Networks (DSMN), a new deep network architecture that specializes in answering questions that admit latent visual representations, and learns to generate and reason over such representations. Further, we propose two synthetic benchmarks, HouseQA and ShapeIntersection, to evaluate the visual reasoning capability of textual QA systems. Experimental results validate the effectiveness of our proposed DSMN for visual reasoning tasks.
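To make the high-level idea concrete, here is a minimal, hypothetical sketch of the DSMN concept described in the abstract: encode the question, generate a latent spatial ("visual") representation from it, and attend over that representation to reason toward an answer. All dimensions, weight matrices, and function names below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not taken from the paper)
d_word, d_mem, grid = 32, 64, 8  # word embedding, memory size, latent "image" is grid x grid

def encode_question(token_embs, W_q):
    """Bag-of-words question encoding (placeholder for the paper's encoder)."""
    return np.tanh(token_embs.mean(axis=0) @ W_q)           # (d_mem,)

def generate_visual_rep(q, W_v):
    """Generate a latent spatial representation from the question encoding."""
    v = np.tanh(q @ W_v)                                    # (grid*grid*d_mem,)
    return v.reshape(grid, grid, d_mem)

def reason_over_grid(q, V):
    """One step of spatial attention over the latent representation."""
    scores = V @ q                                          # (grid, grid)
    att = np.exp(scores - scores.max())
    att /= att.sum()
    return (att[..., None] * V).sum(axis=(0, 1))            # attended state (d_mem,)

# Toy forward pass with random parameters.
W_q = rng.normal(scale=0.1, size=(d_word, d_mem))
W_v = rng.normal(scale=0.1, size=(d_mem, grid * grid * d_mem))
tokens = rng.normal(size=(7, d_word))                       # a 7-word question
q = encode_question(tokens, W_q)
V = generate_visual_rep(q, W_v)
answer_state = reason_over_grid(q, V)
print(answer_state.shape)
```

In the actual model this reasoning step would be iterated and trained end-to-end; the sketch only shows the generate-then-attend pattern the abstract alludes to.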
Withdrawal: Confirmed