THINK VISUALLY: QUESTION ANSWERING THROUGH VIRTUAL IMAGERYDownload PDF

Anonymous

07 Nov 2017 (modified: 07 Apr 2024)ICLR 2018 Conference Blind SubmissionReaders: Everyone
Abstract: In this paper, we study the problem of visual reasoning in the context of textual question answering. We introduce Dynamic Spatial Memory Networks (DSMN), a new deep network architecture that specializes in answering questions that admit latent visual representations, and learns to generate and reason over such representations. Further, we propose two synthetic benchmarks, HouseQA and ShapeIntersection, to evaluate the visual reasoning capability of textual QA systems. Experimental results validate the effectiveness of our proposed DSMN for visual reasoning tasks.
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/arxiv:1805.11025/code)
Withdrawal: Confirmed
0 Replies

Loading