Abstract: Asking clarifying questions has become a key element of various conversational systems, allowing for an effective resolution of ambiguity and uncertainty through natural language questions. Despite the extensive applications of spatial information grounded dialogues, it remains an understudied area on learning to ask clarification questions with the capability of spatial reasoning. In this work, we propose a novel method, named SpatialCQ, for this problem. Specifically, we first align the representation space between textual and spatial information by encoding spatial states with textual descriptions. Then a multi-relational graph is constructed to capture the spatial relations and enable spatial reasoning with relational graph attention networks. Finally, a unified encoder is adopted to fuse the multimodal information for asking clarification questions. Experimental results on the latest IGLU dataset show the superiority of the proposed method over existing approaches.
0 Replies
Loading