How do people talk about images? A study on open-domain conversation on images.


16 Nov 2021 (modified: 05 May 2023)
ACL ARR 2021 November Blind Submission
Readers: Everyone
Abstract: Open-domain conversation on images requires a model to consider the relation and balance between utterances and images in order to generate proper responses. This paper explores how humans conduct conversations on images by investigating a well-constructed open-domain image conversation dataset, ImageChat. We examine the conversations on images from three perspectives: $\textit{image relevancy}$, $\textit{image information}$, and $\textit{utterance style}$. We show that objects in the image are indeed the most important element for conversations on images: they may be discussed directly or serve as a lead-in to off-image conversations. Thus, accurately detecting the objects in an image and knowing their attributes are essential for chatting about images. Beyond extracting the image objects, understanding the scenario depicted in the image is also a key factor in conversations on images. Based on our analysis, we propose enriching the image information with image captions and object tags, which increases the diversity and image relevancy of generated responses. We believe that our analysis provides useful insights and directions that facilitate future research on open-domain conversation on images.
