Abstract: We present NewsImages, a dataset of online news items, and the related NewsImages rematching task. The goal of NewsImages is to provide researchers with a means of studying the depiction gap, which we define as the difference between what an image literally depicts and the way in which it is connected to the text that it accompanies. Online news is a domain in which the image-text connection is known to be indirect: the news article does not describe what is literally depicted in the image. We validate NewsImages with experiments that show the usefulness of the dataset and the task for studying naturally occurring connections between image and text, as well as for addressing the challenges of the depiction gap, which include sparse data, diversity of content, and the importance of background knowledge.