Temporal Knowledge-Aware Image Captioning

Anonymous

16 Nov 2021 (modified: 05 May 2023) | ACL ARR 2021 November Blind Submission | Readers: Everyone
Abstract: Contextualized image captioning extends beyond a purely visual description of image content: it aims to produce a caption that is influenced by context and informed by real-world knowledge. In this paper, we present an approach to knowledge-aware image captioning, with a specific focus on the temporal domain. We propose a way to identify relevant information in external data sources, such as geographic databases and common knowledge bases, and to encode it in a form that is most useful for the captioning network. We develop an end-to-end caption generation system that incorporates external knowledge at several stages of the captioning process. The system is trained and tested on our novel temporal knowledge-aware captioning dataset, achieving significant improvements over multiple baselines across standard metrics. We demonstrate that our approach is effective for generating highly contextualized captions with temporal facts that are both relevant and accurate.