Challenges in Region-Specific Image Captioning: A Deep Learning Approach


16 Nov 2021 (modified: 05 May 2023) | ACL ARR 2021 November Blind Submission | Readers: Everyone
Abstract: Region-specific image captioning is the task of generating a caption for an image such that the caption describes a specific region of that image. This paper describes the challenges involved in region-specific image captioning and presents several methods that use region-specific features, in addition to features from the whole image, to improve caption quality. Our experiments on real-world data sets demonstrate that generating region-specific captions remains challenging even when region-specific information is used. We analyze the variables that affect caption quality, including the bounding box size and the region-specific feature extractor.
