Abstract: Highlights
• Attribute-driven image captioning that combines visual positioning and attribute selection.
• A pointing mechanism that merges the attribute detection results into caption generation.
• An approach that effectively associates the attentional regions with visual attributes.
• Experimental results that outperform some attribute-based state-of-the-art methods.