Parallel-fusion LSTM with synchronous semantic and visual information for image captioning

Jing Zhang, Kangkang Li, Zhe Wang

Published: 2021, Last Modified: 20 Jul 2025. J. Vis. Commun. Image Represent. 2021. CC BY-SA 4.0

Abstract: Highlights
• A novel parallel-fusion LSTM is proposed for image captioning.
• We propose two structures, pLSTM-A and pLSTM-G, according to the fusion strategy.
• pLSTM-A pays attention to the crucial information at different time steps.
• Each gate of the visual LSTM is guided by the synchronous semantic attribute in pLSTM-G.
• Experimental results show that our model outperforms state-of-the-art methods.