Fully exploring object relation interaction and hidden state attention for video captioning

Feiniu Yuan, Sipei Gu, Xiangfen Zhang, Zhijun Fang

Published: 2025, Last Modified: 16 Nov 2025, Pattern Recognit. 2025, CC BY-SA 4.0

Abstract: Highlights
• We design an Object Relation Graph Interaction (ORGI) module that captures information about objects and their relations.
• To let information flow adequately across all nodes, we construct a global node that connects to every graph node.
• We propose a hidden State and Attention Enhanced Decoder (SAED) that concatenates hidden states with updated attentions to improve next-word prediction.
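The effect of the global node can be illustrated with a minimal sketch. This is not the authors' ORGI implementation: the function names `add_global_node` and `hops_from`, the 0/1 adjacency-matrix representation, and the toy graph are all assumptions made for illustration. The sketch only shows the general idea that one extra node linked to every graph node puts any two object nodes within two hops of each other, even when the object graph itself is disconnected.

```python
from collections import deque


def add_global_node(adj):
    """Return a copy of adjacency matrix `adj` with one extra node linked to every node.

    Hypothetical helper: the paper's abstract does not specify how the global
    node is represented, so a plain 0/1 adjacency matrix is assumed here.
    """
    n = len(adj)
    new_adj = [row[:] + [1] for row in adj]  # every existing node -> global node
    new_adj.append([1] * n + [0])            # global node -> every existing node
    return new_adj


def hops_from(adj, start):
    """Breadth-first search: hop count from `start` to each node (None if unreachable)."""
    dist = [None] * len(adj)
    dist[start] = 0
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v, connected in enumerate(adj[u]):
            if connected and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical toy graph: objects 0 and 1 are related, object 2 is isolated.
object_adj = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
with_global = add_global_node(object_adj)

print(hops_from(object_adj, 0))   # [0, 1, None]  -> object 2 is unreachable
print(hops_from(with_global, 0))  # [0, 1, 2, 1]  -> object 2 reached via the global node
```

In the toy graph, object 2 receives no information from object 0 without the global node. With it, two rounds of message passing would be enough for information to travel between any pair of nodes.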

External IDs: dblp:journals/pr/YuanGZF25