TSVFN: Two-Stage Visual Fusion Network for multimodal relation extraction

Published: 01 Jan 2023, Last Modified: 11 Apr 2025. Venue: Inf. Process. Manag. 2023. License: CC BY-SA 4.0
Abstract: Highlights
• A two-stage fusion method is proposed that combines GNNs and transformers.
• The second stage exploits the information from both modalities in two ways.
• A vision-language alignment vector is employed to enhance multimodal fusion.
• The method performs well on both the full dataset and a dataset with fewer samples.