HiLINK: Hierarchical linking of context-aware knowledge prediction and prompt tuning for bilingual knowledge-based visual question answering
Abstract: Highlights
• Simplifies the two-stage training into an end-to-end structure for efficiency.
• Enables relationship learning via Bayesian network-based contextual awareness.
• Facilitates bilingual representation learning via a trainable encoder strategy.
• Exhibits superior training effectiveness in a bilingual setting over a monolingual one.
• HiLINK shows outstanding performance on BOK-VQA in all language settings.
External IDs:dblp:journals/kbs/JeongKSH25