Doc2Graph-X: A Multilingual Graph-Based Framework for Form Understanding

Souparni Mazumder, Sanket Biswas, Alloy Das, Josep Lladós

Published: 2025, Last Modified: 26 Feb 2026. Venue: GbRPR 2025. License: CC BY-SA 4.0
Abstract: Graph Neural Networks (GNNs) have advanced key information extraction (KIE) in document AI, but existing methods remain monolingual. We introduce Doc2Graph-X, a multilingual extension of Doc2Graph, leveraging word-level and sentence-level embeddings for robust cross-lingual document representation. Our framework constructs graph-based structures where a node classifier performs semantic entity recognition (SER) and an edge classifier handles relation extraction (RE) to predict links between entities. Evaluated on the XFUND dataset across seven languages, Doc2Graph-X outperforms existing baselines, demonstrating strong multilingual adaptability. Additional results on FUNSD validate its effectiveness in monolingual settings. Our approach enables structured multilingual document understanding while preserving task-agnostic adaptability. The code (https://github.com/biswassanket/doc2graph_multi.git) will be made available upon acceptance.
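To make the node/edge formulation in the abstract concrete, here is a minimal sketch of the two prediction heads over a toy form graph. It is not the authors' implementation. The k-nearest-neighbour graph construction, the two-dimensional toy features, the hand-set linear weights, and all function names are assumptions made for illustration. Fixed linear scorers stand in for the learned GNN, and no message passing is performed.

```python
import math

def center(box):
    """Center point of an (x0, y0, x1, y1) bounding box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

def knn_edges(boxes, k=1):
    """Directed edges from each node to its k nearest neighbours (assumed construction)."""
    centers = [center(b) for b in boxes]
    edges = []
    for i, ci in enumerate(centers):
        others = sorted(
            (j for j in range(len(centers)) if j != i),
            key=lambda j: math.dist(ci, centers[j]),
        )
        edges.extend((i, j) for j in others[:k])
    return edges

def classify_nodes(node_feats, weights, labels):
    """Node head (SER): pick the label with the highest linear score per node."""
    predictions = []
    for feats in node_feats:
        scores = [sum(w * f for w, f in zip(row, feats)) for row in weights]
        best = max(range(len(scores)), key=scores.__getitem__)
        predictions.append(labels[best])
    return predictions

def classify_edges(node_feats, edges, weights, threshold=0.0):
    """Edge head (RE): keep an edge if the score of its concatenated endpoint features exceeds the threshold."""
    kept = []
    for i, j in edges:
        pair = node_feats[i] + node_feats[j]  # list concatenation: source then target
        score = sum(w * f for w, f in zip(weights, pair))
        if score > threshold:
            kept.append((i, j))
    return kept

# Toy form with three text regions (hypothetical data, not from XFUND or FUNSD).
boxes = [(0, 0, 10, 5), (12, 0, 30, 5), (0, 20, 10, 25)]
# Stand-ins for the word-level and sentence-level embeddings.
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
labels = ["question", "answer"]

edges = knn_edges(boxes, k=1)
node_labels = classify_nodes(node_feats, [[1.0, 0.0], [0.0, 1.0]], labels)
links = classify_edges(node_feats, edges, [1.0, -1.0, -1.0, 1.0])
```

In this toy setup the edge weights reward a question-like source paired with an answer-like target, so only the link from the first region to the second survives. In a trained model both heads would be learned jointly on top of graph-propagated node representations.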