Cardinal Graph Convolution Framework for Document Information Extraction

2020 (modified: 27 May 2022) | DocEng 2020 | Readers: Everyone
Abstract: Graph Convolutional Networks (GCNs) have proven effective for processing pseudo-spatial graph representations of the underlying structure of documents. We present Cardinal Graph Convolutional Networks (CGCNs), an efficient and flexible extension of GCNs with cardinal-direction awareness of spatial node arrangement. The formulation of CGCNs retains the permutation invariance of traditional GCNs, ensuring that directional neighbors are involved in learning abstract representations even in the absence of a proper ordering of the nodes. We show that CGCNs achieve state-of-the-art results on an invoice information extraction task, jointly learning word-level tagging as well as document meta-level classification and regression. We also present a new multiscale Inception-like CGCN block-layer, as well as a Conv-Pool-DeConv-DePool UNet-like architecture, both of which increase the receptive field. We demonstrate the utility of CGCNs on private and public datasets against several baseline models: a sequential LSTM, a transformer classifier, non-cardinal GCNs, and an image-convolutional approach.
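To make the idea of cardinal-direction awareness concrete, here is a minimal NumPy sketch of a direction-aware graph convolution. It is not the authors' formulation, which the abstract does not spell out. It assumes each word is a node with a 2-D center in image coordinates (y grows downward), that every directed edge is assigned to one of four direction buckets by its dominant axis, and that each bucket has its own weight matrix. The names `cardinal_adjacency`, `cardinal_gcn_layer`, `w_self`, and `w_dir` are hypothetical.

```python
import numpy as np

# Direction buckets (index order): 0=right, 1=down, 2=left, 3=up.
# Assumption: image coordinates, so y grows downward.

def cardinal_adjacency(centers, edges):
    """Return a (4, n, n) stack of row-normalized adjacency matrices,
    one per cardinal direction of the neighbor relative to the node."""
    n = len(centers)
    adj = np.zeros((4, n, n))
    for i, j in edges:
        dx, dy = centers[j] - centers[i]
        # Bucket the edge by its dominant axis, then by the sign on that axis.
        if abs(dx) >= abs(dy):
            d = 0 if dx >= 0 else 2
        else:
            d = 1 if dy > 0 else 3
        adj[d, i, j] = 1.0
    # Average over the neighbors in each direction; rows with no neighbor stay zero.
    deg = adj.sum(axis=2, keepdims=True)
    return adj / np.maximum(deg, 1.0)

def cardinal_gcn_layer(features, adj, w_self, w_dir):
    """One layer: self transform plus one transform per direction, then ReLU."""
    out = features @ w_self
    for d in range(4):
        out = out + adj[d] @ features @ w_dir[d]
    return np.maximum(out, 0.0)
```

Because direction is derived from node positions rather than from node order, relabeling the nodes only permutes the rows of the output. That is the property the abstract refers to when it says the formulation keeps the permutation behavior of standard GCNs.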