Document Layout Analysis with Variational Autoencoders: An Industrial Application

Ali Youssef, Gabriele Valvano, Giacomo Veneri

Published: 2022, Last Modified: 09 Jul 2024. Venue: ISMIS 2022. License: CC BY-SA 4.0

Abstract: We present a novel method for Document Layout Analysis that detects documents that are not compliant with a given template. The major challenge we solve is dealing with a highly unbalanced dataset with only a few, hard-to-distinguish, non-compliant documents. Our model learns to detect inadequate documents based on localised non-compliant characteristics, including stamps, handwritten text, and misplaced signatures. Nevertheless, the model must not report documents containing other artefacts such as amendments or notes, which we deem acceptable. We address these challenges via generative modelling, using anomaly detection techniques to validate document layout. In particular, we first let the model learn the compliant document distribution. Then, we detect and report out-of-distribution samples for their automated rejection. In the paper, we investigate and compare two major approaches to anomaly detection: 1) classifying anomalies as those samples that cannot be accurately generated by the model; and 2) detecting samples whose mapping to a known proxy distribution is not possible. Both methods can be trained without annotations and obtain a classification accuracy of approximately 90% on real-world documents, outperforming alternative supervised solutions.
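The two anomaly-detection approaches named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a variational autoencoder has already been trained on compliant documents and yields, per document, a reconstruction `x_hat` and latent parameters `mu` and `log_var`. The mean-squared-error score, the KL divergence to a standard normal prior (standing in for the "known proxy distribution"), the quantile-based threshold, and all function names are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def reconstruction_score(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Approach 1 (assumed form): mean squared error between a flattened
    input and its reconstruction. High values suggest the model cannot
    generate the sample accurately."""
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def latent_score(mu: Sequence[float], log_var: Sequence[float]) -> float:
    """Approach 2 (assumed form): KL divergence from the encoder posterior
    N(mu, diag(exp(log_var))) to a standard normal prior N(0, I). High
    values suggest the sample does not map well onto the prior."""
    if len(mu) != len(log_var):
        raise ValueError("mu and log_var must have the same length")
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def fit_threshold(compliant_scores: Sequence[float], quantile: float = 0.95) -> float:
    """Pick a threshold from scores of compliant documents only, so no
    anomaly annotations are needed. The quantile is a hypothetical choice."""
    if not compliant_scores:
        raise ValueError("need at least one compliant score")
    ordered = sorted(compliant_scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def is_anomalous(score: float, threshold: float) -> bool:
    """Flag a document for rejection when its score exceeds the threshold."""
    return score > threshold


if __name__ == "__main__":
    # Toy scores, made up purely to exercise the functions.
    compliant = [0.10, 0.20, 0.30, 0.40]
    threshold = fit_threshold(compliant, quantile=0.5)
    candidate = reconstruction_score([0.0, 0.0], [1.0, 1.0])
    print(threshold, candidate, is_anomalous(candidate, threshold))
```

Either score can be thresholded in the same way; the sketch keeps them separate because the abstract compares the two approaches rather than combining them.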