Abstract: Despite the digitization of communication flows, information is still commonly exchanged via printed documents. Because manual information extraction is inefficient and error-prone, automatic document processing (ADP) tools have been proposed. Following current trends in machine learning, many of these tools are based on deep learning methods, which require a representative set of training documents to achieve good performance in a target domain. In practice, training documents are often scarce and limited to certain domains, which makes it difficult to train a model that generalises well across domains. This paper analyses the influence of domain shifts on the performance of document analysis tasks. It further explores the improvements that can be achieved with visual domain adaptation using generative adversarial networks (GANs). The results show that the impact of a domain shift on performance depends not only on the difference between the domains but also on the analysis task itself. While some tasks, such as document binarization, are noticeably affected by the domain shift, others, such as page classification, are less sensitive. The results also show that training on mapped data obtained from a GAN that translates between the source and target domains can improve performance considerably.