DUE: End-to-End Document Understanding Benchmark

Łukasz Borchmann; Michał Pietruszka; Tomasz Stanislawek; Dawid Jurkiewicz; Michał Turski; Karolina Szyndler; Filip Graliński

Published: 11 Oct 2021, Last Modified: 23 May 2023

NeurIPS 2021 Datasets and Benchmarks Track (Round 2)

Readers: Everyone

Keywords: Document Understanding, Multi-modal Models, Language Models, NLP, Multimodal Data, Key Information Extraction, Question Answering, Information Extraction, Table Comprehension, KIE, NLI, Visual QA, Layout-aware Language Models

TL;DR: Description of a benchmark spanning multiple end-to-end tasks related to understanding multi-modal documents with complex layouts.

Abstract: Understanding documents with rich layouts plays a vital role in digitization and hyper-automation but remains a challenging topic in the NLP research community. Additionally, the lack of a commonly accepted benchmark made it difficult to quantify progress in the domain. To empower research in this field, we introduce the Document Understanding Evaluation (DUE) benchmark consisting of both available and reformulated datasets to measure the end-to-end capabilities of systems in real-world scenarios. The benchmark includes Visual Question Answering, Key Information Extraction, and Machine Reading Comprehension tasks over various document domains and layouts featuring tables, graphs, lists, and infographics. In addition, the current study reports systematic baselines and analyzes challenges in currently available datasets using recent advances in layout-aware language modeling. We open both the benchmarks and reference implementations and make them available at https://duebenchmark.com and https://github.com/due-benchmark.

Supplementary Material: pdf

URL: https://duebenchmark.com/

Contribution Process Agreement: Yes

Dataset Url: https://duebenchmark.com

License: MIT License

Author Statement: Yes
