Survey on Question Answering over Visually Rich Documents: Methods, Challenges, and Trends

Camille Barboule, Benjamin Piwowarski, Yoan Chabot

Published: 2025, Last Modified: 28 Apr 2025, CoRR 2025, CC BY-SA 4.0

Abstract: The field of visually-rich document understanding, which involves interacting with visually-rich documents (whether scanned or born-digital), is rapidly evolving and still lacks consensus on several key aspects of the processing pipeline. In this work, we provide a comprehensive overview of state-of-the-art approaches, emphasizing their strengths and limitations, pointing out the main challenges in the field, and proposing promising research directions.