Abstract: The growing complexity and volume of legal documents call for efficient information management solutions in the legal field. Visual Question Answering (VQA) shows promise in addressing this challenge, but applying it to legal documents is difficult because of their intricate language and context. To facilitate research in this area, we introduce LawViVQA, a dataset designed specifically for visual question answering in the legal domain. LawViVQA comprises 5,200 Vietnamese legal document images covering a broad spectrum of legal documents, together with 23,000 corresponding question-answer pairs. We evaluate several baseline methods, ranging from basic neural network models to advanced Scene Text VQA and Natural Language Processing techniques. These evaluations provide insight into how effective and applicable the various approaches are for understanding and answering questions about legal document images. Our study underscores the potential of VQA as a tool for improving the management, processing, and retrieval of information from legal documents, ultimately contributing to more efficient and accurate legal practice.