Study of Question Answering on Legal Software Document using BERT based models

Ernesto Quevedo Caballero; Mushfika Sharmin Rahman; Tomas Cerny; Pablo Rivas; Gissella Bejarano

27 May 2022 (modified: 05 May 2023), LXNLP 2022, Minor revisions, Readers: Everyone

Keywords: Question Answering, Transformers, Privacy Policies

TL;DR: Study of Question Answering on Legal Software Document using BERT based models

Abstract: Transformer-based architectures have achieved remarkable success in several Natural Language Processing tasks, including Question Answering. Our research studies how different transformer-based language models perform on Question Answering over datasets specialized in the legal domain of software development, and compares that performance with general-purpose Question Answering. We experimented with the PolicyQA dataset, which is composed of documents regarding users' data handling policies and therefore falls into the software legal domain. We used BERT, ALBERT, RoBERTa, DistilBERT, and LEGAL-BERT as base encoders and compared their performance on the Question Answering benchmark dataset SQuAD V2.0 and on PolicyQA. Our results indicate that the performance of these models as contextual embedding encoders is significantly lower on PolicyQA than on SQuAD V2.0. Furthermore, we show that, surprisingly, general-domain BERT-based models such as ALBERT and BERT outperform a more domain-specific model such as LEGAL-BERT.

Submission Type: Archival

Volunteer As A Reviewer: Yes
