Keywords: BERT, Fine-Tuning, Legal Documents, Contracts, Understanding
TL;DR: Fine-tuning BERT on legal corpora provides marginal but valuable improvements on NLP tasks in the legal domain.
Abstract: Fine-tuning language models such as BERT on domain-specific corpora has proven valuable in domains like scientific papers and biomedical text. In this paper, we show that fine-tuning BERT on legal documents similarly yields valuable improvements on NLP tasks in the legal domain. This outcome is significant for analyzing commercial agreements, because obtaining large legal corpora is challenging due to their confidential nature. As such, we show that access to large legal corpora is a competitive advantage for both commercial applications and academic research on contract analysis.