Keywords: Semantics Extraction, Industry-Academia Collaboration, Transformers
TL;DR: The structure and benefits of an academic-industry collaboration for developing a complex system that can help legal actors improve their productivity
Abstract: Extracting insights from text documents and developing predictive models for analytics is critically important in several domains. However, it is a challenging task owing to the linguistic diversity of large-scale text corpora, exacerbated by a lack of labeled data. We present a case study on extracting semantics from complex legal and regulatory documents and applying them to analytical tasks such as violation detection and penalty estimation. Our system was developed through a joint academic-industry collaboration and benefited from the partners' complementary research strengths. Specifically, the domain expertise and problem formulation process of the industrial setting were combined with the exploratory research and experimental rigor of the academic setting to develop a system that can help legal actors improve their productivity. We outline our collaboration mechanism, detail the techniques used and functionalities developed, and discuss the key takeaways that can benefit the research community.