Detecting Environmental Infractions and their Impacts Caused by Industrial Sectors

Keywords: Environmental Violation, Impact Detection, Document Classification, Active Learning, Multi-Tasking Network
TL;DR: Detecting Environmental Infractions and their Impacts by Industries
Abstract: The environmental practices of an organization reflect its commitment to the environment and to societal good. Institutional investors take regulatory violations into account when making decisions, since these factors are known to affect public opinion and thereby the stock indices of companies. Typically, risk scores are derived from information published in reports filed by companies, news articles, social media posts, and analyst reactions, along with customized surveys. Though this involves processing large volumes of textual information, practitioners report little use of language technologies for information extraction and classification to detect environmental violations by organizations. In this paper, we present a transformer-based multi-task network that helps detect environmental violations in online news articles and classifies them by their respective environmental impacts. For this purpose, we have created an annotated corpus from articles published over the last 8 years, mostly by regulatory and governing agencies across different countries. Due to the paucity of data, we have adopted an active learning framework. We observed that the models perform better at each round when new, clean human annotations are added. Both the incident classification and extraction methods achieve state-of-the-art accuracy, as measured using cross-validation.
