Keywords: ESG; Completeness; Natural language processing; Dataset
Abstract: Environmental, Social, and Governance (ESG) reports serve as a platform for companies to publicly disclose their economic, environmental, and social impacts, as well as their contributions to sustainable development goals. The completeness of an ESG report is considered a crucial criterion for judging its quality and credibility, yet it is often overlooked in the existing literature. This paper aims to comprehensively assess the completeness of ESG reports by evaluating their topic coverage and text quality. To this end, we propose two classification tasks for ESG sentences: topic classification and quality classification. To train the classifiers, we collected 14,468 ESG reports from Chinese listed companies, segmented them into sentences, and labeled 8K of those sentences with both topic and text quality tags. By fine-tuning several large language models (LLMs) on this dataset for the two classification tasks, we find that the dataset has the potential to fill a gap in academic research on methods for measuring ESG completeness.
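As a rough illustration of how sentence-level topic and quality predictions could be rolled up into report-level completeness indicators, the following is a minimal sketch. The topic list, the quality label `"high"`, and the function `completeness_summary` are all hypothetical and are not taken from the paper, which does not specify its label sets or aggregation method here.

```python
from collections import defaultdict

# Hypothetical label sets for illustration only; the paper's actual
# topic taxonomy and quality labels are not given in the abstract.
TOPICS = ["environment", "social", "governance"]
HIGH_QUALITY = "high"


def completeness_summary(predictions, topics=TOPICS):
    """Aggregate per-sentence (topic, quality) predictions for one report.

    Returns the fraction of topics mentioned at least once and, for each
    covered topic, the share of its sentences labeled high quality.
    """
    counts = defaultdict(lambda: {"total": 0, "high": 0})
    for topic, quality in predictions:
        counts[topic]["total"] += 1
        if quality == HIGH_QUALITY:
            counts[topic]["high"] += 1

    covered = [t for t in topics if counts[t]["total"] > 0]
    coverage = len(covered) / len(topics)
    high_share = {t: counts[t]["high"] / counts[t]["total"] for t in covered}
    return {"coverage": coverage, "high_quality_share": high_share}


# Example: three classified sentences from a single (made-up) report.
example = [
    ("environment", "high"),
    ("environment", "low"),
    ("social", "high"),
]
summary = completeness_summary(example)
```

In this sketch a topic counts as covered if any sentence is assigned to it; a real scoring scheme might instead require a minimum number of high-quality sentences per topic.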
Supplementary Material: pdf
Submission Number: 1457