Long BERT for bankruptcy prediction

19 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: pdf
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Finance, BERT, LSTM, Unstructured data, NLP
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Parallelization and integration of BERT for processing long text information
Abstract: Most bankruptcy risk prediction models use numerical data such as financial statements, financial ratios, or stock market variables to predict the risk of a company going bankrupt. However, these models do not take advantage of the vast amount of textual information available. The few projects that work with textual information use short texts, such as tweets and news, or are limited to analyzing data from public companies. Our research focuses on predicting bankruptcy risk from the long text sequences of the annexes to the Annual Accounts. We propose a BERT-based model that can predict the risk of a company going bankrupt even when the long-form text contains no explicit information about that risk. We show that we can process segments of a document in parallel using BERT and then integrate them for a unified prediction. We use a dataset of 20,000 annexes from the Annual Accounts of non-financial companies in Luxembourg to train and validate our model. We evaluated several models; two of them achieve a validation precision of approximately 73% for predicting a risky company, and either can be used depending on document length. The model clearly learns risk information from unstructured and diverse long textual input with high precision. This is our first step towards an integrated learning model that also considers numerical and non-financial data. Our proposed architecture can be used in other domains where long text must be processed for different Natural Language Processing tasks.
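The abstract's core idea (encode fixed-size segments of a long document independently, then integrate the per-segment representations for one prediction) can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: a small randomly initialized Transformer encoder stands in for pretrained BERT, an LSTM (named in the keywords) integrates the segment vectors, and all dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class LongDocClassifier(nn.Module):
    """Sketch: encode each segment of a long document independently
    (BERT in the paper; a small Transformer encoder stands in here),
    then integrate the per-segment vectors with an LSTM to produce a
    single risky / not-risky prediction."""

    def __init__(self, vocab_size=1000, d_model=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.segment_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.integrator = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, num_segments, segment_len)
        b, s, l = token_ids.shape
        # Flatten so all segments are encoded in one parallel pass.
        x = self.embed(token_ids.view(b * s, l))
        x = self.segment_encoder(x)                # (b*s, l, d_model)
        seg_repr = x.mean(dim=1).view(b, s, -1)    # one vector per segment
        _, (h, _) = self.integrator(seg_repr)      # integrate across segments
        return self.head(h[-1])                    # unified document logits

model = LongDocClassifier()
doc = torch.randint(0, 1000, (2, 5, 32))  # 2 documents, 5 segments of 32 tokens
logits = model(doc)
print(logits.shape)  # torch.Size([2, 2])
```

In this sketch the segment encoder sees every segment at once as a larger batch, which is what makes the segment processing parallel; only the lightweight integration step is sequential.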
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1892