Accept or Deny? Evaluating LLM Performance and Fairness in Loan Approval

ACL ARR 2025 February Submission 2427 Authors

14 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) are increasingly employed in high-stakes decision-making tasks, such as loan approvals. Despite their expanding applications across various domains, LLMs continue to struggle with processing tabular data, ensuring fairness, and delivering reliable predictions. In this work, we assess the effectiveness of LLMs in loan approval, with a particular focus on their zero-shot and in-context learning (ICL) capabilities. Specifically, we evaluate the performance of several LLMs on loan approval using datasets from three geographical locations, namely Ghana, Germany, and the United States. We analyze the impact of different serialization formats, such as JSON and natural-language-like text, on model performance and fairness. Our results indicate that LLMs perform significantly worse than classical machine learning models in zero-shot classification tasks, often displaying a tendency to either approve or reject all loan applications. While ICL improves model performance by 17-27% (relative), its impact on fairness remains inconsistent. Our work underscores the importance of effective tabular data representation methods and fairness-aware models to improve the reliability of LLMs in financial decision-making.
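As an illustration of the serialization formats the abstract compares, the sketch below shows how a single tabular loan-application record might be rendered as JSON versus natural-language-like text before being placed in an LLM prompt. The field names and values are hypothetical examples, not drawn from the Ghana, German, or US datasets, and the templates are assumptions rather than the paper's exact prompts.

```python
# Minimal sketch: two serialization formats for one loan-application record.
# All field names and values below are illustrative, not from the paper's data.
import json

record = {
    "age": 34,
    "annual_income": 42000,
    "credit_history": "paid back previous loans",
    "loan_amount": 5000,
    "purpose": "small business",
}

# JSON serialization: the record is passed to the model as a JSON object.
json_prompt = json.dumps(record)

# Natural-language-like serialization: each feature becomes a short clause.
text_prompt = ", ".join(
    f"{key.replace('_', ' ')} is {value}" for key, value in record.items()
)

print(json_prompt)
print(text_prompt)
```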
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Models, Loan Approval, Ethics, Bias, and Fairness, Financial Large Language Models, Financial NLP
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 2427