Is ChatGPT a Financial Expert? Evaluating Language Models on Financial Natural Language Processing

Published: 07 Oct 2023, Last Modified: 01 Dec 2023EMNLP 2023 FindingsEveryoneRevisionsBibTeX
Submission Type: Regular Short Paper
Submission Track: NLP Applications
Submission Track 2: Interpretability, Interactivity, and Analysis of Models for NLP
Keywords: Financial Natural Language Processing, Large Language Models, ChatGPT
Abstract: The emergence of Large Language Models (LLMs), such as ChatGPT, has revolutionized general natural language preprocessing (NLP) tasks. However, their expertise in the financial domain lacks a comprehensive evaluation. To assess the ability of LLMs to solve financial NLP tasks, we present FinLMEval, a framework for Financial Language Model Evaluation, comprising nine datasets designed to evaluate the performance of language models. This study compares the performance of fine-tuned auto-encoding language models (BERT, RoBERTa, FinBERT) and the LLM ChatGPT. Our findings reveal that while ChatGPT demonstrates notable performance across most financial tasks, it generally lags behind the fine-tuned expert models, especially when dealing with proprietary datasets. We hope this study builds foundation evaluation benchmarks for continuing efforts to build more advanced LLMs in the financial domain.
Submission Number: 2825
Loading