Automated Evaluation of the Linguistic Difficulty of Conversational Texts for LLM Applications

ACL ARR 2024 April Submission446 Authors

16 Apr 2024 (modified: 02 May 2024) | ACL ARR 2024 April Submission | Readers: Everyone | License: CC BY 4.0

Abstract: There is an unmet need to evaluate the language difficulty of short, conversational passages of text, particularly for training and filtering Large Language Models (LLMs). We introduce Ace-CEFR, a novel dataset comprising 890 English conversational text passages, each annotated with its corresponding level of text difficulty. We experiment with a variety of models on Ace-CEFR, including finetuning Transformer-based models and prompting LLMs. Our best model achieves accuracy surpassing human experts and has latency appropriate to production environments. Finally, we release the Ace-CEFR dataset to the public for further research and development.

Paper Type: Long

Research Area: Machine Learning for NLP

Research Area Keywords: generative models, word embeddings, representation learning, few-shot learning, reinforcement learning

Contribution Types: Data resources, Data analysis

Languages Studied: English

Submission Number: 446
