Abstract: There is an unmet need to evaluate the language difficulty of short, conversational passages of text, particularly for training and filtering Large Language Models (LLMs). We introduce Ace-CEFR, a novel dataset comprising 890 English conversational text passages, each annotated with its level of text difficulty. We experiment with a variety of models on Ace-CEFR, including finetuning Transformer-based models and prompting LLMs. Our best model achieves accuracy surpassing that of human experts and has latency appropriate for production environments. Finally, we release the Ace-CEFR dataset to the public for further research and development.
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: generative models, word embeddings, representation learning, few-shot learning, reinforcement learning
Contribution Types: Data resources, Data analysis
Languages Studied: English
Submission Number: 446