Do LLMs Paint Within the Lines? Evaluating the Ability of LLMs to Follow Fine-grained Text Editing Instructions.
Abstract: Instruction-tuned large language models (LLMs) have shown improved performance on a variety of NLP tasks and are used extensively in many NLP applications, including writing assistants. However, little is known about their ability to consistently and rigorously follow fine-grained instructions, especially text editing instructions. In this work, we comprehensively characterize the ability of popular LLMs to follow fine-grained text editing instructions. By introducing a benchmark suite that enables controlled evaluation, we show that the performance of state-of-the-art LLMs varies and that they can struggle even on elementary text editing tasks, revealing key limitations of current LLMs. Finally, we show that further instruction tuning on text editing instruction data can be an effective approach to improving performance on both seen and unseen text editing tasks.
Paper Type: long
Research Area: NLP Applications
Contribution Types: Model analysis & interpretability
Languages Studied: English