Evaluating LLMs' capability on Satisfying Lexical Constraint

ACL ARR 2024 June Submission5813 Authors

16 Jun 2024 (modified: 02 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Lexical Constrained Generation (LCG) is a fundamental task in text generation. Recent advances in large pretrained language models (LLMs) have enabled prompt-based control for LCG. Despite growing interest in assessing LLMs' capabilities in various aspects, a thorough investigation of how well they satisfy lexical constraints is still lacking. To address this gap, we systematically analyze the performance of LLMs on satisfying lexical constraints with prompt-based control, as well as their efficacy in downstream applications (such as recipe generation, table-to-text, and profile writing). Through extensive experimentation, we identify several key observations that elucidate the limitations of LLMs in LCG, including (1) position bias, where LLMs tend to satisfy constraints that appear in specific positions within the input; (2) insensitivity to decoding parameters, which have minimal impact on the performance of LLMs; and (3) the inherent complexity of certain constraints (i.e., compound words). We conclude that there is a complexity bottleneck: LLMs still face significant challenges in consistently satisfying lexical constraints. Additionally, we introduce the Divide and Conquer Generation strategy, which is effective for both white-box and black-box LLMs and significantly enhances their performance in LCG tasks. This strategy boosts LLMs' success rate by 93% in the most challenging LCG task, which is 40% more than the baseline. Our analysis aims to provide valuable insights into the performance of LLMs in LCG, and our proposed strategy offers a pathway to more sophisticated and customized text generation applications.
Paper Type: Long
Research Area: Generation
Research Area Keywords: analysis, automatic evaluation, text-to-text generation, data-to-text generation, inference method
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 5813
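
The abstract does not spell out how the Divide and Conquer Generation strategy works, so the following is only a minimal sketch of one plausible reading: check which keywords the current text already contains, then ask the model again for only the missing ones, and repeat. The function names (`missing_keywords`, `divide_and_conquer`, `toy_generate`), the whole-word matching rule, and the round limit are all assumptions made for illustration and are not taken from the paper. `toy_generate` is a stand-in for a real LLM call.

```python
import re
from typing import Callable, List, Tuple


def missing_keywords(text: str, keywords: List[str]) -> List[str]:
    """Return keywords that do not occur in text as whole words (case-insensitive)."""
    missing = []
    for kw in keywords:
        # Assumption: a constraint counts as satisfied on an exact whole-word match.
        pattern = r"\b" + re.escape(kw) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE) is None:
            missing.append(kw)
    return missing


def divide_and_conquer(
    keywords: List[str],
    generate: Callable[[str, List[str]], str],
    max_rounds: int = 5,
) -> Tuple[str, List[str]]:
    """Repeatedly ask `generate` to extend the text with the still-missing keywords."""
    text = ""
    remaining = list(keywords)
    for _ in range(max_rounds):
        if not remaining:
            break
        # Hypothetical interface: generate(text_so_far, keywords_to_add) -> new text.
        text = generate(text, remaining)
        # Re-check every keyword, in case a rewrite dropped an earlier one.
        remaining = missing_keywords(text, keywords)
    return text, remaining


def toy_generate(text_so_far: str, todo: List[str]) -> str:
    """Stand-in for an LLM call: only manages to use the first two requested keywords."""
    addition = " and ".join(todo[:2])
    return (text_so_far + " We mention " + addition + ".").strip()


if __name__ == "__main__":
    words = ["apple", "ice cream", "river", "lamp", "cloud"]
    final_text, unmet = divide_and_conquer(words, toy_generate)
    print(final_text)
    print("unmet constraints:", unmet)
```

Because the loop only needs text in and text out, the same pattern applies whether the model is accessed through an API (black-box) or run locally (white-box); a real setup would replace `toy_generate` with a prompt that merges the missing keywords into the existing text.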