Beyond Memorization and Recitation: Evaluating LLMs on Deep Understanding of Ancient Chinese Poetry

ACL ARR 2026 January Submission7268 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: LLM, evaluation, traditional Chinese culture
Abstract: Ancient Chinese Poetry (ACP) embodies a rich cultural heritage, using concise forms to convey profound emotions. While Large Language Models (LLMs) have made rapid progress in mimicking linguistic styles and reciting verses, whether they truly understand the poets' underlying intent remains an open question. Existing work focuses primarily on knowledge-driven, surface-level understanding and does not assess the gap between rote memorization and aesthetic appreciation. To address this, we propose CP-DUE (Classical Poem – Deep Understanding Evaluation), a top-down framework that treats poetry comprehension as a five-level progressive process. CP-DUE systematically evaluates LLMs across dimensions ranging from basic cultural facts to precise word choice (Tui Qiao), hidden allusions, and overall aesthetic appreciation. Through extensive experiments comparing LLMs with human experts, we show that even advanced models struggle with the artistic nuances that define the soul of ACP. This work provides new insights into bridging the understanding gap and enhancing LLMs' competence in genuine cultural connection.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: evaluation
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: Chinese
Submission Number: 7268