What Makes Cryptic Crosswords Challenging for LLMs?

ACL ARR 2024 June Submission 4621 Authors

16 Jun 2024 (modified: 04 Aug 2024), ACL ARR 2024 June Submission, CC BY 4.0

Abstract: Cryptic crosswords are puzzles that rely on general knowledge and on the solver's ability to manipulate language at different levels, dealing with various types of wordplay. Previous research suggests that solving such puzzles is a challenge even for modern NLP models. However, the abilities of large language models (LLMs) have not yet been tested on this task. In this paper, we establish benchmark results for two popular LLMs, LLaMA3 and ChatGPT, showing that their performance on this task is still far from that of humans. We also investigate why the models struggle to perform better.

Paper Type: Short

Research Area: Interpretability and Analysis of Models for NLP

Research Area Keywords: concept explanations; human-subject application-grounded evaluations; knowledge tracing/discovering/inducing; probing

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 4621
