Abstract: In this paper, we introduce a novel task: the automated generation of linguistic puzzles. We focus on puzzles used in Linguistics Olympiads for high school students. We present results from a series of experiments using Large Language Models (LLMs), both with and without explicit reasoning capabilities, applying a range of prompting techniques. Automating puzzle generation, even for relatively simple puzzles, holds promise for expanding interest in linguistics and introducing the field to a broader audience. We also explore the use of LLMs for solving linguistic puzzles, analyzing their performance across various linguistic topics. We demonstrate that LLMs outperform humans on most puzzle types, except for those centered on writing systems and those involving understudied languages. This finding highlights the importance of linguistic puzzle generation as a research task: such puzzles can not only promote linguistics but also support the dissemination of knowledge about rare and understudied languages.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking; multilingual corpora; automatic creation and evaluation of language resources; evaluation methodologies; evaluation; datasets for low resource languages; metrics
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: Georgian, Greek, Gujarati, Spanish, and others
Submission Number: 4654