Abstract: In this paper, we introduce a novel task: the automated generation of linguistic puzzles. We focus on puzzles used in Linguistics Olympiads for high school students. We present results from a series of experiments using Large Language Models (LLMs), both with and without explicit reasoning capabilities, applying a range of prompting techniques. Automating puzzle generation, even for relatively simple puzzles, holds promise for expanding interest in linguistics and introducing the field to a broader audience. We also explore the use of LLMs for solving linguistic puzzles, analyzing their performance across various linguistic topics. We demonstrate that LLMs outperform humans on most puzzle types, except for those centered on writing systems and those involving understudied languages. This finding highlights the importance of linguistic puzzle generation as a research task: such puzzles can not only promote linguistics but also support the dissemination of knowledge about rare and understudied languages.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking; multilingual corpora; automatic creation and evaluation of language resources; evaluation methodologies; evaluation; datasets for low resource languages; metrics
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: Georgian, Greek, Gujarati, Spanish, and others
Submission Number: 4654