Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes

ACL ARR 2025 May Submission4153 Authors

19 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Humour, an often esoteric form of language, derives from myriad aspects of life, yet existing work on computational humour has focussed almost exclusively on short pun-based jokes. In this work, we investigate whether the ability of Large Language Models (LLMs) to explain humour depends on the particular humour form. Specifically, we compare models on simple puns and on more complex topical humour that requires knowledge of real-world entities and events. To this end, we curate a dataset of 600 jokes split across 4 joke types and manually write high-quality explanations. These jokes include heterographic and homographic puns, as well as contemporary internet humour and topical jokes, where understanding relies on reasoning beyond "common sense", rooted instead in world knowledge regarding news events, politics, pop culture, and more. Using this dataset, we compare the zero-shot abilities of a range of LLMs to accurately and comprehensively explain jokes of different types, identifying key research gaps in the task of humour explanation. We find that none of the tested models (including reasoning models) is capable of reliably generating adequate explanations of all joke types, further highlighting the narrow focus of most work in computational humour on overly simple joke forms.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: Data resources, Data analysis

Contribution Types: Data resources, Data analysis

Languages Studied: English

Submission Number: 4153
