Comparing the Evaluation and Production of Loophole Behavior in Children and Large Language Models

Published: 20 Jun 2023, Last Modified: 29 Jun 2023, ToM 2023
Keywords: theory of mind, pragmatics, social reasoning, loopholes, large language models, artificial intelligence
Abstract: In law, lore, and everyday life, loopholes are commonplace. When people exploit a loophole, they understand the intended meaning or goal of another, but choose to go with a different, though still possible, interpretation. Previous work suggests people exploit loopholes when their goals are misaligned with the goals of others, but both capitulation and disobedience are too costly. Past and current AI research has shown that artificial intelligence engages in what seems superficially like the exploitation of loopholes. However, this characterization is an anthropomorphization. It remains unclear to what extent current models, especially Large Language Models (LLMs), capture the pragmatic understanding required for engaging in loopholes. We examine the performance of LLMs on two metrics developed for studying loophole behavior in adults and children: evaluation (whether loopholes are rated as resulting in differential trouble compared to compliance and non-compliance), and generation (coming up with new loopholes in a given context). We conduct a fine-grained comparison of state-of-the-art LLMs to children, and find that while some LLMs rate loophole behaviors as resulting in less trouble than outright non-compliance (in line with children), they struggle to generate loopholes of their own. Our results suggest a separation between the faculties underlying the evaluation and generation of loophole behavior, in both children and LLMs, with LLM abilities dovetailing with those of the youngest children in our studies.
Submission Number: 16