Abstract: Compositional generalization tests are commonly used to estimate the compositionality of LLMs. However, such tests (1) do not examine the explanations LLMs give for the functions they fit and (2) rely on consistency with a fixed target function over a pre-partitioned test set as the criterion, which hinders obtaining explainable and convincing estimates and analyses of LLM compositionality. In this work, we propose a program-generation perspective that treats the programs generated by LLMs as externalized explanations and estimates the compositionality of LLMs with the help of complexity-based theory. This perspective addresses the explainability limitations of compositional generalization tests and offers a new way to analyze how LLMs realize compositionality. We conduct experiments and analyses of existing advanced LLMs from this perspective on a string-to-grid task, and find that LLMs exhibit a variety of compositionality characterizations as well as compositionality deficiencies.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: compositionality
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 156