Keywords: pattern matching, compositional generalization, functional equivalence, coverage, path ambiguity, mechanistic interpretability
Abstract: Despite their impressive capabilities, LLMs often exhibit surface-level pattern-matching behavior, as evidenced by OOD generalization failures on compositional tasks. However, behavioral studies commonly employ task setups that admit multiple sources of generalization (e.g., algebraic invariances, structural repetition), obscuring a precise, testable account of how well LLMs generalize through pattern matching and where this mechanism breaks down. To address this ambiguity, we first formalize pattern matching as functional equivalence, i.e., substituting input fragments that have been observed to yield identical outputs in shared contexts. We then systematically study how decoder-only Transformer and Mamba models behave on controlled tasks with compositional structure that isolates this mechanism. Our formalism yields predictive and quantitative insights: (1) Instance-wise success of pattern matching is tightly ordered by the number of contexts witnessing the relevant functional equivalence. (2) We derive, and empirically confirm, that the training data required to learn a two-hop structure grows at least quadratically with token-set size; the power-law scaling exponent matches this prediction and remains stable across a 20× range of parameter counts and across architectures. (3) Path ambiguity is a structural barrier: when a variable influences the output via multiple paths, models fail to form unified intermediate-state representations, impairing both accuracy and interpretability. (4) Chain-of-Thought reduces data requirements but does not resolve path ambiguity. Together, these results provide a predictive, falsifiable boundary for pattern matching and a foundational diagnostic for disentangling mixed generalization mechanisms.
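As a concrete illustration (not the paper's implementation), the minimal Python sketch below shows what counting "witnessing contexts" for a functional equivalence could look like: two fragments are treated as interchangeable when every shared observed context maps them to the same output, and the strength of that evidence is the number of such contexts. The toy task f, the parity-based intermediate state, and the helper witnessing_contexts are all hypothetical choices made for this sketch.

```python
# Minimal sketch of pattern matching as functional equivalence.
# Assumptions (not from the paper): a toy two-hop task f and a helper
# witnessing_contexts that counts shared contexts with identical outputs.

from itertools import product

def witnessing_contexts(observations, a, b):
    """Return contexts c where both (c, a) and (c, b) were observed
    and produced identical outputs.

    observations: dict mapping (context, fragment) -> output
    """
    contexts_a = {c for (c, x) in observations if x == a}
    contexts_b = {c for (c, x) in observations if x == b}
    shared = contexts_a & contexts_b
    return {c for c in shared if observations[(c, a)] == observations[(c, b)]}

# Toy two-hop task: the fragment is first collapsed to an intermediate
# state (here, its parity), which then combines with the context.
def f(context, fragment):
    return (fragment % 2) + context

tokens = range(6)
observations = {(c, x): f(c, x) for c, x in product(range(4), tokens)}

# Fragments 0 and 2 share the intermediate state, so every shared context
# witnesses their functional equivalence; fragments 0 and 1 do not.
print(len(witnessing_contexts(observations, 0, 2)))  # -> 4
print(len(witnessing_contexts(observations, 0, 1)))  # -> 0
```

Under this framing, the abstract's claim (1) says that a model's instance-wise success at substituting one fragment for another tracks the size of this witnessing-context set.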
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 16215