Probing Memes in LLMs: A Paradigm for the Entangled Evaluation World

ICLR 2026 Conference Submission 24981 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Meme, Large Language Model, Evaluation, Probe, Paradigm
Abstract: Current evaluations of large language models (LLMs) often treat datasets and models in isolation, obscuring phenomena that only emerge from their collective interaction. Items in datasets are reduced to labeled entries, disregarding the multidimensional properties they reveal when examined across model populations. Models, in turn, are summarized by overall scores such as accuracy, neglecting performance patterns that can only be captured through diverse data item interactions. To address this gap, this paper conceptualizes LLMs as composed of invisible memes, understood as cultural genes in the sense of Dawkins that function as replicating units of knowledge and behavior. Building on this perspective, the Probing Memes paradigm reconceptualizes evaluation as an entangled world of models and data. At its core lies the perception matrix, which captures interaction patterns and enables two complementary abstractions: probe properties, extending dataset characterization beyond labels, and phemotypes, revealing fine-grained capability structures of models. Applied to 9 datasets and 4,507 LLMs, Probing Memes reveals hidden capability structures and reveals phenomena invisible under traditional paradigms (e.g., elite models failing on problems that most models answer easily). This paradigm not only supports more informative, extensible, and fair benchmarks but also lays the foundation for population-based evaluation of LLMs.
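To make the abstract's central objects concrete, here is a minimal sketch, assuming the perception matrix is a binary correctness matrix over models × items; the names (`P`, `item_difficulty`, `phemotype`) and this exact construction are assumptions for illustration, not the paper's definitions, which may differ.

```python
import numpy as np

# Hypothetical perception matrix P: P[m, i] = 1 iff model m answers
# item i correctly. Random data stands in for real evaluation results.
rng = np.random.default_rng(0)
n_models, n_items = 4507, 500  # population scale echoing the abstract
P = (rng.random((n_models, n_items)) < 0.6).astype(int)

# Probe properties: per-item statistics across the model population,
# extending dataset characterization beyond a single gold label.
item_difficulty = 1.0 - P.mean(axis=0)   # fraction of models that fail the item
item_discrimination = P.std(axis=0)      # how strongly the item separates models

# Phemotype: a model's fine-grained response pattern over the probes,
# richer than the scalar accuracy that traditional paradigms report.
accuracy = P.mean(axis=1)                # the traditional summary score
phemotype = P                            # each row is one model's pattern

# The phenomenon the abstract mentions: elite models failing on items
# that most of the population answers easily.
elite = accuracy >= np.quantile(accuracy, 0.99)  # top 1% of models by accuracy
easy = item_difficulty <= 0.1                    # items >=90% of models solve
elite_misses_easy = (P[np.ix_(elite, easy)] == 0).sum()
print(f"easy-item failures among elite models: {elite_misses_easy}")
```

Under these assumptions, the entanglement the abstract describes is visible in the code: probe properties are column statistics of P, phemotypes are its rows, and neither exists without the other.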
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 24981