Position: Principles of Animal Cognition to Improve LLM Evaluations

Published: 01 May 2025, Last Modified: 16 Aug 2025
Venue: ICML 2025 Position Paper Track (oral)
License: CC BY-ND 4.0
TL;DR: We argue that the core principles introduced in this paper, drawn from methods in animal cognition research, can help us develop more robust evaluations for LLMs.
Abstract: It has become increasingly challenging to understand and evaluate LLM capabilities as these models exhibit a broader range of behaviors. In this position paper, we argue that LLM researchers should draw on the lessons from another field which has developed a rich set of experimental paradigms and design practices for probing the behavior of complex intelligent systems: animal cognition. We present five core principles of evaluation drawn from animal cognition research, and explain how they provide invaluable guidance for understanding LLM capabilities and behavior. We ground these principles in an empirical case study, and show how they can already provide a richer picture of one particular reasoning capability: transitive inference.
Lay Summary: As LLMs exhibit a broader range of behaviors, we may often wonder what they truly understand and what they don't. In this paper, we argue that LLM researchers should draw on lessons from another field, one that has developed a rich set of methods for probing a wide variety of intelligent behavior: animal cognition. We present five core principles of evaluation drawn from animal cognition research, and explain how they provide invaluable guidance for understanding LLM capabilities and behavior. We ground these principles in an empirical case study, and show how they can already provide a richer picture of one particular reasoning capability: transitive inference.
Primary Area: Model Understanding, Explainability, Interpretability, and Trust
Keywords: animal cognition, cognitive science
Submission Number: 258