Evaluating the World Models Used by Pretrained Learners

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: large language models, world models, transfer learning, evaluation
TL;DR: We propose and implement a framework for answering the question: what does it mean to test if a learner has a world model embodied in it?
Abstract: A common approach for assessing whether generative models develop world models is to study the behavior of fixed models. However, many of the benefits of having a world model arise when transferring a model to new tasks (e.g. few-shot learning). In this paper, we ask: what does it mean to test if a _learner_ has a world model embodied in it? We consider a simple definition of a true world model: a mapping from inputs to states. We introduce a procedure that assesses a learner's world model by measuring its inductive bias when transferring to new tasks. This inductive bias can be measured along two distinct dimensions: does a learner extrapolate to new data by building functions of state, and to what degree do these functions capture the full state? We use this procedure to study the degree to which pretrained models extrapolate to new tasks based on state. We find that models that perform very well on next-token prediction can extrapolate to new tasks with very little inductive bias toward state. We conclude by assessing the possibility that these models learn bundles of heuristics that enable them to perform well on next-token prediction while preserving little of the state.
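The abstract's core test — whether a learner extrapolates "by building functions of state" — can be illustrated with a minimal sketch. Under the paper's definition (a true world model maps inputs to states), a learner's predictions are a function of state exactly when any two inputs sharing a state receive the same prediction. The function and variable names below (`state_consistency`, the toy digit-sum state) are hypothetical illustrations, not the paper's implementation.

```python
# Hypothetical sketch: a "true world model" maps inputs to states, and a
# learner extrapolates as a function of state iff inputs with the same
# state always get the same prediction.
from itertools import combinations

def state_consistency(inputs, state_fn, predict_fn):
    """Fraction of same-state input pairs that receive the same prediction."""
    pairs = [(a, b) for a, b in combinations(inputs, 2)
             if state_fn(a) == state_fn(b)]
    if not pairs:
        return None
    agree = sum(predict_fn(a) == predict_fn(b) for a, b in pairs)
    return agree / len(pairs)

# Toy example: the "state" of a digit string is its digit sum. A learner
# predicting the parity of the sum is a true function of state; one keying
# on the last digit is a surface heuristic that ignores the state.
inputs = ["12", "21", "30", "111"]          # all have state (digit sum) 3
state = lambda s: sum(int(c) for c in s)
by_state = lambda s: state(s) % 2           # function of state
by_surface = lambda s: int(s[-1]) % 2       # heuristic on surface form

print(state_consistency(inputs, state, by_state))    # 1.0
print(state_consistency(inputs, state, by_surface))  # 1/3: heuristic breaks ties
```

A score of 1.0 indicates predictions fully consistent with state; lower scores suggest the learner relies at least partly on surface features, which is the "bundle of heuristics" possibility the abstract raises.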
Supplementary Material: zip
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10239
