Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities

Authors: ACL ARR 2024 December Submission246 Authors (anonymous)

12 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: In this work, we propose a simple theoretical framework, Pelican Soup, that aims to better explain how pretraining allows LLMs to (1) generalize to unseen instructions and (2) perform in-context learning, even when the verbalizers are irrelevant to the task. To this end, our framework introduces the notions of a "knowledge base" and a "reference-sense association," together with a simple formalism for natural language processing tasks. The framework demonstrates how studies in linguistics, psychology, and philosophy can inform our understanding of language models, and it connects to several existing theoretical results. To illustrate its use, we derive a bound on the in-context learning loss within our framework. Finally, we support the framework with empirical experiments and outline possible directions for future research.
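To make the second phenomenon concrete, the minimal Python sketch below builds an in-context learning prompt in which the verbalizers ("foo"/"bar") carry no semantic relation to the underlying sentiment task. This is a hypothetical illustration of the setting the abstract describes, not the paper's actual experimental setup; the demonstrations, verbalizer mapping, and prompt format are all assumptions made for exposition.

```python
# Hypothetical illustration (not the paper's setup): an ICL prompt
# whose verbalizers are irrelevant to the task.

# Labeled demonstrations for a sentiment task.
demonstrations = [
    ("The movie was wonderful.", "positive"),
    ("A dull, lifeless film.", "negative"),
    ("An instant classic.", "positive"),
]

# Map task labels to arbitrary verbalizers; the tokens "foo" and "bar"
# carry no cue about sentiment, so the model must infer the mapping
# from the demonstrations alone.
verbalizer = {"positive": "foo", "negative": "bar"}

def build_prompt(demos, query):
    """Format demonstrations and a query into a single ICL prompt string."""
    blocks = [f"Input: {x}\nLabel: {verbalizer[y]}" for x, y in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)

print(build_prompt(demonstrations, "I loved every minute of it."))
```

Under the framework's claim, a pretrained LLM given such a prompt should still predict "foo" for the query, despite the verbalizers being unrelated to sentiment.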
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: theory, self-supervised learning, generative models, generalization, few-shot learning
Contribution Types: Position papers, Theory
Languages Studied: English
Submission Number: 246