Two Stories in Understanding Arithmetic Learning at Scale

28 Sept 2024 (modified: 12 Oct 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Math Arithmetic, Mechanistic Understanding, Large Language Models
TL;DR: We find that language models learn arithmetic in a purely symbolic way, and we provide a holistic framework to understand how.
Abstract: Large Language Models (LLMs) have long been suspected of struggling with arithmetic learning due to the inherent differences between language modeling and numerical computation, but concrete evidence for this claim has been lacking. In this work, we test the claim through a two-pronged experiment. First, we investigate whether LLMs leverage partial products during arithmetic learning. We find that although LLMs can identify some partial products after learning, they fail to leverage them during the learning process; instead, these partial products appear to be by-products of pattern fitting. Second, we examine whether LLMs treat arithmetic in a purely symbolic manner. We decompose the task to the subgroup level (pairs of token sub-portions), hypothesizing that arithmetic learning difficulty arises from subgroup complexity, subgroup selection, and sequential prediction. We find that, with subgroup complexity controlled, LLMs treat a collection of different arithmetic operations similarly. Furthermore, arithmetic with low entropy in the subgroup label space tends to be more learnable. By analyzing position-level accuracy across different training sizes, we further explore subgroup selection. We observe that position-level accuracy follows a U-shaped pattern: LLMs quickly learn the easiest patterns at the first and last positions, while progressively learning the more difficult patterns in the middle positions. This phenomenon suggests that LLMs follow an easy-to-hard learning mechanism based on subgroup complexity. Our work confirms that LLMs are purely symbolic learners in arithmetic tasks and underscores the importance of understanding them deeply through subgroup-level quantification.
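To make the subgroup-level quantification concrete, the sketch below is a minimal illustration (not code from the paper): it assumes subgroups are aligned input digit pairs, estimates the conditional entropy of the per-position output digit given its subgroup, and computes position-level accuracy of the kind used to reveal the U-shaped pattern. Function names and the exact subgroup definition are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def subgroup_label_entropy(op, num_digits=2):
    """Average conditional entropy (bits) of the output digit at each position,
    given the aligned input digit pair (a hypothetical subgroup definition).
    Lower entropy roughly corresponds to an easier-to-fit operation."""
    cond = defaultdict(Counter)
    for a, b in product(range(10 ** num_digits), repeat=2):
        a_s, b_s = str(a).zfill(num_digits), str(b).zfill(num_digits)
        result = str(op(a, b)).zfill(num_digits + 1)
        for i in range(num_digits):
            key = (a_s[-1 - i], b_s[-1 - i])   # digit pair at position i (from the right)
            cond[key][result[-1 - i]] += 1      # label = output digit at that position
    total = sum(sum(c.values()) for c in cond.values())
    h = 0.0
    for c in cond.values():
        n = sum(c.values())
        for count in c.values():
            p = count / n
            h -= (n / total) * p * math.log2(p)
    return h


def position_accuracy(preds, targets):
    """Per-position accuracy over equal-length digit strings; plotting this
    across positions is one way to see the U-shaped learning pattern."""
    length = len(targets[0])
    correct = [0] * length
    for p, t in zip(preds, targets):
        for i in range(length):
            correct[i] += (p[i] == t[i])
    return [c / len(targets) for c in correct]


if __name__ == "__main__":
    # Multiplication mixes carries and cross terms, so its subgroup label
    # space should show higher conditional entropy than addition.
    print("addition      :", round(subgroup_label_entropy(lambda a, b: a + b), 3))
    print("multiplication:", round(subgroup_label_entropy(lambda a, b: a * b), 3))
```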
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13325