BACKTRACKING MATHEMATICAL REASONING OF LANGUAGE MODELS TO THE PRETRAINING DATA

Published: 19 Mar 2024, Last Modified: 19 Mar 2024, Tiny Papers @ ICLR 2024 (Present), License: CC BY 4.0
Keywords: pretraining data, language modeling, math capabilities
Abstract: In this study, we identify subsets of model pretraining data that contribute to the mathematical reasoning ability of language models and evaluate this ability on several mathematical tasks (e.g., addition, multiplication). We find that training on math-only data improves simple arithmetic but does not fully account for more complex reasoning abilities, such as chain-of-thought reasoning. We also find that code data contributes to chain-of-thought reasoning while reducing arithmetic performance.
Submission Number: 55
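The abstract mentions evaluating models on simple arithmetic tasks such as addition and multiplication. The paper itself does not describe its evaluation harness here, so the following is only an illustrative sketch of how such an evaluation might look, assuming a Hugging Face causal language model (the model name "gpt2", the prompt template, and the answer-extraction heuristic are all assumptions, not the authors' protocol).

```python
import random
import re
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM


def make_arithmetic_items(n_items=50, op="+", max_operand=99, seed=0):
    """Generate two-operand arithmetic questions with gold answers (illustrative format)."""
    rng = random.Random(seed)
    items = []
    for _ in range(n_items):
        a, b = rng.randint(0, max_operand), rng.randint(0, max_operand)
        answer = a + b if op == "+" else a * b
        items.append((f"Q: What is {a} {op} {b}?\nA:", str(answer)))
    return items


def first_integer(text):
    """Extract the first integer in the model's completion, if any."""
    match = re.search(r"-?\d+", text)
    return match.group(0) if match else None


def evaluate_arithmetic(model_name="gpt2", op="+"):
    # Hypothetical setup: greedy decoding, exact match on the first emitted integer.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    items = make_arithmetic_items(op=op)
    correct = 0
    for prompt, gold in items:
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            output = model.generate(
                **inputs,
                max_new_tokens=8,
                do_sample=False,
                pad_token_id=tokenizer.eos_token_id,
            )
        completion = tokenizer.decode(
            output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        correct += first_integer(completion) == gold
    print(f"{op} accuracy: {correct / len(items):.2%}")


if __name__ == "__main__":
    evaluate_arithmetic(op="+")
    evaluate_arithmetic(op="*")
```

A chain-of-thought variant of the same harness would prepend a few worked examples to the prompt and score only the final number in the completion; the sketch above covers only the direct-answer case.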