Repeated examples help learn arithmetic

Francois Charton; Julia Kempe

Published: 10 Oct 2024, Last Modified: 31 Oct 2024, MATH-AI 24, CC BY 4.0

Keywords: Transformers, arithmetic, learning

TL;DR: On two arithmetic tasks, GCD and modular multiplication, models trained on small sets of repeated examples outperform models trained on larger, single-use sets.

Abstract: We study small transformers trained on two arithmetic problems, the greatest common divisor (GCD) and modular multiplication, and show that models trained on a limited set of repeated examples achieve better performance than models trained on unlimited data. In fact, modular multiplication is only learned on small training sets. We also demonstrate that two-set training (repeated use of a small random subset of examples, alongside normal sampling on the rest of the training set) provides faster learning and better performance. These experiments highlight that the benefits of repetition can outweigh those of data diversity, and shed light on the still poorly understood interplay between generalization and memorization in deep learning.
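The two-set training idea in the abstract can be illustrated with a minimal sampling sketch for the GCD task. This is not the authors' implementation: the function names `sample_pair` and `two_set_batches`, the operand range `max_value`, the subset size `small_size`, and the mixing probability `p_small` are all hypothetical choices made for illustration. The sketch only shows how a small random subset can be revisited often while the rest of the training set is sampled normally.

```python
import math
import random


def sample_pair(rng, max_value=1_000_000):
    """Draw one GCD example: operands (a, b) and the target gcd(a, b).

    max_value is a hypothetical operand bound, not taken from the source.
    """
    a = rng.randint(1, max_value)
    b = rng.randint(1, max_value)
    return (a, b), math.gcd(a, b)


def two_set_batches(train_set, small_size, p_small, batch_size, num_batches, seed=0):
    """Yield batches mixing a small, frequently repeated subset with the rest.

    small_size and p_small are illustrative assumptions: each example in a
    batch comes from the small subset with probability p_small, and from the
    remainder of the training set otherwise.
    """
    if not 0 < small_size < len(train_set):
        raise ValueError("small_size must be between 1 and len(train_set) - 1")
    rng = random.Random(seed)
    shuffled = list(train_set)
    rng.shuffle(shuffled)  # the small subset is a random subset of the data
    small, rest = shuffled[:small_size], shuffled[small_size:]
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            pool = small if rng.random() < p_small else rest
            batch.append(rng.choice(pool))
        yield batch


if __name__ == "__main__":
    rng = random.Random(0)
    data = [sample_pair(rng) for _ in range(10_000)]
    for batch in two_set_batches(data, small_size=100, p_small=0.25,
                                 batch_size=4, num_batches=2):
        print(batch)
```

With a small `small_size` and a moderate `p_small`, examples in the small subset are seen many times more often than the others, which is the kind of repetition the abstract contrasts with single-use data.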

Concurrent Submissions: Longer version submitted to ICLR 2025

Submission Number: 77
