Rote Learning Considered Useful: Generalizing over Memorized Data in LLMs

ACL ARR 2025 July Submission 510 Authors

28 Jul 2025 (modified: 30 Aug 2025) · ACL ARR 2025 July Submission · CC BY 4.0
Abstract: Rote learning is a memorization technique based on repetition. It is commonly believed to hinder generalization by encouraging verbatim memorization rather than deeper understanding. This belief persists even for factual knowledge, whose acquisition inevitably requires a certain degree of memorization. In this work, we demonstrate that LLMs can be trained to generalize from rote memorized data. We introduce a two-phase “memorize-then-generalize” framework, in which the model first rote memorizes factual subject-object associations using a semantically meaningless token and then learns to generalize by fine-tuning on a small set of semantically meaningful prompts. Extensive experiments on 8 LLMs show that the models can reinterpret rote memorized data through the semantically meaningful prompts, as evidenced by the emergence of structured, semantically aligned latent representations between the two. This surprising finding opens the door both to effective and efficient knowledge injection and to possible risks of repurposing memorized data for malicious use.
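
The abstract describes the two-phase framework only at a high level. The minimal Python sketch below illustrates one plausible way the training data for such a setup could be constructed; the placeholder token <rel_7>, the example facts, and the prompt templates are hypothetical assumptions for illustration and are not taken from the paper.

```python
# Illustrative sketch (not the authors' code): constructing data for a
# "memorize-then-generalize" setup as described in the abstract.
# Assumptions: the meaningless relation token <rel_7>, the example facts,
# and the prompt templates are all made up for this example.

from dataclasses import dataclass


@dataclass
class Fact:
    subject: str
    obj: str


# Hypothetical subject-object associations to be rote memorized.
FACTS = [
    Fact("Marie Curie", "Warsaw"),
    Fact("Alan Turing", "London"),
]

# Phase 1: pair each fact with a semantically meaningless token, so the
# model memorizes "subject <rel_7> object" verbatim, without any semantic cue.
MEANINGLESS_TOKEN = "<rel_7>"


def phase1_rote_examples(facts):
    return [f"{f.subject} {MEANINGLESS_TOKEN} {f.obj}" for f in facts]


# Phase 2: a small set of semantically meaningful prompts that reinterpret
# the memorized associations (here, hypothetically, as a "born in" relation).
PROMPT_TEMPLATES = [
    "Where was {subject} born? {obj}",
    "{subject} was born in {obj}.",
]


def phase2_semantic_examples(facts, templates):
    return [
        t.format(subject=f.subject, obj=f.obj)
        for f in facts
        for t in templates
    ]


if __name__ == "__main__":
    print(phase1_rote_examples(FACTS))
    # Phase 2 uses only a small subset of facts; generalization would then
    # presumably be assessed on the remaining memorized facts.
    print(phase2_semantic_examples(FACTS[:1], PROMPT_TEMPLATES))
```
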
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: generalization, continual learning, fine-tuning, knowledge injection
Contribution Types: Model analysis & interpretability, Reproduction study, Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: English, German, Spanish, Chinese, Japanese
Submission Number: 510