Shared Gradient Discovery and Superposition: Learning Dynamics of Generalization in LLMs

Published: 02 Mar 2026, Last Modified: 25 Apr 2026, Sci4DL 2026, CC BY 4.0
Keywords: learning dynamics, generalization, gradients, circuits, mechanisms, mechanistic interpretability, language models, llms
Abstract: We propose shared gradient discovery and superposition as a mechanism underlying generalization in LLMs, where shared gradients lead to inherently generalizing shared solutions. To validate our hypothesis, we study circuit emergence as one form of learning such generalizing solutions. We find that our hypothesis can indeed explain and shed new light on circuit emergence and generalization.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Style Files: I have used the style files.
Challenge: This submission is an entry to the science of DL improvement challenge.
Submission Number: 113