Identifying a Circuit for Verb Conjugation in GPT-2

Published: 2025, Last Modified: 11 Sept 2025CoRR 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: I implement a procedure to isolate and interpret the sub-network (or "circuit") responsible for subject-verb agreement in GPT-2 Small. In this study, the model is given prompts where the subject is either singular (e.g. "Alice") or plural (e.g. "Alice and Bob"), and the task is to correctly predict the appropriate verb form ("walks" for singular subjects, "walk" for plural subjects). Using a series of techniques-including performance verification automatic circuit discovery via direct path patching, and direct logit attribution- I isolate a candidate circuit that contributes significantly to the model's correct verb conjugation. The results suggest that only a small fraction of the network's component-token pairs is needed to achieve near-model performance on the base task but substantially more for more complex settings.
Loading