Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small

Published: 01 Jan 2023, Last Modified: 30 Sept 2024 | ICLR 2023 | CC BY-SA 4.0