# Is this the subspace you are looking for? An Interpretability Illusion for Subspace Activation Patching

This repository contains code to reproduce results from the paper "An
Interpretability Illusion for Subspace Activation Patching".

## Indirect Object Identification (IOI) task
Description of relevant files:
- `data_utils.py`: tools for working with the IOI dataset
- `model_utils.py`: tools to intervene on TransformerLens models and train
  distributed alignment search (DAS)
- `ioi_interventions.ipynb`: notebook to train DAS and generate other directions
  of interest
- `ioi_analysis.ipynb`: notebook to analyze results of IOI interventions
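
For orientation, the core operation behind the interventions above, patching an
activation along a single direction, can be sketched as follows. This is a
minimal NumPy illustration and not the code in `model_utils.py`: the function
name `patch_along_direction` and the toy arrays are assumptions made for this
example, and the actual notebooks operate on TransformerLens activations.

```python
import numpy as np


def patch_along_direction(clean, source, direction):
    """Replace the component of `clean` along `direction` with that of `source`.

    `clean` and `source` have shape (..., d); `direction` has shape (d,).
    Hypothetical helper for illustration only.
    """
    # Normalize so that a dot product with v is a scalar projection coefficient.
    v = direction / np.linalg.norm(direction)
    # Difference between the source and clean coefficients along v.
    delta = np.asarray((source - clean) @ v)
    # Shift the clean activation along v; the orthogonal complement is untouched.
    return clean + delta[..., None] * v


# Toy example: patch along the third coordinate axis.
clean = np.array([[1.0, 2.0, 3.0]])
source = np.array([[4.0, 5.0, 6.0]])
direction = np.array([0.0, 0.0, 2.0])
patched = patch_along_direction(clean, source, direction)
print(patched)  # [[1. 2. 6.]]
```

Only the coordinate along the chosen direction changes; everything orthogonal
to it is left as in the clean activation.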

## Factual recall
Description of relevant files:
- `fact_utils.py`: tools to download necessary datasets
- `fact_patching_experiments.ipynb`: notebook to run the fact patching experiments
- `fact_patching_plots.ipynb`: notebook to recreate factual recall plots from
  the paper