# Supplementary Materials 

This README describes the supplementary materials for the paper *How Do Language Models Speak Languages? A Case Study on Unintended Code-Switching*. The following files and code provide implementation details and facilitate reproducibility:

- **Code**: We provide a complete implementation of hierarchical attribution patching, organized as follows:

   - `patching_base.py`: Base class containing core methods (data validation, activation/gradient extraction) shared by all patching variants (e.g. `AttributionPatching`, `ActivationPatching`)
   - `attribution_patching.py`: Computes neuron-to-output and neuron-to-neuron attribution scores
   - `circuit_discovery.py`: Builds circuits by iteratively adding neurons exceeding attribution thresholds
   - `metric_base.py`: Base class for different metrics
   - `effect_metrics.py`: Implements our `AttPMetric` (extends `MetricBase` class)
   - `patch_circuit.py`: Main function for running hierarchical patching; it takes the model information, data, and hyperparameters for our circuit discovery experiments

- **example_neuron_list**: We provide example lists of the neurons identified in our experiments. These neurons are used in the suppression experiments in Section 4.2 and the precise fine-tuning experiments in Section 4.4.

- **csw_circuit_visualization.html**: We provide the complete HTML file for the code-switching circuit in Figure 2. The contents of the HTML file are described below:

   - Each rounded rectangle represents a super-neuron, with its label written on the box. If an early neuron is connected to a late neuron, their corresponding super-neurons are connected by a gray line. The width of the line reflects the attribution between the super-neurons. When you hover over a super-neuron, the super-neurons connected to it are highlighted in blue, making it easier to see what contributes to this super-neuron and which super-neurons it affects.
   ![Example image](./figs/super-neuron.jpg)
   - Each circle in the rectangle represents a neuron. When you hover over a neuron, its profile is shown: the layer and neuron index, the neuron description, the neuron projection result, and the top activation samples. The blue background in each activation sample reflects the neuron's activation value, with a deeper color indicating higher activation. The maximum activation in the sentence is shown in gray at the end of each sample.
   ![Example image](./figs/neuron-profile.jpg)
   - A smaller level number indicates a closer relationship to the output. Level 1 neurons are directly connected to the output logits, and level 2 neurons connect to level 1 neurons.
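
To make the level structure and the iterative thresholding concrete, here is a minimal sketch of how a circuit could be grown level by level from precomputed attribution scores. It is not the code in `circuit_discovery.py`: the function name `build_levels`, the `(layer, index)` neuron identifiers, the dictionary layout of the scores, the use of absolute values, and the single shared threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical neuron identifier: (layer, index).
Neuron = Tuple[int, int]
Edge = Tuple[Neuron, Neuron, float]  # (source, target, attribution score)


def build_levels(
    to_output: Dict[Neuron, float],
    to_neuron: Dict[Neuron, Dict[Neuron, float]],
    threshold: float,
    max_level: int,
) -> Tuple[Dict[Neuron, int], List[Edge]]:
    """Grow a circuit level by level from the output.

    to_output maps a neuron to its neuron-to-output attribution score.
    to_neuron maps a target neuron to {source neuron: neuron-to-neuron score}.
    A neuron or edge is kept when |score| > threshold (an assumed criterion).
    """
    # Level 1: neurons whose attribution to the output exceeds the threshold.
    frontier = [n for n, s in to_output.items() if abs(s) > threshold]
    levels: Dict[Neuron, int] = {n: 1 for n in frontier}
    edges: List[Edge] = []

    level = 1
    while frontier and level < max_level:
        next_frontier: List[Neuron] = []
        for target in frontier:
            for source, score in to_neuron.get(target, {}).items():
                if abs(score) <= threshold:
                    continue
                edges.append((source, target, score))
                # A neuron keeps the smallest level at which it was first reached.
                if source not in levels:
                    levels[source] = level + 1
                    next_frontier.append(source)
        frontier = next_frontier
        level += 1
    return levels, edges


# Toy scores, invented purely for illustration.
to_output = {(30, 5): 0.9, (29, 1): 0.05, (28, 7): -0.6}
to_neuron = {
    (30, 5): {(20, 3): 0.4, (19, 9): 0.01},
    (28, 7): {(20, 3): 0.3, (10, 2): 0.7},
    (20, 3): {(5, 0): 0.5},
}
levels, edges = build_levels(to_output, to_neuron, threshold=0.1, max_level=3)
```

In this toy example, `(30, 5)` and `(28, 7)` land on level 1, `(20, 3)` and `(10, 2)` on level 2, and `(5, 0)` on level 3, mirroring how smaller level numbers sit closer to the output in the visualization.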

