# Towards Atoms of Large Language Models

This repository contains the code for the paper ***Towards Atoms of Large Language Models***.



## Requirements

```bash
# Create and activate an isolated environment, then install the dependencies.
conda create -n towards_atoms python=3.10
conda activate towards_atoms
pip install -r requirements.txt
```



## Representations Shifting

First, enter the `Representations_Shifting` directory:

```bash
cd Representations_Shifting
```

Here we use `gpt2-small` as an example: it produces complete results in approximately 2–3 hours on a 24 GB RTX 3090 GPU. In contrast, the largest model, `meta-llama/Llama-2-13b-hf`, would require roughly a week on an 80 GB A100 GPU to obtain comparable results. To use a different model, change the `--model_name` argument in the commands below. The supported options for `--model_name` are:

- `gpt2-small`, `gpt2-medium`, `gpt2-large`, `gpt2-xl`;
- `EleutherAI/gpt-j-6B`;
- `EleutherAI/pythia-1b`, `EleutherAI/pythia-1.4b`, `EleutherAI/pythia-2.8b`, `EleutherAI/pythia-6.9b`;
- `meta-llama/Llama-2-7b-hf`, `meta-llama/Llama-2-13b-hf`, `meta-llama/Meta-Llama-3-8B`, `meta-llama/Meta-Llama-3.1-8B`;
- `google/gemma-2-2b`, `google/gemma-2-9b`.

### 1. Under the Euclidean Inner Product (EIP)

Compute the cosine similarities between sample activations in each layer of the language model and store the results in the `data/activation_cos_EIP` directory.

```bash
python3 Compute_Cos_EIP.py --model_name=gpt2-small
```
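For reference, the quantity computed in this step can be sketched in a few lines. This is a minimal, dependency-free illustration of cosine similarity under the Euclidean inner product, not the contents of `Compute_Cos_EIP.py`; the function names `cosine_eip` and `pairwise_cosine_eip` are hypothetical, and activations are represented as plain lists of floats.

```python
import math


def cosine_eip(x, y):
    """Cosine similarity under the standard (Euclidean) inner product."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pairwise_cosine_eip(activations):
    """Cosine similarity between every pair of activation vectors."""
    n = len(activations)
    return [
        [cosine_eip(activations[i], activations[j]) for j in range(n)]
        for i in range(n)
    ]
```

In practice the script presumably operates on batched tensors on the GPU; the sketch only fixes the definition being evaluated.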

### 2. Under the Atomic Inner Product (AIP)

First, estimate the explicit form of the atomic inner product for the model's activations:

```bash
python3 Cache_Cov.py --model_name=gpt2-small
```

By default, the Wikipedia dataset `20220301.en` is used for this estimation, and the resulting statistics are saved in the `data/stats` directory. 
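The script name suggests that the cached statistics include an activation covariance; that is an assumption, since the script itself is not shown here. Under that assumption, one way to accumulate a mean and covariance over a dataset too large to hold in memory is to keep running sums, as in the sketch below. The class `RunningCovariance` is hypothetical and uses plain Python lists to stay dependency-free.

```python
class RunningCovariance:
    """Accumulates first and second moments one activation vector at a time."""

    def __init__(self, dim):
        self.n = 0
        self.sum = [0.0] * dim                         # running sum of x
        self.outer = [[0.0] * dim for _ in range(dim)]  # running sum of x x^T

    def update(self, x):
        self.n += 1
        for i, xi in enumerate(x):
            self.sum[i] += xi
            row = self.outer[i]
            for j, xj in enumerate(x):
                row[j] += xi * xj

    def mean(self):
        return [s / self.n for s in self.sum]

    def covariance(self):
        """Population covariance: E[x x^T] - mean mean^T."""
        m = self.mean()
        d = len(m)
        return [
            [self.outer[i][j] / self.n - m[i] * m[j] for j in range(d)]
            for i in range(d)
        ]
```

Only the running sums need to be stored between batches, so memory use depends on the activation dimension rather than on the number of tokens processed.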

Subsequently, cosine similarities between sample activations of each model layer are computed under the atomic inner product and stored in the `data/activation_cos_AIP` directory.

```bash
python3 Compute_Cos_AIP.py --model_name=gpt2-small
```
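The exact form of the atomic inner product is defined in the paper and estimated by `Cache_Cov.py`. As a rough illustration only, the sketch below assumes it can be written as `x^T M y` for some symmetric positive-definite matrix `M`; both that form and the names `inner` and `cosine_under` are assumptions made for this example, not the repository's implementation.

```python
import math


def inner(x, y, M):
    """Generalized inner product <x, y>_M = x^T M y (M is a placeholder matrix)."""
    return sum(
        x[i] * M[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def cosine_under(x, y, M):
    """Cosine similarity measured with the inner product induced by M."""
    return inner(x, y, M) / math.sqrt(inner(x, x, M) * inner(y, y, M))
```

With `M` set to the identity matrix this reduces to the Euclidean cosine similarity, which makes the two settings directly comparable: vectors that are orthogonal under one inner product need not be orthogonal under the other.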

### 3. Experimental Results

The complete evaluation procedure and results are provided in `notebooks/Experimental_Results.ipynb`.



## Knowledge Atomization (Training)

From the repository root, enter the `Knowledge_Atomization` directory:

```bash
cd Knowledge_Atomization
```

Here we use `google/gemma-2-2b` as an example, which allows the full pipeline to run relatively quickly on a 24 GB RTX 3090 GPU. Running the same process with `google/gemma-2-9b` or `meta-llama/Meta-Llama-3.1-8B` requires substantially more memory, peaking at around 40 GB, but all experiments remain feasible on an 80 GB A100 GPU.

### 1. Data Collection

First, ensure that the `data/` directory contains the `counterfact.json` file. Then run:

```bash
python3 Data_Collection.py --model_name=google/gemma-2-2b --dataset=Counterfact
```

This step retrieves the activations of all entities in the CounterFact dataset across all layers of the specified language model and stores them in the `data/Representation` directory.

### 2. Train SAEs

```bash
python3 Train_SAEs.py --model_name=google/gemma-2-2b --layer=0 --scaling_factor=4
```

Here, the `--layer` option specifies the layer whose activations are used for training, and `--scaling_factor` sets the expansion factor relative to the activation dimensionality: the SAE width equals the activation dimension multiplied by this factor. The trained model is saved in the `saved_models/` directory.
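The relationship between the two options can be made concrete with a short sketch. The helpers `sae_width` and `train_commands` are hypothetical conveniences, not part of the repository; they only reuse the flags shown above, and the dimension in the usage note is an illustrative value.

```python
def sae_width(activation_dim, scaling_factor):
    """SAE width = activation dimension x expansion factor."""
    return activation_dim * scaling_factor


def train_commands(model_name, layers, scaling_factor):
    """Build one Train_SAEs.py invocation per layer (argument lists, not run)."""
    return [
        [
            "python3",
            "Train_SAEs.py",
            f"--model_name={model_name}",
            f"--layer={layer}",
            f"--scaling_factor={scaling_factor}",
        ]
        for layer in layers
    ]
```

For example, `sae_width(2304, 4)` gives a width of 9216 for a 2304-dimensional activation. Each list returned by `train_commands("google/gemma-2-2b", range(3), 4)` can be passed to `subprocess.run(cmd, check=True)` to train the layers one after another.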

### 3. Evaluations

The complete evaluation procedure and results are provided in the notebook `notebooks/Evaluations.ipynb`.