This is for Qwen3.

Differences:
- 
- Qwen3 requires transformers==4.51.1, so the modeling files and helper functions are not compatible with those used with older versions (e.g., for LLaMA2).
- Config files and adversarial re-training attack scripts are updated for Qwen3 only.
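
As a quick sanity check before running anything, the pinned version can be compared against what is installed. The following is a minimal sketch, not part of this repository: the helper names `transformers_version_ok` and `installed_transformers_version` are hypothetical, and the strict equality check is an assumption based on the `==4.51.1` pin above.

```python
from importlib import metadata
from typing import Optional

# Version pinned in this README; strict equality mirrors the "==" pin.
REQUIRED_TRANSFORMERS = "4.51.1"


def transformers_version_ok(installed: str, required: str = REQUIRED_TRANSFORMERS) -> bool:
    """Return True when the installed version string equals the pinned one."""
    return installed.strip() == required


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if it is not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


found = installed_transformers_version()
if found is None:
    print("transformers is not installed")
elif transformers_version_ok(found):
    print(f"transformers {found} matches the pinned version")
else:
    print(f"transformers {found} found, but {REQUIRED_TRANSFORMERS} is expected")
```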

Usage:
- 
- Set up the environment (huggingface-cli is needed for gated models such as llama2-7b):
```
pip install -r requirements.txt
cd modules/eval/lm-evaluation-harness
pip install -e . 
huggingface-cli login --token YOUR_HUGGINGFACE_TOKEN
```

Process:
-
We take llama2-7b as an example:

1. Obtain the critical layer and cosine similarity. (You may need to change config_path to an absolute path for this to work.)
```
# Please change the paths inside this config to reflect the actual location of this tellmate folder. We assume it is located under /root.
python main.py --config_path tellmate/external_code/tee_train/scripts/tee_analysis_llama2.yml
```
The critical layer result will be:
```
pruners.wanda_sp - INFO - critical layer index is: 1.
```
And the cosine similarity analysis will be:
```
pruners.wanda_sp - INFO - critical layer and cosine similarity processed.
pruners.wanda_sp - INFO - The final order of protection is: [[1, 0], [1, 0, 2], [1, 0, 2, 3], [1, 0, 2, 3, 4], [1, 0, 2, 3, 4, 5], [1, 0, 2, 3, 4, 5, 6], [1, 0, 2, 3, 4, 5, 6, 7], [1, 0, 2, 3, 4, 5, 6, 7, 8], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], [1, 0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]].
```
So TeLLMate's analysis result is:

TeLLMate-C: layer 1

To get TeLLMate-F, we then need to attack these layer groups in order: [1, 0] -> [1, 0, 2] -> [1, 0, 2, 3] -> ...
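
Since the protection order is printed as one long log line, it can be convenient to parse it rather than copy it by hand. The sketch below is only an illustration: the function name `parse_protection_order` is hypothetical, and the regular expression assumes the log line has exactly the format shown in the sample output above.

```python
import ast
import re

# Pattern derived from the sample log line; the real log format may differ.
ORDER_PATTERN = re.compile(r"The final order of protection is: (\[\[.*\]\])")


def parse_protection_order(log_text):
    """Return the list of layer groups found in the log text, or None if absent."""
    match = ORDER_PATTERN.search(log_text)
    if match is None:
        return None
    # literal_eval safely turns the printed nested list into Python lists.
    return ast.literal_eval(match.group(1))


# Shortened sample in the same format as the log line above.
sample = (
    "pruners.wanda_sp - INFO - The final order of protection is: "
    "[[1, 0], [1, 0, 2], [1, 0, 2, 3]]."
)
groups = parse_protection_order(sample)
for round_no, group in enumerate(groups, start=1):
    print(f"attack round {round_no}: layers {group}")
```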

2. Perform attack.
```
cd external_code/attack_mlp

# Modify modeling_llama.py in this folder to set the input and output layer indices used by the training-based attack.
# line 1038, "if idx == 0:", marks the input as coming from the first layer
# line 1046, "if idx == 1:", marks the output as coming from the second layer

# Start the attack.
# If the attack targets more than 3 layers, modify line 104 and set auto_config.num_hidden_layers to the number of layers you need.
python train.py
```
The result will be saved as 'sub_mlp_llama2-7b_0-1.pth'.

3. Evaluate the attack
```
cd ../..

# Please change the paths inside this config to reflect the actual location of this tellmate folder.
# Also remember to set external_weight_config.path to your trained attack result and external_weight_config.index to the layers you want to attack.

# by default the log path will be denseLora/baseline_uniform_wanda_TEE...
python main.py --config_path external_code/tee_train/scripts/tee_prune_analysis_llama2.yml
```
In the log path, you can find the perplexity result as "sp_XXX_ppl.pth" and a JSON file containing the CMQA evaluation results.

You can retrieve the perplexity result using:
```
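import torch
print(torch.load("sp_XXX_ppl.pth"))  # replace sp_XXX_ppl.pth with the actual file name in your log path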
```
or check the log text file for entries such as:
```
modules.eval.setup_eval - INFO - {'wikitext2': 4710.5392844805, 'ptb': 3559.05686121415}
```
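
The perplexity entries in the log text file can also be extracted programmatically. This is a rough sketch under the assumption that each entry looks exactly like the sample line above (logger name `modules.eval.setup_eval` followed by a printed Python dict); the helper name `parse_perplexities` is hypothetical and not part of this repository.

```python
import ast
import re

# Pattern derived from the sample log entry; adjust it if your log format differs.
PPL_PATTERN = re.compile(r"modules\.eval\.setup_eval - INFO - (\{.*\})")


def parse_perplexities(log_text):
    """Return every dict-shaped perplexity entry found in the log text."""
    results = []
    for match in PPL_PATTERN.finditer(log_text):
        try:
            value = ast.literal_eval(match.group(1))
        except (ValueError, SyntaxError):
            continue  # skip lines that only resemble a perplexity entry
        if isinstance(value, dict):
            results.append(value)
    return results


sample_log = (
    "modules.eval.setup_eval - INFO - "
    "{'wikitext2': 4710.5392844805, 'ptb': 3559.05686121415}\n"
)
for entry in parse_perplexities(sample_log):
    for dataset, ppl in entry.items():
        print(f"{dataset}: {ppl:.2f}")
```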
4. If the attack succeeds, add more layers following the order from step 1, then repeat steps 2 and 3.

5. If the attack fails, the current layer group (e.g., [1, 0] for Llama2-7B) is the one suggested by TeLLMate-F.
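
The iteration over layer groups in steps 2 to 5 can be summarized as a small search loop. The sketch below only illustrates the control flow: `find_tellmate_f` and the `attack_succeeds` callback are hypothetical names, and in practice deciding whether an attack succeeded means running steps 2 and 3 and inspecting the evaluation results yourself. The toy callback used here is a stand-in, not a real success criterion.

```python
from typing import Callable, List, Optional, Sequence


def find_tellmate_f(
    protection_order: Sequence[Sequence[int]],
    attack_succeeds: Callable[[Sequence[int]], bool],
) -> Optional[List[int]]:
    """Return the first layer group on which the attack fails, or None if none does."""
    for group in protection_order:
        # A successful attack means this group is not enough; try the next, larger one.
        if not attack_succeeds(group):
            return list(group)
    return None


# First few groups from the protection order in step 1.
order = [[1, 0], [1, 0, 2], [1, 0, 2, 3]]

# Toy stand-in: pretend the attack succeeds only on groups with fewer than 3 layers.
result = find_tellmate_f(order, lambda group: len(group) < 3)
print(result)
```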