# 📝 Method and Model Configs 📝

In this doc file, we describe the method and model configs in the `configs` folder. The pipeline configs are described in the [evaluation pipeline docs](./evaluation_pipeline.md)

## Model Configs
Model configs are used for loading target models and their tokenizers. They are also used for dynamic experiment configs (described below). A subset of the `configs/model_configs/models.yaml` file is shown below.

```yaml
llama2_7b:
  model:
    model_name_or_path: meta-llama/Llama-2-7b-chat-hf
    use_fast_tokenizer: False
    dtype: float16
  num_gpus: 1
  model_type: open_source

qwen_7b_chat:
  model:
    model_name_or_path: Qwen/Qwen-7B-Chat
    use_fast_tokenizer: True
    dtype: bfloat16
    pad_token: <|extra_0|>
    eos_token: <|im_end|>
    trust_remote_code: True
  num_gpus: 1
  model_type: open_source

gpt-3.5-turbo-1106:
  model:
    model_name_or_path: gpt-3.5-turbo-1106
    token: token
  model_type: closed_source

qwen_vl_chat:
  model:
    model_name_or_path: Qwen_VL_Chat
  model_type: open_source_multimodal
```

The model names shown above are `llama2_7b`, `qwen_7b_chat`, `gpt-3.5-turbo-1106`, and `qwen_vl_chat`. When a model name from this file is specified to `generate_completions.py`, the corresponding model config dictionary is loaded and used for generating completions.

#### Note for adding new models
We manually check for how to handle the specific model config dictionary in `load_generation_function`. New types of models (e.g., new APIs) can be incorporated here and in `api_models.py` and `eval_utils.py`. No modification of these files is required for transformers-compatible models. The configs for two transformers-compatible models are shown above: Llama 2 7B Chat and Qwen-7B-Chat. The `model` field of their config dictionaries is passed to `load_model_and_tokenizer` in `baselines/model_utils.py`. Other fields are used in other parts of the evaluation pipeline (e.g., `num_gpus` is used to initialize vLLM models).

## Method Configs
Method config files are used solely for initializing classes that define a red teaming method. Each red teaming method/attack has a corresponding class that determines its name, e.g., `GCG`. The corresponding method config is assumed to be in `configs/method_configs/{method_name}_config.yaml` by default. For example, the GCG config is in `configs/method_configs/GCG_config.yaml`. A modified version of this GCG config file is shown below.

```yaml
default_method_hyperparameters:
  num_test_cases_per_behavior: 1
  num_steps: 500
  adv_string_init : "! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !"
  allow_non_ascii: False
  search_width: 512 # Number of candidates searched at each step
  use_prefix_cache: True
  targets_path: ./data/optimizer_targets/harmbench_targets_text.json

baichuan2_13b_1000steps:
  target_model:
    model_name_or_path: baichuan-inc/Baichuan2-13B-Chat
    device_map: auto
    dtype: bf16
    trust_remote_code: True
    num_steps: 1000
  use_prefix_cache: False  # Baichuan 2 expects KV cache to only be used during generation  

<model_name1>:
  target_model: <model_name1>['model']
```

In this config file, we have three entries:
1. The default hyperparameters
2. A manual experiment config for running GCG on Baichuan 2 13B for 1000 steps
3. A dynamic experiment config that references values in `configs/model_configs/models.yaml`

The `default_method_hyperparameters` entry is required in all configs and specifies the default hyperparameters that every GCG experiment is initialized with. Let's take a look at how this interacts with the `baichuan2_13b_1000steps` experiment. If the experiment name passed to `generate_test_cases.py` is `baichuan2_13b_1000steps`, then GCG will be initialized with the following hyperparameters:

```yaml
num_test_cases_per_behavior: 1
num_steps: 1000
adv_string_init : "! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !"
allow_non_ascii: False
search_width: 512
use_prefix_cache: False
targets_path: ./data/optimizer_targets/harmbench_targets_text.json
target_model:
  model_name_or_path: baichuan-inc/Baichuan2-13B-Chat
  device_map: auto
  dtype: bf16
  trust_remote_code: True
```

Note how the `use_prefix_cache` and `num_steps` default hyperparameters were overridden by the values specified in the experiment config. Now suppose we use the experiment name `llama2_7b`. There is no manually defined experiment config named `llama2_7b`, but there is a dynamic experiment config with the name `<model_name1>`. This is parsed by our dynamic config parsing code in `eval_utils.py`, and effectively allows referencing values in any model config in `configs/model_configs/models.yaml`. We describe dynamic experiment configs in more detail below.

## Dynamic Experiment Configs
Any time an experiment name in a method config file includes `<model_name#>`, it it converted into a dynamic experiment config. This means we implicitly expand out that experiment name into one name per entry in the `configs/model_configs/models.yaml` file. The parameters in these experiment configs can have the form `<model_name#>['<key1>'][index1]['<key2>']...` to reference values in the corresponding model config. For example, suppose we have the following model config

```yaml
llama2_13b:
  model:
    model_name_or_path: meta-llama/Llama-2-13b-chat-hf
    use_fast_tokenizer: False
    dtype: float16
  num_gpus: 1

another_model:
  different_format:
    model_name_or_path: meta-llama/Llama-2-13b-chat-hf
```

and the following entries in our method config file

```yaml
default_method_hyperparameters:
  num_steps: 50

<model_name1>:
  target_model:
    model_name_or_path: <model_name1>['model']['model_name_or_path']
```

Now if we pass the experiment name `llama2_13b` to `generate_test_cases.py` with the above method, the resulting method config dictionary will be

```yaml
num_steps: 50
target_model:
  model_name_or_path: meta-llama/Llama-2-13b-chat-hf
```

Note that `another_model` doesn't have the specified fields in the dynamic config. If we try to pass this as the experiment name for this method, we will get an error. However, this error will only occur if we try to load `another_model`. It will not occur if we load the `llama2_7b` experiment. This is what we mean by "implicitly expand out that experiment name into one name per entry". Additionally, if an expanded experiment name would have the same name as a manually added experiment, the manually added experiment config will be used instead and no attempt will be made to index into entries in `models.yaml`.

### Advanced Usage

There can be any number of model names specified in a dynamic experiment config entry, and they can appear amid other text that specifies the unique experiment. For example, if our method config file is

```yaml
default_method_hyperparameters:
  num_steps: 50

ensemble_optimization_<model_name1>_<model_name2>:
  target_models:
    - <model_name1>['model']['model_name_or_path']
    - <model_name2>['model']['model_name_or_path']

ensemble_optimization_<model_name1>_<model_name2>_1000steps:
  target_models:
    - <model_name1>['model']['model_name_or_path']
    - <model_name2>['model']['model_name_or_path']
  num_steps: 1000
```

and we pass the experiment name `ensemble_optimization_llama2_7b_llama2_13b` to `generate_test_cases.py`, the resulting method config will be

```yaml
num_steps: 50
target_models:
  - meta-llama/Llama-2-7b-chat-hf
  - meta-llama/Llama-2-13b-chat-hf
```

If we pass the experiment name `ensemble_optimization_llama2_7b_llama2_13b_1000steps`, the resulting method config will be

```yaml
num_steps: 1000
target_models:
  - meta-llama/Llama-2-7b-chat-hf
  - meta-llama/Llama-2-13b-chat-hf
```

### Referencing Model Configs

In addition to referencing values in `models.yaml` in dynamic experiment configs, we can also reference values in `models.yaml` in manually defined experiment configs, if the name of the experiment config contains model names. Let's consider the modified GCG config from before:

```yaml
default_method_hyperparameters:
  num_test_cases_per_behavior: 1
  num_steps: 500
  adv_string_init : "! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !"
  allow_non_ascii: False
  search_width: 512 # Number of candidates searched at each step
  use_prefix_cache: True
  targets_path: ./data/optimizer_targets/harmbench_targets_text.json

baichuan2_13b_1000steps:
  target_model:
    model_name_or_path: baichuan-inc/Baichuan2-13B-Chat
    device_map: auto
    dtype: bf16
    trust_remote_code: True
    num_steps: 1000
  use_prefix_cache: False  # Baichuan 2 expects KV cache to only be used during generation 
```

Here, the `baichuan2_13b_1000steps` experiment is manually defined, and the experiment name contains the `baichuan2_13b` model name from `models.yaml`. The experiment config contains information that is duplicated in the `models.yaml` file. To avoid maintaining multiple copies of model paths and hyperparameters across different methods, we can reference values in `models.yaml` within the `baichuan2_13b_1000steps` experiment as follows:

```yaml
baichuan2_13b:
  target_model:
    model_name_or_path: <model_name1>['model']['model_name_or_path']
    device_map: auto
    dtype: bf16
    trust_remote_code: True
    num_steps: 1000
  use_prefix_cache: False  # Baichuan 2 expects KV cache to only be used during generation 
```

Here, since the experiment name contains a model name, we can reference parameters from the corresponding model config in `models.yaml` just as we would in a dynamic experiment config (described in earlier sections). As with dynamic experiment configs, we can also reference values from multiple experiment configs. For example, if we define a manual experiment with the name `ensemble_optimization_llama2_7b_llama2_13b`, we can reference values in the `llama2_7b` model config using `<model_name1>` and we can reference values in the `llama2_13b` model config using `<model_name2>`:

```yaml
ensemble_optimization_llama2_7b_llama2_13b:
  target_models:
    - <model_name1>['model']['model_name_or_path']
    - <model_name2>['model']['model_name_or_path']
```

This functionality is documented further in `get_experiment_config` and `replace_values_recursive` in `eval_utils.py`.