# Understanding the Prompt Sensitivity

> Prompt sensitivity, which refers to how strongly the output of a large language model (LLM) depends on the exact wording of its input prompt, raises concerns among users about the LLM's stability and reliability. In this work, we consider LLMs as multivariate functions and perform a first-order Taylor expansion, thereby analyzing the relationship between prompts, their gradients, and the logit of the model's next token. Furthermore, according to the Cauchy–Schwarz inequality, the logit difference can be upper bounded by the product of the gradient norm and the norm of the difference between the prompts' embeddings or hidden states. Our analysis allows a general interpretation of why current transformer-based autoregressive LLMs are sensitive to prompts with the same meaning. In particular, we show that LLMs do not internally cluster similar inputs like smaller neural networks do, but instead disperse them. This dispersing behavior leads to an excessively large upper bound on the logit difference between the two prompts, making it difficult to be effectively reduced to zero. In our analysis, we also show which types of meaning-preserving prompt variants are more likely to introduce prompt sensitivity risks in LLMs. Our findings provide crucial evidence for interpreting the prompt sensitivity of LLMs. Code for experiments is available in the supplementary materials.

![alt text](figures/image.png)
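
The bound described in the abstract can be checked numerically on a toy function. The sketch below is not the repository's code. It assumes a small hypothetical tanh network (`toy_logit`) standing in for one next-token logit, and random vectors standing in for the embeddings of two prompts. It compares the true logit difference, its first-order Taylor estimate `grad · delta`, and the Cauchy–Schwarz upper bound `||grad|| * ||delta||`.

```python
import math
import random


def toy_logit(z, w1, w2):
    """Hypothetical stand-in for one next-token logit: w2 . tanh(w1 z)."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, z))) for row in w1]
    return sum(c * h for c, h in zip(w2, hidden))


def toy_logit_grad(z, w1, w2):
    """Analytic gradient of toy_logit with respect to the input z."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, z))) for row in w1]
    # d/dz [w2 . tanh(w1 z)] = w1^T (w2 * (1 - tanh^2))
    scale = [c * (1.0 - h * h) for c, h in zip(w2, hidden)]
    return [sum(s * row[j] for s, row in zip(scale, w1)) for j in range(len(z))]


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


random.seed(0)
dim, width = 8, 16  # assumed toy sizes, unrelated to any real model
w1 = [[random.gauss(0, 1) / math.sqrt(dim) for _ in range(dim)] for _ in range(width)]
w2 = [random.gauss(0, 1) / math.sqrt(width) for _ in range(width)]

# Two nearby "prompt embeddings": z_b is a small perturbation of z_a.
z_a = [random.gauss(0, 1) for _ in range(dim)]
delta = [1e-3 * random.gauss(0, 1) for _ in range(dim)]
z_b = [a + e for a, e in zip(z_a, delta)]

grad = toy_logit_grad(z_a, w1, w2)
true_diff = toy_logit(z_b, w1, w2) - toy_logit(z_a, w1, w2)
taylor_diff = sum(g * e for g, e in zip(grad, delta))  # first-order estimate
bound = norm(grad) * norm(delta)                       # Cauchy-Schwarz bound

print(f"true difference  : {true_diff:.3e}")
print(f"Taylor estimate  : {taylor_diff:.3e}")
print(f"upper bound      : {bound:.3e}")
```

The Taylor estimate is only accurate when `delta` is small. The bound itself grows with either the gradient norm or the distance between the two representations, which is the quantity the experiments below examine.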

## 1. ResNet & CIFAR-10
The training script is as follows:
```bash
python code/cifar10_example/train.py
```
Then use [plot_figure2.ipynb](plot/cifar10_example/plot_figure2.ipynb) to plot Figure 2.

## 2. Perturbation experiment
Run the following script to perform perturbation experiments:
```bash
python code/experimental_verifications/perturbation_analysis.py \
--model_name_or_path Qwen/Qwen1.5-0.5B \
--dataset_name ARC_Challenge
```

Then use [r_loss.py](plot/experimental_verifications/r_loss.py) to plot Figures 3(a) and 3(b), and [grads_vs_delta_z.py](plot/experimental_verifications/grads_vs_delta_z.py) to plot Figures 3(c) and 3(d).

## 3. Validate on the real dataset

This part corresponds to the `pad vs. trim` section. Please run the following script:
```bash
python code/experimental_verifications/pad_vs_trim.py \
--model_name_or_path Qwen/Qwen1.5-0.5B \
--dataset ARC_Challenge
```
Then use [pad_trim.ipynb](plot/experimental_verifications/pad_trim.ipynb) to plot Figure 4(a).

## 4. Validate with our templates
This part corresponds to the sections on `first vs. latter` and `fewer vs. more`. Please run the following script:
```bash
python code/experimental_verifications/first_latter_and_fewer_more.py \
--model_name_or_path Qwen/Qwen1.5-0.5B \
--dataset_name ARC_Challenge
```

Then use [misalignment.ipynb](plot/experimental_verifications/misalignment.ipynb) to plot Figures 4(b) and 4(c).

## 5. Validate the contribution of templates and questions
First, compute the logits:
```bash
python code/experimental_verifications/template_vs_question.py
```
Then use [template_vs_question.ipynb](plot/experimental_verifications/template_vs_question.ipynb) to calculate the contributions shown in Figure 4(d).
