# Code Submission for ICML

This repository contains the code and scripts for reproducing the experiments presented in the paper. The experiments measure the gradient similarity between normal and negated prompts in large language models (LLMs).

## Prerequisites

The code requires a Python environment with the dependencies listed in `requirements.txt`. We recommend using Conda.

```bash
conda create -n negation -y python=3.9
conda activate negation
pip install -r requirements.txt
```

Ensure the `LAMA` library is installed or otherwise accessible as configured in the scripts. This codebase appears to include a modified version of LAMA.

## Usage

The main experiments are driven by two shell scripts, one per model evaluated: `run_qwen_eval.sh` for Qwen and `run_olmo_eval.sh` for Olmo.

### Reproducing Qwen Experiments

To run the evaluation for the Qwen model:

```bash
./run_qwen_eval.sh
```

This script will:
1.  Run the gradient evaluation in `span_logprob` mode.
2.  Save the results in `output_grads_qwen/`.
3.  Perform analysis on the computed gradients.

### Reproducing Olmo Experiments

To run the evaluation for the Olmo model:

```bash
./run_olmo_eval.sh
```

This script follows a similar pipeline to the Qwen evaluation:
1.  Run the gradient evaluation.
2.  Save the results in `output_grads_olmo/`.
3.  Perform analysis.
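
After a run finishes, it can be worth confirming that the results directories were actually populated. The following is a minimal sketch, not part of the repository: `check_outputs` is a hypothetical helper name, and the only details taken from this README are the two directory names.

```bash
# Hypothetical helper (not shipped with this repository): report whether each
# given results directory exists and contains at least one entry.
check_outputs() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "ok: $dir"
        else
            echo "missing or empty: $dir"
            status=1
        fi
    done
    # Non-zero exit status if any directory was missing or empty.
    return "$status"
}
```

For example, `check_outputs output_grads_qwen output_grads_olmo` prints one line per directory and returns a non-zero status if either is missing or empty.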

## Data

The experiments expect the TREx portion of the LAMA dataset.
Place the data at `[PATH]/data/TREx`, as configured in the scripts, or update the `DATA_DIR` variable in both `run_qwen_eval.sh` and `run_olmo_eval.sh` to point to your data location.
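
Before launching a run, a short pre-flight check on the data location can save a failed job. The sketch below is illustrative only: `check_data_dir` is a hypothetical helper, and it assumes the TREx relation files use the `.jsonl` extension as in the upstream LAMA release. Adjust the pattern if your copy of the data differs.

```bash
# Hypothetical pre-flight check (not part of the run scripts).
# Pass the same path that DATA_DIR is set to in the run scripts.
check_data_dir() {
    data_dir="$1"
    if [ -z "$data_dir" ]; then
        echo "DATA_DIR is empty" >&2
        return 1
    fi
    if [ ! -d "$data_dir" ]; then
        echo "not a directory: $data_dir" >&2
        return 1
    fi
    # Assumption: relation files end in .jsonl, as in the upstream LAMA data.
    count=$(find "$data_dir" -maxdepth 1 -type f -name '*.jsonl' | wc -l | tr -d ' ')
    echo "found $count .jsonl file(s) in $data_dir"
    # Succeed only if at least one data file is present.
    [ "$count" -gt 0 ]
}
```

Call it as `check_data_dir "$DATA_DIR"` with the value used in the scripts. It returns a non-zero status when the path is unset, missing, or contains no `.jsonl` files.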

## Acknowledgements

This codebase builds upon the [LAMA](https://github.com/facebookresearch/LAMA) library.
