# Attachment Style Prediction

## Prerequisites

Follow the instructions in the `synthetic_agents` project before executing the instructions below.

### Installation

Install the project with `poetry`.
```
poetry install
```

## Creating new agents and chats

To create 60 user agents (20 per attachment style, as in the paper), run:
```
make create_agents
```

Alternatively, create new agents with a specific attachment style with the following call
(adjust the parameters to your needs):
```
TOKENIZERS_PARALLELISM=0 vector_db_local_dir="../.vector_db" db_local_dir="../.relational_db" ./bin/generate_synthetic_agents --aai_interviewer_id=1 --num_agents=20 --num_life_memories=10 --prompt_template_version="1.0.0" --attachment_style="avoidant" --report_dir="../logs"
```

A CSV file for each attachment style, with details about the created agents and chats, will be
saved in the `logs` folder under the attachment style project root directory. Use the chat IDs
listed there to fill in the file `attachment_style/asset/chats_per_attachment_style.json` before
playing the chats as detailed next.

**Note:**

1. In either case, make sure to provide a valid OpenAI key in the environment variable
`open_ai_api_key` so that user profiles and life facts can be generated by GPT-4.
2. Make sure to provide a valid ID for the interviewer agent and a valid version for the prompt
template used by the user agent. Both should have been created previously by following the
instructions in the `synthetic_agents` project.
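
A missing key otherwise only surfaces once generation has started, so it can help to check the environment first. The snippet below is a minimal sketch, not part of the project: `require_env` is a hypothetical helper name, and it relies on `printenv` being available (as it is on typical Linux and macOS systems).

```shell
# Hypothetical helper (not provided by the project): succeeds only if every
# named variable is exported and non-empty; reports the missing ones on stderr.
require_env() {
  _missing=0
  for _name in "$@"; do
    # printenv only sees exported variables, which is what child
    # processes such as `make` will inherit.
    if [ -z "$(printenv "$_name")" ]; then
      echo "missing environment variable: $_name" >&2
      _missing=1
    fi
  done
  return "$_missing"
}

# Example usage (commented out so the snippet has no side effects):
# require_env open_ai_api_key && make create_agents
```

The helper accepts any number of variable names, so the same check can guard other targets that need additional keys.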

## Generating AAI chat data

To play the synthetic interviews with the agents created by the command above, run:
```
make generate_aai_data
```

Make sure the file `attachment_style/asset/chats_per_attachment_style.json` contains the IDs of 
the chats associated with the different attachment styles. You can get these IDs from the reports 
generated at the end of the agent creation process.

The synthetic data will be saved in the directory specified in the environment variable 
`datasets_dir`, which by default is a folder called `datasets` located under the attachment style 
project root directory.
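
To write the data somewhere else, `datasets_dir` can be exported before calling `make`. The snippet below is a sketch under one assumption: the fallback path `$PWD/datasets` is only meant to mirror the documented default when run from the project root, and is not taken from the project's configuration.

```shell
# Keep an existing datasets_dir; otherwise fall back to ./datasets
# (assumed to match the project default when run from the project root).
datasets_dir="${datasets_dir:-$PWD/datasets}"
export datasets_dir

echo "synthetic data will be written to: $datasets_dir"

# Then generate the data as usual (commented out here):
# make generate_aai_data
```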

**Note:** Make sure to provide a valid OpenAI key in the environment variable
`open_ai_api_key` and a valid Anthropic key in the environment variable `anthropic_api_key` so
that the chats can be played using both models.

## Embedding exchanges

After generating synthetic data, use the command below to update the CSV files with OpenAI 
embeddings for both answers and questions generated by the agents.
```
make embed_answers
```

We use the `text-embedding-3-small` OpenAI embedding model.

**Note:** We already provide GPT-4 and Claude 3 Opus synthetic data in the `datasets` folder, and
these files already contain the embeddings. You only need to run the command in this section if
you are generating synthetic data from scratch.

## Reproducing the results of the paper

We provide GPT-4 and Claude 3 Opus synthetic datasets. Place the human dataset file and the full
assessment summary of the labeled data file under the `datasets` folder, then execute
`attachment_style/notebook/experiments_for_paper.ipynb`.