# Mini-Twitter Conversation

## Project Overview

This project simulates and analyzes dynamic conversations between AI agents on specified topics. Each agent is powered by a language model and responds based on its unique demographic background and the evolving conversation history.

## Pipelines

This section provides a comprehensive overview of all data processing and evaluation pipelines in the `src/` directory.

### Pipeline Overview (with paper section references)

The project contains several interconnected pipelines for processing conversational data, training LLMs, and evaluating model performance:

1. **Preprocessing Pipeline** (4.1 RPLA Construction Grounded in Human Data) - Data cleaning and formatting
2. **Simulation Pipeline** (4.2 Simulating Social Interactions with RPLAs) - LLM conversation generation
3. **Evaluation Pipeline** (4.3 Evaluation) - Human-LLM comparison and metrics
4. **Group-Level Evaluation Pipeline** (6 Opinion Dynamics) - Statistical analysis across groups
5. **Fine-tuning Pipeline** (Appendix L) - Model training (SFT, DPO)

Set `PYTHONPATH=/path/to/mini-twitter-llm-agent-modeling/src` before running the pipelines.

### 1. Preprocessing Pipeline

**Location**: `src/preprocessing/`
**Entry Point**: `src/preprocessing/run_pipeline.py`

#### Purpose
Processes raw conversational data into standardized CSV format for downstream analysis.

#### Key Components
- `preprocessing.py` - Main preprocessing logic
- `run_pipeline.py` - Pipeline orchestration

#### Configuration
- Input: `/data/raw_data/`
- Output: Processed CSV files
- Processes files by data prefix (filename without extension)
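
The data prefix convention above can be sketched as follows. This is a minimal illustration, not the code in `preprocessing.py`: the helper names `data_prefix` and `list_prefixes` are hypothetical, and the sketch assumes the raw inputs are `.csv` files.

```python
from pathlib import Path


def data_prefix(path):
    """Return the filename without its extension, e.g. 'dir/foo.csv' -> 'foo'."""
    return Path(path).stem


def list_prefixes(raw_dir):
    """Collect the data prefixes of all CSV files in a directory, sorted by name."""
    return sorted(data_prefix(p) for p in Path(raw_dir).glob("*.csv"))
```

For example, `list_prefixes("/data/raw_data")` would return one prefix per raw CSV file, which downstream pipelines can reuse as the `{data_prefix}` component of their output paths.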

#### Usage
```bash
cd src/preprocessing
python run_pipeline.py
```

### 2. Simulation Pipeline

**Location**: `src/simulation/`
**Entry Point**: `src/simulation/run_pipeline.py`

#### Purpose
Generates LLM conversations using various models (OpenAI GPT, HuggingFace models) to simulate human-like discussions on controversial topics.

#### Key Components
- `simulate_conversation.py` - Core conversation simulation logic
- `run_pipeline.py` - Multi-model pipeline execution
- `preproc_warning.py` - Suppresses preprocessing warnings

#### Configuration
- **Models Supported**:
  - OpenAI: `gpt-4o-mini-2024-07-18`
  - HuggingFace: Llama, Mistral, OLMo, Qwen models
  - Fine-tuned models from `/finetuned_models/`
- **Input**: `/data/processed_data/` (Pattern: `2025(03|04|05|06|07|08).*\.csv`)
- **Output**: `/result/simulation/{data_prefix}/{model_name}/simulation-{version}-ablation.csv`
- **Versions**: v0 (full conversation simulation), v1 (tweet-guided conversation simulation), v2 (next message prediction)
- **Parallelization**: N processes with model caching
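
The input pattern and output template above can be sketched in a few lines. This is an illustration under stated assumptions: the helper names are hypothetical, and the sketch assumes the pattern is applied with `re.match` (anchored at the start of the filename), which may differ from what `run_pipeline.py` actually does.

```python
import re
from pathlib import PurePosixPath

# Pattern copied from the Configuration list: files dated 2025-03 through 2025-08.
INPUT_PATTERN = re.compile(r"2025(03|04|05|06|07|08).*\.csv")


def matches_input(filename):
    """True when the filename matches the pattern from its first character."""
    return INPUT_PATTERN.match(filename) is not None


def output_path(data_prefix, model_name, version):
    """Fill in the output template from the Configuration list."""
    return str(
        PurePosixPath("/result/simulation")
        / data_prefix
        / model_name
        / f"simulation-{version}-ablation.csv"
    )
```

Under this scheme, a model name that contains a slash (such as `provider/model-name`) produces a nested directory below the data prefix.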

#### Features
- 4-bit quantization for efficient GPU memory usage
- Flash attention for faster inference
- Model compilation with `torch.compile`
- Multiprocessing with spawn context
- Automatic output file existence checking
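
Two of these features, the spawn context and the output existence check, can be sketched with the standard library alone. The function names are hypothetical and `simulate_one` is only a placeholder for a real simulation run; the actual pipeline may organize this differently. A likely reason for choosing spawn is that CUDA cannot be re-initialized in a forked subprocess.

```python
import multiprocessing as mp
from pathlib import Path


def pending_outputs(output_paths):
    """Return only the output paths that do not exist yet."""
    return [Path(p) for p in output_paths if not Path(p).exists()]


def simulate_one(output_path):
    """Placeholder worker: stands in for one simulation run."""
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text("placeholder\n")
    return str(output_path)


def run_all(output_paths, n_procs=2):
    """Run the missing jobs in a pool created from the spawn context."""
    todo = pending_outputs(output_paths)
    if not todo:
        return []  # everything already exists, so no workers are started
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=n_procs) as pool:
        return pool.map(simulate_one, todo)
```

When used from a script, `run_all(...)` should be called under an `if __name__ == "__main__":` guard, because spawned workers re-import the main module.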

#### Usage
```bash
cd src/simulation
python run_pipeline.py
```

### 3. Evaluation Pipeline

**Location**: `src/eval/`
**Entry Points**: `src/eval/validate_response.py` then `src/eval/run_pipeline.py`

#### Purpose
Compares human and LLM conversations using similarity metrics and opinion trajectory analysis.

#### Key Components
- `human_llm.py` - Human-LLM similarity scoring
- `opinion_proc.py` - Opinion trajectory processing
- `opinion_plot_human_llm.py` - Visualization
- `llm_report_aggr.py` - Aggregate reporting
- `run_pipeline.py` - Pipeline orchestration
- `validate_response.py` - Adds validity columns that the evaluation dataloader uses for filtering
- `util.py` - Utility functions shared across the evaluation pipeline

#### Evaluation Models
- OpenAI: `gpt-4o-mini-2024-07-18`
- HuggingFace: Mistral, Llama variants
- Custom fine-tuned models

#### Output Structure
- `/result/eval/human_llm/{data_prefix}/{model_name}/`
  - `human_llm_score_{version}.csv` - Similarity scores
  - `opinion_memory_{eval_model}_{version}.csv` - Opinion trajectories
- Aggregated reports: `llm_report_{version}_{breadth|depth}.csv`

#### Features
- **Breadth vs Depth Analysis** - Categorizes topics into breadth/depth categories
- **Opinion Memory** - Tracks opinion changes across conversation rounds
- **Cross-topic Analysis** - Groups experiments by topic for comparison
- **Error Handling** - Logs processing errors and continues execution
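
The log-and-continue behavior described under Error Handling can be sketched as below. This is a generic pattern, not the pipeline's own code: `process_all`, the logger name, and the shape of the return value are all assumptions.

```python
import logging

logger = logging.getLogger("eval_pipeline")  # hypothetical logger name


def process_all(items, process_fn):
    """Apply process_fn to every item; log failures and keep going."""
    results, failed = {}, []
    for item in items:
        try:
            results[item] = process_fn(item)
        except Exception:
            # logger.exception records the traceback alongside the message.
            logger.exception("Processing failed for %s", item)
            failed.append(item)
    return results, failed
```

Returning the list of failed items alongside the results makes it easy to re-run only the experiments that raised an error.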

#### Usage
```bash
cd src/eval
python validate_response.py
python run_pipeline.py
```

### 4. Group-Level Evaluation Pipeline

**Location**: `src/group_level_eval/`
**Entry Point**: `src/group_level_eval/run_pipeline.py`

#### Purpose
Performs statistical analysis and visualization of conversation dynamics across different experimental conditions.

Check `/GROUP_LEVEL_EVAL.md` for more details.

### 5. Fine-tuning Pipeline

**Location**: `src/finetuning/`

#### Purpose
Train custom models using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) techniques.

#### Workflow

```sh
# data generation
python3 partition.py
python3 format_finetune_data.py
python3 dpo_data.py

# finetuning
python3 chatgpt_finetune.py
python3 llama_finetuning.py
```


#### Key Scripts
- `chatgpt_finetune.py` - OpenAI GPT fine-tuning
- `llama_finetuning.py` - Llama model fine-tuning
- `format_finetune_data.py` - Data formatting for training
- `partition.py` - Data splitting utilities

#### Data Formats
- **Directories**: `formatted_data_breadth/`, `formatted_data_depth/`
- **Splits**: group_split, round_split, topic_split
- **Files**: train.jsonl, test.jsonl, train_valid.jsonl, test_valid.jsonl
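
The `.jsonl` files hold one JSON object per line. A minimal sketch for writing and reading such files is shown below; the helper names are hypothetical, and the record fields used in the usage note are illustrative only, since the actual schema is defined by `format_finetune_data.py`.

```python
import json


def write_jsonl(path, records):
    """Write each record as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `write_jsonl("train.jsonl", [{"prompt": "...", "completion": "..."}])` followed by `read_jsonl("train.jsonl")` returns the same list of records.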

#### Features
- **Multiple Training Methods**: SFT, DPO, ORPO, PPO
- **Efficient Training**: Unsloth integration, LoRA adapters
- **Word-Based Cost Estimation**: `approx_cost.py`
- **Validation**: `check_finetuning.py`

### Pipeline Dependencies

```mermaid
graph TD
    A[Raw Data] --> B[Preprocessing Pipeline]
    B --> C[Simulation Pipeline]
    C --> D[Evaluation Pipeline]
    D --> E[Group-Level Eval Pipeline]
    B --> G[Fine-tuning Pipeline]
    G --> C
    C --> G
```

### Common Configuration Patterns

#### File Naming Conventions
- **Data Prefixes**: `YYYYMMDD_HHMMSS_Topic_Name_UniqueID`
- **Model Names**: `provider/model-name` or `mini-twitter/custom-model-name` (for our models after finetuning)
- **Versions**: v0, v1, v2
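
A data prefix in the `YYYYMMDD_HHMMSS_Topic_Name_UniqueID` form can be split into its parts as sketched below. The `DataPrefix` type and `parse_data_prefix` function are hypothetical, and the sketch assumes the unique ID itself contains no underscores, so that everything between the timestamp and the last token is the topic.

```python
from datetime import datetime
from typing import NamedTuple


class DataPrefix(NamedTuple):
    timestamp: datetime
    topic: str
    unique_id: str


def parse_data_prefix(prefix):
    """Split 'YYYYMMDD_HHMMSS_Topic_Name_UniqueID' into its components."""
    parts = prefix.split("_")
    if len(parts) < 4:
        raise ValueError(f"Unexpected data prefix: {prefix!r}")
    # The first two tokens are the date and the time.
    timestamp = datetime.strptime(f"{parts[0]}_{parts[1]}", "%Y%m%d_%H%M%S")
    # The last token is the unique ID; the tokens in between form the topic.
    return DataPrefix(timestamp, "_".join(parts[2:-1]), parts[-1])
```

Grouping experiments by the parsed `topic` field is one way to support the cross-topic comparisons mentioned for the evaluation pipeline.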

#### Directory Structure
```
/data/
├── raw_data/           # Input CSV files
└── processed_data/     # Preprocessed files

/result/
├── simulation/         # LLM conversation outputs
├── eval/               # Evaluation metrics  
└── group_level_eval/   # Statistical analysis
```

### Setup and Usage (Local)

1. Create a conda environment:
   ```sh
   conda env create -f environment.yml
   conda activate mt
   export PIP_NO_BUILD_ISOLATION=1  # for flash-attn
   pip install flash-attn
   ```

2. Run the pipelines as described in `/PIPELINES.md`.

Note: Provide your OpenAI and/or HuggingFace API key either through the environment variables `OPENAI_API_KEY` and `HUGGINGFACEHUB_API_TOKEN`, or in files named `openai-key.txt` and `huggingface-key.txt` placed in the same directory as the script that needs the key.
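
The two ways of supplying a key can be sketched as a small lookup helper. This is an illustration only: `load_api_key` is a hypothetical function, and checking the environment variable before the key file is an assumed precedence, not necessarily the order the scripts use.

```python
import os
from pathlib import Path


def load_api_key(env_var, key_filename, script_dir="."):
    """Return the key from the environment, else from a file next to the script."""
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    key_file = Path(script_dir) / key_filename
    if key_file.is_file():
        return key_file.read_text(encoding="utf-8").strip()
    raise RuntimeError(f"Set {env_var} or create {key_file}")
```

For example, `load_api_key("OPENAI_API_KEY", "openai-key.txt", "src/simulation")` would read the file only when the environment variable is unset or empty.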

If your system `glibc` is outdated and you do not have root access, you can work around it by running `nix` in user space:

```nix
# Run with `nix-shell mt.nix`
let pkgs = import (builtins.fetchTarball {
    url = "XXXX-25.05-darwin.tar.gz";
  }) { config = { allowUnfree = true; cudaSupport = true; }; };
in
pkgs.mkShell {
   name = "mt";
   buildInputs = with pkgs; [
      bashInteractive
      glibc
      mamba-cpp
   ];
}
```

```sh
export NP_LOCATION="/path-to-nix-portable-root"
export NP_RUNTIME="bwrap"
./nix-shell mt.nix

# After entering the nix shell, you can run the following commands:
export MAMBA_ROOT_PREFIX=~/path-to-mamba
mamba env create -f environment.yml
mamba activate mt
export PIP_NO_BUILD_ISOLATION=1  # for flash-attn
pip install flash-attn
```

### Requirements

- Python 3.8+
- pandas
- numpy
- langchain-core
- langchain-openai
- langchain-huggingface

See `requirements.txt` for detailed dependencies.

### Configuration

- Prompt templates: Customize agent behavior by editing files in the `prompts/` directory.

### Generated Files

- Individual conversation files: One text file per agent (e.g., `agent0_history_1.txt`, `agent1_history_1.txt`), containing that agent's full conversation history.

- Conversation log: A text file, `conversation_log.txt`, containing the full conversation history.

- JSON summary: A JSON file, `simulation.log`, giving a structured overview of the entire conversation, including metadata such as the topic, number of turns, and model.

- Output conversation file: A CSV file, `simulation.csv`, logging the agents' generated outputs in the `llm_text_v2` column.
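
To inspect the generated outputs, the `llm_text_v2` column can be read back from `simulation.csv`. The sketch below uses only the standard library; `read_llm_outputs` is a hypothetical helper, and nothing is assumed about the other columns in the file.

```python
import csv


def read_llm_outputs(csv_path, column="llm_text_v2"):
    """Return the values of one column from a CSV file, in row order."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"{column!r} not found in {csv_path}")
        return [row[column] for row in reader]
```

Calling `read_llm_outputs("simulation.csv")` returns the agents' messages as a list of strings, one per row.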

## Project Structure

```
mini-twitter-llm-agent-modeling/
├── data/
│   ├── processed_data/
│   │   └── {DATA_PREFIX}.csv
│   ├── raw_data/
│   │   └── {DATA_PREFIX}.csv
│   ├── augmented_data/
│   │   └── {DATA_PREFIX}.csv
│   └── annotated_data/
│       └── {DATA_PREFIX}.csv
├── prompts/
│   ├── base_models/
│   │   └── step2_generate_response.md
│   ├── fine_tune_v1/
│   │   ├── add_to_memory_read.md
│   │   ├── add_tweet.md
│   │   ├── generate_message.md
│   │   └── ...
│   ├── chat_templates/
│   │   ├── Meta-Llama-3.1-8B.txt
│   │   ├── Mistral-7B-Instruct-v0.3.txt
│   │   ├── Mistral-7B-v0.3.txt
│   │   └── ...
│   ├── simulation_v0 (full simulation)/
│   │   └── prompt_v5/
│   │       ├── step0_demographics.md
│   │       ├── step1_persona.md
│   │       ├── step2_generate_tweet.md
│   │       └── ...
│   ├── simulation_v1 (tweet-guided simulation)/
│   │   └── prompt_v1/
│   │       ├── step0_demographics.md
│   │       ├── step1_persona.md
│   │       ├── step2_add_to_memory_tweet.md
│   │       └── ...
│   └── simulation_v2 (next message prediction)/
│       ├── prompt_v1/
│       │   ├── step0_demographics.md
│       │   ├── step1_persona.md
│       │   ├── step2_add_to_memory_tweet.md
│       │   └── ...
│       └── prompt_v2/
│           ├── step0_demographics.md
│           ├── step1_persona.md
│           ├── step2_add_to_memory_tweet.md
│           └── ...
└── src/
    ├── eval/
    ├── finetuning/
    ├── preprocessing/
    ├── simulation/
    └── .../
```

## Notes

- Always activate the `mt` environment before running any script so that the correct dependencies are used, and set the `PYTHONPATH` environment variable to `/path/to/mini-twitter-llm-agent-modeling/src`.
- The language model is specified by the `--model` argument in `simulate_conversation.py`.
- The generated text files and JSON summary allow for easy review and analysis of the simulated conversations.

