# ManiCoG Installation Guide

## Requirements

- Python 3.8+
- CUDA 11.8+ (for GPU acceleration)
- 16 GB+ GPU memory recommended
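The requirements above can be partly checked before creating the environment. The following is a minimal sketch using only the standard library. The helper names `python_ok` and `nvidia_smi_found` are illustrative and not part of ManiCoG. Finding `nvidia-smi` on `PATH` only hints that an NVIDIA driver is installed; it does not confirm the CUDA version or the available GPU memory.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version taken from the Requirements list


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def nvidia_smi_found():
    """Return True if the `nvidia-smi` executable is on PATH.

    This is only a hint that an NVIDIA driver is present; it says nothing
    about the CUDA toolkit version or GPU memory.
    """
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    print("Python >= 3.8:", python_ok())
    print("nvidia-smi on PATH:", nvidia_smi_found())
```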

## Quick Setup

```bash
# 1. Enter the repository
cd ManiCoG_paper

# 2. Create conda environment
conda env create -f ManiCoG_env.yml
conda activate ManiCoG

# 3. Install package
pip install -e .

# 4. Configure
cp config.template.yaml config.yaml
# Edit config.yaml with your paths and API keys

# 5. Run
./run_experiment.sh      # Baseline
./run_reground_gpt.sh    # ReGrounding + GPT
```

## Model Setup

### TianXi Action Grounding 7B

Required directory structure:
```
models/TianXi_Action_Grounding_7B/
├── config.json
├── pytorch_model.bin (or *.safetensors)
├── tokenizer.json
├── tokenizer_config.json
└── preprocessor_config.json
```
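To confirm that a model directory matches the layout above before running an experiment, a small check like the one below may help. It is a sketch, not part of ManiCoG: the function name `missing_model_files` is hypothetical, and it only tests that the listed files exist, not that the weights are valid.

```python
from pathlib import Path

# File names taken from the directory listing above.
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "preprocessor_config.json",
]


def missing_model_files(model_dir):
    """Return the names of expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]

    # Weights may be a single pytorch_model.bin or one or more *.safetensors files.
    has_weights = (root / "pytorch_model.bin").is_file() or any(
        root.glob("*.safetensors")
    )
    if not has_weights:
        missing.append("pytorch_model.bin or *.safetensors")
    return missing


if __name__ == "__main__":
    problems = missing_model_files("./models/TianXi_Action_Grounding_7B")
    print("Missing:", problems if problems else "nothing")
```

An empty list means every file in the listing was found.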

### Alternative Models

The framework also supports:
- OSAtlas7B
- UGround

Place each model in `./models/{model_name}/`.

## Dataset Setup

### ScreenSpot-Pro

Expected structure:
```
data/ScreenSpot-Pro/
├── images/
│   ├── *.png
│   └── *.jpg
└── annotations/
    └── *.json
```
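The layout above can be sanity-checked with a short script. This sketch counts image files by extension and confirms that each annotation file parses as JSON. The function name `summarize_dataset` is hypothetical, and nothing is assumed about the annotation schema beyond the files being valid JSON.

```python
import json
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg"}  # extensions shown in the layout above


def summarize_dataset(root):
    """Count images and annotation files, and list annotations that fail to parse."""
    root = Path(root)
    image_dir = root / "images"
    ann_dir = root / "annotations"

    n_images = 0
    if image_dir.is_dir():
        n_images = sum(
            1 for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )

    n_annotations = 0
    unreadable = []
    if ann_dir.is_dir():
        for path in sorted(ann_dir.glob("*.json")):
            n_annotations += 1
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError):  # JSONDecodeError is a ValueError
                unreadable.append(path.name)

    return {
        "images": n_images,
        "annotations": n_annotations,
        "unreadable": unreadable,
    }


if __name__ == "__main__":
    print(summarize_dataset("./data/ScreenSpot-Pro"))
```

A missing `images/` or `annotations/` directory shows up as a count of zero rather than an exception.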

## API Configuration

For the ReGrounding + GPT pipeline, set one of the following in `config.yaml`:

**Option 1: OpenRouter**
```yaml
pipeline:
  gpt:
    api_key: "sk-or-..."
    base_url: "https://openrouter.ai/api/v1"
    model: "openai/gpt-5"
```

**Option 2: OpenAI**
```yaml
pipeline:
  gpt:
    api_key: "sk-..."
    base_url: "https://api.openai.com/v1"
    model: "gpt-5"
```
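A common mistake is leaving the template placeholder in place of a real key. The sketch below checks an already-parsed configuration dictionary for the three `pipeline.gpt` fields shown above. The function name `gpt_config_problems` is hypothetical and not part of ManiCoG. The check assumes that a key ending in `...` is an unedited placeholder.

```python
REQUIRED_KEYS = ("api_key", "base_url", "model")  # fields from the YAML examples


def gpt_config_problems(config):
    """Return a list of human-readable problems with the pipeline.gpt section."""
    # Tolerate a missing or empty `pipeline` / `gpt` section.
    gpt = (config.get("pipeline") or {}).get("gpt") or {}

    problems = []
    for key in REQUIRED_KEYS:
        value = gpt.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"pipeline.gpt.{key} is missing or empty")

    # Assumption: a key ending in "..." is the unedited placeholder.
    api_key = gpt.get("api_key")
    if isinstance(api_key, str) and api_key.endswith("..."):
        problems.append("pipeline.gpt.api_key still looks like a placeholder")

    return problems


if __name__ == "__main__":
    example = {
        "pipeline": {
            "gpt": {
                "api_key": "sk-or-...",
                "base_url": "https://openrouter.ai/api/v1",
                "model": "openai/gpt-5",
            }
        }
    }
    for problem in gpt_config_problems(example):
        print(problem)
```

In practice the dictionary would come from parsing `config.yaml`, for example with PyYAML's `yaml.safe_load`, assuming PyYAML is available in the environment.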

## Verification

```bash
# Test installation
python -c "from src.utils.pipeline import Pipeline; print('Installation successful')"

# Test model loading
python -c "
from transformers import AutoConfig
AutoConfig.from_pretrained('./models/TianXi_Action_Grounding_7B')
print('Model loadable')
"

# Test dataset
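python -c "
import os
# Count only image files so stray files do not inflate the total
imgs = len([f for f in os.listdir('./data/ScreenSpot-Pro/images') if f.lower().endswith(('.png', '.jpg'))])
anns = len([f for f in os.listdir('./data/ScreenSpot-Pro/annotations') if f.endswith('.json')])
print(f'Dataset: {imgs} images, {anns} annotations')
"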
```

