# ManiCoG Configuration Guide

## Quick Setup

```bash
cp config.template.yaml config.yaml
# Edit config.yaml with your settings
```

## Configuration Structure

### Project Settings

```yaml
project:
  name: "experiment_name"      # Experiment identifier
  output_dir: "./outputs"      # Results directory  
  save_pipelines: true         # Save pipeline details
  log_level: "INFO"            # Logging verbosity
```

### Model Configuration

```yaml
model:
  path: "./models/TianXi_Action_Grounding_7B"  # Model path
  type: "tianxi"                # Model type
  device: "cuda"                # Device: cuda/cpu
  cuda_device: 0               # GPU ID
  precision: "bfloat16"        # Precision: bfloat16/float16/float32
  attention_implementation: "flash_attention_2"
```

**Supported Models**:
- `tianxi` - TianXi Action Grounding 7B
- `osatlas7b` - OSAtlas 7B
- `uground` - UGround

### Dataset Configuration

```yaml
data:
  screenspot_imgs: "./data/ScreenSpot-Pro/images"      
  screenspot_test: "./data/ScreenSpot-Pro/annotations"
  task: "all"              # Task selection: "all" or comma-separated names, e.g. "task1,task2"
  language: "en"           # Language: en/cn/all
  gt_type: "positive"      # Ground truth: positive/negative/all
  inst_style: "instruction" # Style: instruction/action/description/all
```
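
The `task` field accepts either `all` or a list of task names. As a sketch of how such a value might be interpreted, the hypothetical helper below assumes the list is comma-separated and that `all` means "no filtering"; the actual parsing in ManiCoG may differ.

```python
def parse_task(value: str):
    """Hypothetical helper: return None for "all", else a list of task names."""
    value = value.strip()
    if value == "all":
        return None  # Assumption: None signals that every task is selected.
    # Split on commas and drop empty entries such as a trailing comma.
    tasks = [t.strip() for t in value.split(",") if t.strip()]
    if not tasks:
        raise ValueError("task must be 'all' or a comma-separated list")
    return tasks
```

For example, `parse_task("task1, task2")` yields `["task1", "task2"]`.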

### Pipeline Configuration

```yaml
pipeline:
  method: "reground_gpt"    # Method: baseline/reground_gpt
  
  reground:                 # ReGrounding settings
    crop_ratio: 0.2        # Crop expansion (0.1-0.5)
    mask_previous: true    # Mask previous results
  
  gpt:                     # GPT Judge settings
    api_key: "YOUR_KEY"    # API key (prefer an environment variable; see Environment Variables)
    base_url: "https://api.openai.com/v1"
    model: "gpt-5"         # Model name
    temperature: 0.0       # Randomness (0.0-1.0)
    max_tokens: 5000       # Max response length
    timeout: 30            # Timeout (seconds)
```
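
The ranges given in the comments above (`crop_ratio` in 0.1-0.5, `temperature` in 0.0-1.0) can be enforced with a small check. This is a sketch under assumptions: `check_pipeline_section` is a hypothetical helper, it takes the already-parsed `pipeline` dictionary, and it treats the literal `YOUR_KEY` as an unset key.

```python
def check_pipeline_section(pipeline: dict) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if none."""
    problems = []
    method = pipeline.get("method")
    if method not in {"baseline", "reground_gpt"}:
        problems.append(f"pipeline.method={method!r} is not a documented method")

    # `or {}` guards against a section that is present but empty in the YAML.
    crop = (pipeline.get("reground") or {}).get("crop_ratio")
    if crop is not None and not 0.1 <= crop <= 0.5:
        problems.append(f"reground.crop_ratio={crop} is outside 0.1-0.5")

    gpt = pipeline.get("gpt") or {}
    temperature = gpt.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        problems.append(f"gpt.temperature={temperature} is outside 0.0-1.0")

    # Only the GPT-based method needs an API key.
    if method == "reground_gpt" and gpt.get("api_key") in (None, "", "YOUR_KEY"):
        problems.append("gpt.api_key is unset or still the placeholder")
    return problems
```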

## API Configuration Examples

### OpenRouter
```yaml
gpt:
  api_key: "sk-or-v1-..."
  base_url: "https://openrouter.ai/api/v1"  
  model: "openai/gpt-5"  
```

### OpenAI Direct
```yaml
gpt:
  api_key: "sk-..."
  base_url: "https://api.openai.com/v1"
  model: "gpt-5" 
```

### Custom Endpoint
```yaml
gpt:
  api_key: "your-key"
  base_url: "https://your-api.com/v1"
  model: "your-model"
```

## Environment Variables

Create a `.env` file for sensitive data:

```bash
# API keys (set the one that matches your provider)
OPENROUTER_API_KEY=sk-or-...
OPENAI_API_KEY=sk-...

# Hardware
CUDA_DEVICE=0
```

Environment variables override the corresponding `config.yaml` settings.
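
To illustrate the override behavior, here is a minimal sketch. It is not ManiCoG's actual loader: `apply_env_overrides` is a hypothetical helper, the mapping of each variable to a config key is an assumption, and so is the choice to prefer `OPENROUTER_API_KEY` when both keys are set. It also assumes the `.env` values have already been loaded into the process environment.

```python
import copy
import os


def apply_env_overrides(config: dict, env=None) -> dict:
    """Hypothetical helper: return a copy of config with env values applied."""
    env = os.environ if env is None else env
    merged = copy.deepcopy(config)  # Leave the caller's dictionary untouched.

    # Assumption: either key variable maps to pipeline.gpt.api_key.
    api_key = env.get("OPENROUTER_API_KEY") or env.get("OPENAI_API_KEY")
    if api_key:
        merged.setdefault("pipeline", {}).setdefault("gpt", {})["api_key"] = api_key

    # Assumption: CUDA_DEVICE maps to model.cuda_device and is an integer.
    if env.get("CUDA_DEVICE"):
        merged.setdefault("model", {})["cuda_device"] = int(env["CUDA_DEVICE"])
    return merged
```

Passing `env` explicitly makes the function easy to test without changing the real environment.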

## Configuration Examples

### Minimal (Baseline)
```yaml
model:
  path: "./models/TianXi_Action_Grounding_7B"
  
data:
  screenspot_imgs: "./data/ScreenSpot-Pro/images"
  screenspot_test: "./data/ScreenSpot-Pro/annotations"

pipeline:
  method: "baseline"
```

### Full (ReGrounding + GPT)
```yaml
model:
  path: "./models/TianXi_Action_Grounding_7B"
  cuda_device: 0

data:
  screenspot_imgs: "./data/ScreenSpot-Pro/images"
  screenspot_test: "./data/ScreenSpot-Pro/annotations"
  task: "all"

pipeline:
  method: "reground_gpt"
  reground:
    crop_ratio: 0.2
  gpt:
    api_key: "YOUR_KEY"
    base_url: "https://openrouter.ai/api/v1"
    model: "openai/gpt-5"
```

## Validation

```bash
# Validate YAML syntax
python -c "import yaml; yaml.safe_load(open('config.yaml')); print('Valid YAML')"

# Check that the model path points to a loadable model config
python -c "
import yaml
from transformers import AutoConfig
with open('config.yaml') as f:
    config = yaml.safe_load(f)
AutoConfig.from_pretrained(config['model']['path'])
print('Model configuration valid')
"
```
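
The commands above do not check the dataset paths. As a complementary sketch, the hypothetical helper below reports which configured paths are missing on disk. It assumes the config has been parsed into a dictionary and that `model.path` is a local directory, as in the examples in this guide.

```python
from pathlib import Path


def missing_paths(config: dict) -> list[str]:
    """Hypothetical helper: list configured paths that are unset or absent."""
    model = config.get("model") or {}
    data = config.get("data") or {}
    candidates = {
        "model.path": model.get("path"),
        "data.screenspot_imgs": data.get("screenspot_imgs"),
        "data.screenspot_test": data.get("screenspot_test"),
    }
    return [
        f"{key}: {value}"
        for key, value in candidates.items()
        # A path is reported if the key is unset or nothing exists there.
        if value is None or not Path(value).exists()
    ]
```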