# Multi-Agent Maze Runner - Batch & Debug Mode Documentation

## Overview

The Multi-Agent Maze Runner includes a batch runner for executing experiments across many maze configurations and parameter combinations. This documentation covers **Batch Mode**, which executes configurations in parallel, and **Debug Mode**, which runs a single configuration in-process for development and debugging.

## Table of Contents

1. [Quick Start](#quick-start)
2. [Batch Mode](#batch-mode)
3. [Debug Mode](#debug-mode)
4. [Configuration Management](#configuration-management)
5. [Results and Output](#results-and-output)
6. [Examples](#examples-2)
7. [Troubleshooting](#troubleshooting)
8. [Advanced Usage](#advanced-usage)

## Quick Start

### Prerequisites

Ensure you have the virtual environment activated:

```bash
cd /myenv/multi-agent-maze-runner
source .venv/bin/activate
```

### Running Your First Batch

```bash
# Navigate to the workflow directory
cd orchestrator_maze_implementation

# Generate configuration files (if not already done)
python generate_configs.py

# Run batch mode with default settings
python batch_runner.py

# Run debug mode on a specific configuration
python batch_runner.py --debug --debug_config config_maze_05441a190ea44838_temp_0_0.yaml --config_dir generated_configs
```

## Batch Mode

Batch mode allows you to execute multiple configurations in parallel, making it ideal for:

- Large-scale experiments across different mazes
- Parameter sweeps (temperature variations)
- Performance benchmarking
- Reproducibility studies

### Basic Usage

```bash
python batch_runner.py [OPTIONS]
```

### Command Line Options

| Option         | Short | Default             | Description                                        |
| -------------- | ----- | ------------------- | -------------------------------------------------- |
| `--config_dir` | `-c`  | `generated_configs` | Directory containing configuration files           |
| `--parallel`   | `-p`  | `2`                 | Number of parallel executions                      |
| `--timeout`    | `-t`  | `900`               | Timeout in seconds for each execution (15 minutes) |
| `--output_dir` | `-o`  | `batch_results`     | Directory to save results                          |
| `--resume`     | `-r`  | `False`             | Resume from where previous run left off            |
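
For reference, the options table above could be expressed with `argparse` roughly as follows. This is a minimal sketch, not the actual parser in `batch_runner.py` (which is not shown here); the function name `build_parser` is hypothetical, and only the flags and defaults listed in the table are used.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options table."""
    parser = argparse.ArgumentParser(description="Batch runner (illustrative sketch)")
    parser.add_argument("-c", "--config_dir", default="generated_configs",
                        help="Directory containing configuration files")
    parser.add_argument("-p", "--parallel", type=int, default=2,
                        help="Number of parallel executions")
    parser.add_argument("-t", "--timeout", type=int, default=900,
                        help="Timeout in seconds for each execution")
    parser.add_argument("-o", "--output_dir", default="batch_results",
                        help="Directory to save results")
    # A flag without a value: absent means False, present means True
    parser.add_argument("-r", "--resume", action="store_true",
                        help="Resume from where the previous run left off")
    return parser


args = build_parser().parse_args(["--parallel", "4", "-t", "600"])
print(args.parallel, args.timeout, args.resume)  # 4 600 False
```

Modelling `--resume` as a `store_true` flag matches the `False` default in the table and the bare `--resume` usage in the examples.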

### Examples

```bash
# Run with custom parallelism and timeout
python batch_runner.py --parallel 4 --timeout 600

# Run with custom directories
python batch_runner.py --config_dir my_configs --output_dir my_results

# Resume a previous batch run
python batch_runner.py --resume

# High-performance batch run
python batch_runner.py --parallel 8 --timeout 300 --output_dir fast_results
```

### Batch Mode Features

- **Parallel Processing**: Uses `ProcessPoolExecutor` for true parallel execution
- **Timeout Management**: Prevents hanging executions with configurable timeouts
- **Progress Tracking**: Real-time progress updates and logging
- **Resume Capability**: Continue from where previous runs left off
- **Result Persistence**: Incremental saving of results
- **Error Handling**: Graceful handling of failed configurations
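
The combination of parallel workers and per-run timeouts listed above can be sketched as follows. This is an illustration under assumptions, not the code in `batch_runner.py`: the helper names `run_one` and `run_batch` are hypothetical, and each configuration is assumed to be launched as a separate command whose outcome is mapped onto the documented status values.

```python
import subprocess
import sys
import time
from concurrent.futures import ProcessPoolExecutor


def run_one(cmd, timeout):
    """Run one command and map the outcome onto the documented status values."""
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        status = "completed" if proc.returncode == 0 else "failed"
        return_code = proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception
        status, return_code = "timeout", None
    except OSError:
        # e.g. the executable could not be started at all
        status, return_code = "error", None
    return {"status": status, "return_code": return_code,
            "duration": round(time.monotonic() - start, 1)}


def run_batch(commands, parallel=2, timeout=900):
    """Run commands in up to `parallel` worker processes, keeping input order."""
    with ProcessPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(run_one, commands, [timeout] * len(commands)))


# A trivial command that exits with code 0
print(run_one([sys.executable, "-c", "pass"], timeout=30)["status"])
```

Because one hung run only occupies one worker until its timeout expires, the remaining workers keep draining the queue.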

## Debug Mode

Debug mode runs a single configuration in the same process, making it ideal for:

- Development and debugging
- Detailed error investigation
- Step-by-step execution analysis
- IDE integration and breakpoint debugging

### Basic Usage

```bash
python batch_runner.py --debug [OPTIONS]
```

### Debug Mode Options

| Option           | Description                            |
| ---------------- | -------------------------------------- |
| `--debug`        | Enable debug mode                      |
| `--debug_config` | Specific config file to run (optional) |
| `--config_dir`   | Directory containing debug configs     |
| `--output_dir`   | Output directory for debug results     |

### Examples

```bash
# Run debug mode with first available config
python batch_runner.py --debug

# Debug a specific configuration
python batch_runner.py --debug --debug_config config_maze_05441a190ea44838_temp_0_0.yaml

# Debug with custom directories
python batch_runner.py --debug --config_dir test_debug --output_dir debug_results

# Debug specific maze configuration
python batch_runner.py --debug --debug_config config_maze_1fbad51aa9164fa7_temp_0_1.yaml --config_dir generated_configs
```

### Debug Mode Features

- **Same Process Execution**: No subprocess isolation for easier debugging
- **Full Traceback Capture**: Detailed error information saved to files
- **IDE Integration**: Compatible with debuggers and IDEs
- **Immediate Feedback**: Real-time execution without timeout constraints
- **Development Friendly**: Easier to add breakpoints and inspect variables
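
Same-process execution with traceback capture might look like the sketch below. The function `run_in_process` and the stand-in `broken_task` are hypothetical; only the `traceback.txt` filename comes from this documentation.

```python
import tempfile
import traceback
from pathlib import Path


def run_in_process(task, output_dir):
    """Call `task()` in the current process; on failure save the traceback."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    try:
        task()
        return "completed"
    except Exception:
        # format_exc() returns the text Python would print for the active error
        (out / "traceback.txt").write_text(traceback.format_exc())
        return "failed"


def broken_task():
    raise ValueError("maze file missing")  # stand-in for a real failure


with tempfile.TemporaryDirectory() as tmp:
    status = run_in_process(broken_task, tmp)
    saved = (Path(tmp) / "traceback.txt").read_text()
    print(status, "ValueError" in saved)  # failed True
```

Since nothing runs in a subprocess, a breakpoint set inside `task` is hit directly by the debugger attached to the runner.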

## Configuration Management

### Generating Configurations

Use the configuration generator to create experiment matrices:

```bash
python generate_configs.py
```

This generates configurations for all combinations of:

- **20 different mazes** (various sizes and complexities)
- **6 temperature values** (0.0, 0.1, 0.3, 0.5, 0.7, 0.9)

Total: **120 configurations** for comprehensive experiments.
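
The experiment matrix is a plain Cartesian product of mazes and temperatures. The sketch below reproduces the filename pattern seen in this documentation (`config_maze_<ID>_temp_<T>.yaml`, with the decimal point written as `_`); the function name `config_filenames` is hypothetical and the real `generate_configs.py` may work differently.

```python
from itertools import product

TEMPERATURES = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9]


def config_filenames(maze_ids, temperatures=TEMPERATURES):
    """One filename per (maze, temperature) pair, e.g. ..._temp_0_3.yaml."""
    return [
        f"config_maze_{maze_id}_temp_{str(temp).replace('.', '_')}.yaml"
        for maze_id, temp in product(maze_ids, temperatures)
    ]


names = config_filenames(["05441a190ea44838", "1fbad51aa9164fa7"])
print(len(names))  # 2 mazes x 6 temperatures = 12
print(names[0])    # config_maze_05441a190ea44838_temp_0_0.yaml
```

With all 20 maze IDs the same call yields 20 x 6 = 120 filenames.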

### Configuration Structure

Each configuration file contains:

```yaml
# Generated configuration file
# Maze ID: 05441a190ea44838
# Temperature: 0.0

setup:
  recursion_limit: 50000
  num_turn_iterations: 1
  env_input: true

maze:
  uuid: 05441a190ea44838

visualization:
  figure_size: [16, 8]
  final_display_duration: 5
  message_history_limit: 8
  enable_maze_visualization: false

agents:
  global:
    num_execution_agents: 2
    steps_per_agent: 5
  execution:
    model: gpt-4.1-nano
    temperature: 0.0
  planning:
    model: gpt-4.1-nano
    temperature: 0.0
  orchestration:
    model: gpt-4.1-nano
    temperature: 0.0
```
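
Before launching a long batch, it can help to check that a configuration contains the sections shown above. The sketch below works on a config already loaded into a dictionary (for example with `yaml.safe_load`). Which keys the runner truly requires is an assumption; `REQUIRED_KEYS` and `missing_keys` are hypothetical names.

```python
# Key paths taken from the example configuration; treating them as
# mandatory is an assumption made for this sketch.
REQUIRED_KEYS = [
    ("setup", "recursion_limit"),
    ("maze", "uuid"),
    ("agents", "global", "num_execution_agents"),
    ("agents", "execution", "temperature"),
    ("agents", "planning", "temperature"),
    ("agents", "orchestration", "temperature"),
]


def missing_keys(config):
    """Return dotted paths from REQUIRED_KEYS that are absent from `config`."""
    missing = []
    for path in REQUIRED_KEYS:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


config = {"maze": {"uuid": "05441a190ea44838"},
          "agents": {"execution": {"temperature": 0.0}}}
print(missing_keys(config))
```

An empty list means every listed path is present.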

### Custom Configurations

You can create custom configurations by:

1. **Copying base config**: Use `config.yaml` as a template
2. **Modifying parameters**: Change maze UUIDs, temperatures, agent settings
3. **Organizing directories**: Place in appropriate config directories

### Available Mazes

The system includes 20 pre-built mazes of varying complexity:

- **Small mazes**: `05441a190ea44838`, `67ee1ca7f19c4c52`, `8d7dfe85c9f6446b`, etc.
- **Medium mazes**: `1220ba411466423d`, `1c6cea21f8a8487b`, `1fbad51aa9164fa7`, etc.
- **Large mazes**: `39eae295be8d43a2`, `5be500296a4546fc`, `8cd259c886874925`, etc.

## Results and Output

### Directory Structure

```
batch_results/                          # Default output directory
├── execution_results.json             # Summary of all executions
├── batch_runner.log                   # Detailed execution log
└── config_maze_[ID]_temp_[TEMP]/      # Individual config results
    ├── config.yaml                    # Copy of configuration used
    ├── result.json                    # Execution result metadata
    ├── stdout.txt                     # Standard output capture
    ├── stderr.txt                     # Error output capture
    └── traceback.txt                  # Full traceback (debug mode only)
```
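
Result directory names encode the maze ID and temperature, so both can be recovered for later analysis. The regular expression below is inferred from the names used in this documentation; `parse_config_name` is a hypothetical helper.

```python
import re

# Pattern inferred from names such as config_maze_05441a190ea44838_temp_0_0
NAME_RE = re.compile(
    r"config_maze_(?P<maze>[0-9a-f]+)_temp_(?P<whole>\d+)_(?P<frac>\d+)"
)


def parse_config_name(name):
    """Return (maze_id, temperature), or None if the name does not match."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    # "0_9" in the name stands for the temperature 0.9
    temperature = float(f"{match['whole']}.{match['frac']}")
    return match["maze"], temperature


print(parse_config_name("config_maze_39eae295be8d43a2_temp_0_9.yaml"))
# ('39eae295be8d43a2', 0.9)
```

The function accepts both directory names and `.yaml` filenames, since it only matches the prefix.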

### Result Metadata

Each `result.json` contains:

```json
{
  "config_file": "config_maze_05441a190ea44838_temp_0_0.yaml",
  "config_path": "/path/to/config.yaml",
  "output_dir": "/path/to/results",
  "start_time": "2024-01-15T10:30:00",
  "status": "completed",
  "duration": 245.7,
  "return_code": 0,
  "mode": "batch"
}
```

### Status Values

- `completed`: Successful execution
- `failed`: Execution failed with error
- `timeout`: Execution exceeded timeout limit
- `error`: Setup or system error
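
To get a quick status overview without opening each file, the per-config `result.json` files can be tallied. This sketch assumes the directory layout shown above and that every `result.json` holds a JSON object; `tally_statuses` is a hypothetical helper, and the `unreadable` label is introduced here for illustration rather than being a runner status.

```python
import json
from collections import Counter
from pathlib import Path


def tally_statuses(output_dir):
    """Count status values across <output_dir>/*/result.json files."""
    counts = Counter()
    for result_file in sorted(Path(output_dir).glob("*/result.json")):
        try:
            status = json.loads(result_file.read_text()).get("status", "unknown")
        except (json.JSONDecodeError, OSError):
            status = "unreadable"  # label used only by this sketch
        counts[status] += 1
    return counts
```

For example, `tally_statuses("batch_results")` returns a `Counter` keyed by status; a directory without any `result.json` files yields an empty `Counter`.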

### Execution Summary

After batch completion, you'll see a summary like:

```
==============================================================
BATCH EXECUTION SUMMARY
==============================================================
Total configurations: 120
✅ Completed successfully: 115
❌ Failed: 3
⏰ Timed out: 2
💥 Errors: 0
Total execution time: 3847.3s
Average time per config: 32.1s
Results directory: batch_results
==============================================================
```

## Examples

### Example 1: Quick Parameter Sweep

```bash
# Generate configs for temperature sweep
python generate_configs.py

# Run batch with moderate parallelism
python batch_runner.py --parallel 3 --timeout 600
```

### Example 2: Debug Specific Issue

```bash
# Debug a failing configuration
python batch_runner.py --debug --debug_config config_maze_39eae295be8d43a2_temp_0_9.yaml --output_dir debug_results

# Check the traceback
cat debug_results/config_maze_39eae295be8d43a2_temp_0_9/traceback.txt
```

### Example 3: High-Performance Batch

```bash
# Maximum parallelism for cluster execution
python batch_runner.py --parallel 16 --timeout 300 --output_dir cluster_results
```

### Example 4: Resume Interrupted Batch

```bash
# Resume from previous run
python batch_runner.py --resume --parallel 4
```
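
One plausible way for a resume step to decide what is left to run is sketched below. The exact rule used by `--resume` is not documented here; this sketch assumes that only entries with status `completed` are skipped, so failed and timed-out configurations run again. `pending_configs` is a hypothetical helper.

```python
def pending_configs(all_configs, previous_results):
    """Return configs with no 'completed' entry in the previous results."""
    done = {
        entry["config_file"]
        for entry in previous_results
        if entry.get("status") == "completed"
    }
    return [name for name in all_configs if name not in done]


previous = [
    {"config_file": "config_maze_05441a190ea44838_temp_0_0.yaml", "status": "completed"},
    {"config_file": "config_maze_05441a190ea44838_temp_0_1.yaml", "status": "timeout"},
]
configs = [
    "config_maze_05441a190ea44838_temp_0_0.yaml",
    "config_maze_05441a190ea44838_temp_0_1.yaml",
    "config_maze_05441a190ea44838_temp_0_3.yaml",
]
# The timed-out and the never-started config remain
print(pending_configs(configs, previous))
```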

### Example 5: Custom Configuration Testing

```bash
# Create test configs in custom directory
mkdir test_configs
cp config.yaml test_configs/test_config_1.yaml

# Run batch on custom configs
python batch_runner.py --config_dir test_configs --output_dir test_results
```

## Troubleshooting

### Common Issues

#### 1. Missing Dependencies

**Problem**: `ModuleNotFoundError: No module named 'numpy'`

**Solution**:

```bash
# Ensure virtual environment is activated
source .venv/bin/activate

# Install missing dependencies
pip install numpy matplotlib
```

#### 2. Configuration Not Found

**Problem**: `Debug config file 'config.yaml' not found`

**Solution**:

```bash
# List available configs
ls generated_configs/

# Use exact filename
python batch_runner.py --debug --debug_config config_maze_05441a190ea44838_temp_0_0.yaml
```

#### 3. Permission Errors

**Problem**: Cannot create output directories

**Solution**:

```bash
# Check permissions
ls -la batch_results/

# Create directory manually if needed
mkdir -p batch_results
```

#### 4. Timeout Issues

**Problem**: Configurations timing out frequently

**Solutions**:

```bash
# Increase timeout
python batch_runner.py --timeout 1800

# Use debug mode for investigation
python batch_runner.py --debug --debug_config [failing_config]
```

#### 5. Memory Issues

**Problem**: Out of memory errors with high parallelism

**Solution**:

```bash
# Reduce parallelism
python batch_runner.py --parallel 2

# Monitor with system tools
htop  # or top on macOS
```

### Debugging Workflow

1. **Start with debug mode**: Always debug individual configs first
2. **Check logs**: Review `batch_runner.log` for patterns
3. **Analyze results**: Use `execution_results.json` for overview
4. **Isolate issues**: Run problematic configs individually
5. **Adjust parameters**: Modify timeout, parallelism as needed
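
Steps 3 and 4 of this workflow can be partly automated: given result records, print a debug-mode command for every configuration that did not complete. This is a sketch that assumes records shaped like the `result.json` example (with `config_file` and `status` fields); `debug_commands` is a hypothetical helper that uses only the flags documented above.

```python
def debug_commands(results, config_dir="generated_configs"):
    """Build one debug-mode command per result that did not complete."""
    return [
        f"python batch_runner.py --debug --debug_config {entry['config_file']} "
        f"--config_dir {config_dir}"
        for entry in results
        if entry.get("status") != "completed"
    ]


results = [
    {"config_file": "config_maze_05441a190ea44838_temp_0_0.yaml", "status": "completed"},
    {"config_file": "config_maze_39eae295be8d43a2_temp_0_9.yaml", "status": "failed"},
]
for command in debug_commands(results):
    print(command)
```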

## Advanced Usage

### Custom Configuration Generation

Create your own config generator:

```python
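from pathlib import Path

import yaml  # PyYAML


def create_custom_config(maze_id, custom_params, config_dir="custom_configs"):
    """Write one config file for `maze_id` and return its path."""
    base_config = {
        'maze': {'uuid': maze_id},
        'agents': {
            'execution': {'temperature': custom_params['temp']},
            # ... other settings
        }
    }

    # Create the target directory on first use
    out_dir = Path(config_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    path = out_dir / f"custom_{maze_id}_{custom_params['temp']}.yaml"
    with open(path, 'w') as f:
        yaml.safe_dump(base_config, f)
    return path


# Example: one config for a small maze at temperature 0.3
create_custom_config('05441a190ea44838', {'temp': 0.3})

# Use with batch runner
# python batch_runner.py --config_dir custom_configs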
```

### Integration with Analysis Tools

```python
import json
import pandas as pd

# Load batch results for analysis
with open('batch_results/execution_results.json', 'r') as f:
    results = json.load(f)

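# Convert to DataFrame for analysis
# (assumes the file holds a list of records shaped like result.json)
df = pd.DataFrame(results)

# The records carry no maze ID field, so derive it from the config filename
df['maze_id'] = df['config_file'].str.extract(
    r'config_maze_([0-9a-f]+)_temp_', expand=False
)

# Analyze success rates by maze
success_by_maze = df.groupby('maze_id')['status'].apply(
    lambda x: (x == 'completed').mean()
)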

print(success_by_maze)
```

### Cluster Integration

For running on compute clusters:

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --time=04:00:00

source .venv/bin/activate
cd orchestrator_maze_implementation

python batch_runner.py --parallel 16 --timeout 600 --output_dir cluster_results
```

### Monitoring and Alerts

Set up monitoring for long-running batches:

```bash
# Run with logging to file
python batch_runner.py --parallel 4 2>&1 | tee batch_execution.log &

# Monitor progress
tail -f batch_results/batch_runner.log

# Check completion
grep "BATCH EXECUTION SUMMARY" batch_execution.log
```

---

## Support

For issues or questions:

1. Check the troubleshooting section above
2. Review logs in `batch_results/batch_runner.log`
3. Use debug mode to isolate problems
4. Check configuration file syntax and paths

**Happy experimenting!** 🚀🤖🧩
