# Calcudoko Game

Calcudoku is a mathematical logic game where players need to fill a grid with numbers so that each row and column contains no repeating digits, and the numbers within each region satisfy a given mathematical operation to reach a target value.

## Game Rules

1. Fill an n×n grid with numbers from 1 to n
2. Numbers cannot repeat in any row or column
3. The grid is divided into several regions (cages), each with an operation rule and a target value
4. Numbers within each region must satisfy the operation rule, resulting in the target value
5. Supported operations include:
   - Addition (+)
   - Subtraction (-)
   - Multiplication (×)
   - Division (÷)

## Example

```
+---+---+---+---+
|   |   |   |   |
| 3 | 2 | 1 | 4 |
|   |   |   |   |
+---+---+---+---+
|   |   |   |   |
| 4 | 1 | 2 | 3 |
|   |   |   |   |
+---+---+---+---+
|   |   |   |   |
| 2 | 3 | 4 | 1 |
|   |   |   |   |
+---+---+---+---+
|   |   |   |   |
| 1 | 4 | 3 | 2 |
|   |   |   |   |
+---+---+---+---+
```

# File Description
## Core Scripts
- calcudoko.py
  - Function: Generate Calcudoku game puzzles
  - Main class: CalcudokoGenerator
  - Key methods:
    - generate(): Generate game data
    - extract_answer(): Extract answers from model responses

- calcudoko_prompt.py
  - Function: Generate model prompts
  - Main class: CalcudokoPrompt
  - Key methods:
    - generate_prompt(): Generate prompts
    - generate_example(): Generate examples

- calcudoko_verifier.py
  - Function: Verify answer correctness
  - Main class: CalcudokoVerifier
  - Key methods:
    - verify(): Verify if the answer satisfies all rules
    - parse_grid_from_answer(): Extract grid data from model answers
    - extract_answer(): Extract answer format from model responses

## Run Script
- run.sh
  - Function: One-click execution of the complete process
  - Included steps:
    1. Automatically generate game data for three different grid sizes (4x4, 5x5, 6x6)
    2. Generate 1000 data entries for each size
    3. Data is automatically stored in corresponding directories
  - No parameters needed, just run directly

## Usage

### 1. Use the one-click script to generate all data

```bash
./run.sh
```

This will generate:
- 1000 entries of 4x4 grid data (stored in data/grid_size_4/puzzles_grid4_n1000.jsonl)
- 1000 entries of 5x5 grid data (stored in data/grid_size_5/puzzles_grid5_n1000.jsonl)
- 1000 entries of 6x6 grid data (stored in data/grid_size_6/puzzles_grid6_n1000.jsonl)

### 2. Generate data for a specific grid size separately

```bash
python scripts/calcudoko.py \
    --num_of_data 100 \
    --grid_size 4
```

Parameter description:
- `--num_of_data`: Number of data entries to generate
- `--grid_size`: Grid size (supports 4, 5, 6)

### Important Notes

1. The "answer" field in the generated test data is only a reference answer, not the only correct answer. For the same puzzle, there may be multiple valid solutions that satisfy all rules.
2. When verifying model answers, they are not compared with the reference answer, but checked by `CalcudokoVerifier` to ensure they satisfy all the following rules:
   - Sudoku rule: Each row and column contains numbers 1 to N without repetition
   - Region rule: Numbers within each region have no repetition
   - Operation rule: The result of the operation on numbers within each region equals the target value
   - Format rule: The answer must be a valid number matrix, and each element must be an integer from 1 to N
   - Dimension rule: The dimensions of the answer matrix must match the required grid size of the puzzle
3. As long as the answer provided by the model satisfies all the above rules, it will be considered correct.
4. Our evaluation system supports parsing answers in different formats; even if the model output format is not completely standard, the system will attempt to extract valid answers.