---
name: research-implement
description: "[Read when prompt contains /research-implement]"
metadata:
  {
    "agent-runtime":
      {
        "emoji": "💻",
        "requires": { "bins": ["python3", "uv"] },
      },
  }
---

# Research Implement

**Don't ask permission. Just do it.**

**Workspace:** `$W` = working directory provided in the task parameter.

## Prerequisites

| File | Source |
|------|--------|
| `$W/plan_res.md` | /research-plan |
| `$W/survey_res.md` | /research-survey |
| `$W/repos/` (optional) | reference code |

**If `plan_res.md` is missing, STOP:** "Run /research-plan first to produce the implementation plan."

## Output

| File | Content |
|------|---------|
| `$W/project/` | Complete runnable code. |
| `$W/ml_res.md` | Implementation report (with real execution results). |

---

## Workflow

### Step 1: Read the plan

Read `$W/plan_res.md` and extract:

- The full component list.
- Dataset information.
- Training parameters.

### Step 2: Create the project structure

```
$W/project/
  model/          # model components (one file per component)
  data/           # data loading
  training/       # training loop + loss
  testing/        # evaluation
  utils/          # utility functions
  run.py          # entry point (must print [RESULT] lines)
  requirements.txt
```

### Step 3: Implement the code

Implement in this order, validating immediately after each step:

**3a. requirements.txt** — list every dependency, pin major versions.

**3b. Data pipeline**
```bash
cd $W/project && uv venv .venv && source .venv/bin/activate
uv pip install -r requirements.txt
python3 -c "from data.dataset import *; print('data OK')"
```
Validate: import has no errors.

**3c. Model architecture**
```bash
python3 -c "from model import *; import torch; x = torch.randn(2, ...); print(model(x).shape)"
```
Validate: output shape is correct.

**3d. Loss + training loop**

**3e. Evaluation logic**

**3f. run.py** — must include:
```python
print(f"[RESULT] train_loss={train_loss:.6f}")
print(f"[RESULT] val_metric={val_metric:.6f}")
print(f"[RESULT] elapsed={elapsed:.1f}s")
print(f"[RESULT] device={device}")
```

### Step 4: Set up the environment + run

```bash
cd $W/project
uv venv .venv
source .venv/bin/activate

# Auto-detect dependency format
if [ -f "pyproject.toml" ]; then
    uv pip install -e .
elif [ -f "requirements.txt" ]; then
    uv pip install -r requirements.txt
fi

# 2-epoch validation
python3 run.py --epochs 2
```

### Step 5: Verify the execution result

**After running, you must:**

1. Read the full stdout/stderr.
2. Confirm that `[RESULT]` lines exist.
3. Confirm that the loss is not NaN/Inf.
4. Confirm that the loss shows a downward trend (even slight).

**If execution fails:**

- Read the error message.
- Fix the code.
- Run again.
- Retry at most 3 times.

### Step 6: Write the report

Write `$W/ml_res.md`:

```markdown
# Implementation Report

## Data Source
- Dataset: {name} — real / mock (reason)
- If mock: steps to obtain real data: [...]

## Components Implemented
- {module}: {description}

## Quick Validation Results (from execution log)
- Epochs: 2
- [RESULT] train_loss={copy from execution output}
- [RESULT] val_metric={copy from execution output}
- [RESULT] elapsed={copy from execution output}
- [RESULT] device={copy from execution output}

> The values above are quoted directly from the code-execution output.
> If any value cannot be verified from the execution log, mark it ⚠️ UNVERIFIED.

## Deviations from Plan
- {changes and why}

## Known Issues
- {issues}
```

---

## Critical Rules

1. **No fabricated results.** Every number must come from real execution output. If execution fails, report the failure.
2. **No global pip.** Always isolate with `uv venv`.
3. **No direct `import` from `repos/`** — adapt and rewrite.
4. **Mock data must be flagged** — `# MOCK DATA: <reason>` in the code, declared in the report.
5. **`run.py` must print `[RESULT]` lines**, and the report must quote them.
6. After 3 failed retries, write a failure report and stop.
