# SQL Generation Instructions

## CRITICAL: Output Format
Generate **ONLY** the SQL query:
- No markdown, no ```sql blocks
- No comments or explanations
- Single SQL statement ending with semicolon

## Hierarchical Decision Priority

### Level 1: Evidence is Absolute Law
When evidence provides information, use it EXACTLY:
- Column mappings: `X refers to ColumnName` → Use ColumnName
- Formulas: `calculation = FORMULA` → Implement FORMULA precisely
- Value mappings: `'value' is the ColumnName` → Use exact values
- Evidence overrides ALL other considerations

### Level 2: Schema Validation Reference
The analysis provides **EXACT column locations** to prevent "no such column" errors:
- Always check the "Column Location Reference" FIRST
- Use table.column format from the reference
- Never guess column locations - use the documented mappings

### Level 3: Value Validation
Use **EXACT values** from the samples:
- Match case precisely (e.g., 'Public' not 'public', 'nucleus' not 'Nucleus')
- Use the format shown in samples (e.g., date formats, numbers stored as strings)
- Empty string ('') vs NULL - use what's shown in the analysis
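
The following is a minimal sketch of why exact case and the NULL versus empty-string distinction matter. It uses Python's built-in `sqlite3` module with an in-memory database; the `schools` table and its values are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, not taken from any real schema.
conn.execute("CREATE TABLE schools (name TEXT, funding TEXT, district TEXT)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [("A", "Public", ""), ("B", "Public", None), ("C", "Private", "North")],
)

# '=' compares text case-sensitively by default, so the wrong case matches nothing.
exact = conn.execute(
    "SELECT COUNT(*) FROM schools WHERE funding = 'Public'"
).fetchone()[0]
wrong_case = conn.execute(
    "SELECT COUNT(*) FROM schools WHERE funding = 'public'"
).fetchone()[0]

# Empty string and NULL are different values and need different predicates.
empty = conn.execute("SELECT name FROM schools WHERE district = ''").fetchall()
null = conn.execute("SELECT name FROM schools WHERE district IS NULL").fetchall()

print(exact, wrong_case, empty, null)
```

A query that silently returns zero rows because of a case mismatch is harder to spot than a syntax error, which is why the samples should be copied verbatim.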

### Level 4: Query Templates
If templates are provided, adapt them:
- Match your query need to a template pattern
- Fill placeholders with validated values
- Use the working JOIN paths from templates

## Column Selection Rules

### Return EXACTLY What's Asked
Map question words to SELECT columns:
- "What is the X?" → `SELECT X`
- "List the Y" → `SELECT Y`
- "How many?" → `SELECT COUNT(*)`
- "Show both A and B" → `SELECT A, B`
- "Which X?" → `SELECT X` (identifier only)

### NEVER Add Extra Columns
- Don't add ID columns unless explicitly requested
- Don't add context columns unless asked
- Return only the requested information

## Special Character Handling

### Table Names with Special Characters
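- Tables with hyphens: Use backticks → `` `table-name` `` (SQLite also accepts double quotes → `"table-name"`)
- Tables with spaces: Use double quotes → `"Table Name"`
- Tables with dots: Check whether the dot separates schema and table (`schema.table`) or is part of the name and needs quoting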

### String Values
- Use single quotes for string literals: `'value'`
- Escape single quotes by doubling: `'O''Brien'`
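- Path separators: Check whether the samples use \\\\ or \\ and match them exactly (SQLite does not treat backslash as an escape character in string literals)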
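
As a sketch of the quoting rules above, the snippet below creates two hypothetical tables (`order-items` and `Order Details`) in an in-memory SQLite database and queries them with a string literal that contains a single quote. The names are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables whose names cannot be written unquoted.
conn.execute("CREATE TABLE `order-items` (customer TEXT)")
conn.execute('CREATE TABLE "Order Details" (customer TEXT)')

# The single quote inside the literal is escaped by doubling it.
conn.execute("INSERT INTO `order-items` VALUES ('O''Brien')")
conn.execute("""INSERT INTO "Order Details" VALUES ('O''Brien')""")

hyphen_rows = conn.execute(
    "SELECT customer FROM `order-items` WHERE customer = 'O''Brien'"
).fetchall()
space_rows = conn.execute(
    """SELECT customer FROM "Order Details" WHERE customer = 'O''Brien'"""
).fetchall()

print(hyphen_rows, space_rows)
```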

## Common SQL Patterns

### Aggregations
```sql
-- Count
COUNT(*) -- for "how many" questions
COUNT(DISTINCT column) -- when uniqueness matters

-- Percentage (check evidence for *100)
CAST(COUNT(CASE WHEN condition THEN 1 END) AS REAL) / NULLIF(COUNT(*), 0)

-- Or with multiplication
CAST(COUNT(CASE WHEN condition THEN 1 END) AS REAL) * 100 / NULLIF(COUNT(*), 0)

-- Other aggregations
SUM(column), AVG(column), MAX(column), MIN(column)
```
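
To make the percentage pattern concrete, here is a small sketch run against a hypothetical `orders` table. It shows the `CAST` version, the truncated result when the cast is omitted, and what `NULLIF` does on an empty table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 3 of 4 orders are shipped.
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "shipped"), (2, "shipped"), (3, "pending"), (4, "shipped")],
)

pct_sql = """
SELECT CAST(COUNT(CASE WHEN status = 'shipped' THEN 1 END) AS REAL) * 100
       / NULLIF(COUNT(*), 0)
FROM orders;
"""
percentage = conn.execute(pct_sql).fetchone()[0]

# Without the CAST both operands are integers and the ratio truncates.
truncated = conn.execute(
    "SELECT COUNT(CASE WHEN status = 'shipped' THEN 1 END) / COUNT(*) FROM orders"
).fetchone()[0]

# With no rows, NULLIF turns the zero denominator into NULL, so the result is NULL.
conn.execute("DELETE FROM orders")
empty_result = conn.execute(pct_sql).fetchone()[0]

print(percentage, truncated, empty_result)
```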

### JOINs
Always use table aliases and qualify columns:
```sql
SELECT t1.column
FROM table1 t1
JOIN table2 t2 ON t1.join_col = t2.join_col
WHERE condition
```
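
One reason to qualify every column is that joined tables often share column names. The sketch below assumes two hypothetical tables, `authors` and `books`, that both have a `name` column, and compares the qualified query with the unqualified one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: both tables contain a column called "name".
conn.executescript("""
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (book_id INTEGER PRIMARY KEY, author_id INTEGER, name TEXT);
INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO books VALUES (10, 1, 'Notes'), (11, 2, 'Compilers');
""")

# Aliases make it unambiguous which "name" is selected and which is filtered.
qualified = conn.execute("""
SELECT t2.name
FROM authors t1
JOIN books t2 ON t1.author_id = t2.author_id
WHERE t1.name = 'Ada';
""").fetchall()

# The unqualified reference is rejected by SQLite.
try:
    conn.execute(
        "SELECT name FROM authors t1 JOIN books t2 ON t1.author_id = t2.author_id"
    )
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(qualified, error)
```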

### Top-N Queries
```sql
SELECT column
FROM table
WHERE metric IS NOT NULL  -- Filter invalid values FIRST (use metric > 0 if zeros are placeholders)
ORDER BY metric DESC
LIMIT N
```
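
The filter matters most for "lowest" questions, because SQLite sorts NULL before every other value in ascending order. A minimal sketch, assuming a hypothetical `players` table in which one row has no score:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Bob has no recorded score.
conn.execute("CREATE TABLE players (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [("Ann", 90), ("Bob", None), ("Cy", 70), ("Di", 85)],
)

# Unfiltered: the NULL row sorts first and is returned as the "lowest".
unfiltered = conn.execute(
    "SELECT name FROM players ORDER BY score ASC LIMIT 1"
).fetchall()

# Filtered: invalid rows are removed before ranking.
filtered = conn.execute(
    "SELECT name FROM players WHERE score IS NOT NULL ORDER BY score ASC LIMIT 1"
).fetchall()

top_two = conn.execute(
    "SELECT name FROM players WHERE score IS NOT NULL ORDER BY score DESC LIMIT 2"
).fetchall()

print(unfiltered, filtered, top_two)
```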

### Subqueries for Complex Filters
```sql
SELECT columns
FROM table
WHERE column = (
    SELECT MAX(column)
    FROM table
    WHERE conditions
)
```
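
A sketch of the subquery pattern on a hypothetical `products` table follows. Two details are worth noting: the outer query repeats the subquery's condition so that rows from other categories cannot match, and, unlike `LIMIT 1`, the subquery form returns every row tied for the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data with a tie for the highest office price.
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Pen", "office", 2.5),
        ("Desk", "office", 120.0),
        ("Chair", "office", 120.0),
        ("Yacht", "leisure", 90000.0),
    ],
)

most_expensive_office = conn.execute("""
SELECT name
FROM products
WHERE category = 'office'
  AND price = (
    SELECT MAX(price)
    FROM products
    WHERE category = 'office'
)
ORDER BY name;
""").fetchall()

print(most_expensive_office)
```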

## SQLite-Specific Rules
- String concatenation: `||` (not CONCAT)
- Division: `CAST(numerator AS REAL) / NULLIF(denominator, 0)`
- LIKE is case-insensitive for ASCII characters by default; `=` comparisons are case-sensitive
- Date functions: `date()`, `strftime()`
- Conditionals: `CASE WHEN condition THEN value ELSE value END`
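
The rules above can be checked in a single statement against an in-memory database. This is only a sketch with literal values, no tables involved:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute("""
SELECT 'Ada' || ' ' || 'Lovelace',          -- concatenation with ||
       7 / 2,                               -- integer division truncates
       CAST(7 AS REAL) / NULLIF(2, 0),      -- real division
       CAST(7 AS REAL) / NULLIF(0, 0),      -- zero denominator becomes NULL
       'ABC' LIKE 'abc',                    -- LIKE ignores ASCII case
       'ABC' = 'abc',                       -- = does not
       strftime('%Y', '2021-03-15'),        -- date part as text
       CASE WHEN 5 > 3 THEN 'yes' ELSE 'no' END;
""").fetchone()

print(row)
```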

## Evidence Parsing Patterns

### Column References
- `X refers to Y` → Y is the column name
- `X is the Y` → Y is the column name
- `X = value` → WHERE X = value

### Calculation Indicators
- `DIVIDE(X, Y)` → `X / Y`, or `CAST(X AS REAL) / Y` when both operands are integers (integer division truncates)
- `percentage` → Usually `CAST(part AS REAL) * 100 / whole`
- `the oldest` → `MIN(Year)` or `ORDER BY Year ASC LIMIT 1`
- `the most` → `MAX(column)` or `ORDER BY column DESC LIMIT 1`
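
As an illustration of how these indicators translate into SQL, the sketch below applies two made-up evidence strings to a hypothetical `films` table. The evidence wording, table, and values are assumptions, not taken from any real dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with integer budget and gross columns.
conn.execute(
    "CREATE TABLE films (title TEXT, Year INTEGER, budget INTEGER, gross INTEGER)"
)
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?, ?)",
    [("A", 1999, 10, 25), ("B", 1987, 4, 9), ("C", 2005, 8, 4)],
)

# Hypothetical evidence: "return ratio = DIVIDE(gross, budget)"
# Both columns are integers, so the CAST form is needed.
ratios = conn.execute(
    "SELECT title, CAST(gross AS REAL) / budget FROM films ORDER BY title"
).fetchall()

# Hypothetical evidence: "the oldest refers to MIN(Year)"
oldest_year = conn.execute("SELECT MIN(Year) FROM films").fetchone()[0]
# When the question asks for the film itself, the ORDER BY form returns the row.
oldest_title = conn.execute(
    "SELECT title FROM films ORDER BY Year ASC LIMIT 1"
).fetchone()[0]

print(ratios, oldest_year, oldest_title)
```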

## Query Construction Process

1. **Parse Evidence First** - Extract ALL mappings and requirements
2. **Locate Columns** - Use Column Location Reference to find table.column
3. **Check Value Format** - Use exact case from samples
4. **Select Template** - Find matching pattern if available
5. **Build Query** - Combine all elements
6. **Verify Output** - Ensure the query returns ONLY the requested columns
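
The six steps can be walked through on a toy case. Everything below (the question, the evidence string, the `column_locations` mapping, and the `schools` table) is hypothetical and stands in for the real evidence and analysis output.

```python
import sqlite3

# Hypothetical inputs standing in for the question, evidence, and analysis.
question = "How many public schools are in Alameda?"
evidence = "public refers to FundingType = 'Public'"
column_locations = {
    "FundingType": "schools.FundingType",
    "County": "schools.County",
}

# Step 1: the evidence fixes the column (FundingType) and the exact value ('Public').
# Step 2: the location reference supplies table.column for each column.
# Step 3: values keep the exact case shown ('Public', 'Alameda').
# Step 4: no template is supplied in this toy case.
# Step 5: build the query.
# Step 6: "How many" asks for a count only, so COUNT(*) is the single output column.
sql = (
    "SELECT COUNT(*) FROM schools "
    f"WHERE {column_locations['FundingType']} = 'Public' "
    f"AND {column_locations['County']} = 'Alameda';"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (School TEXT, County TEXT, FundingType TEXT)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [
        ("X", "Alameda", "Public"),
        ("Y", "Alameda", "Public"),
        ("Z", "Alameda", "Private"),
        ("W", "Fresno", "Public"),
    ],
)
count = conn.execute(sql).fetchone()[0]

print(sql)
print(count)
```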

## Common Error Prevention

### Before Writing SQL, Check:
1. ✓ Column exists in the table (via Column Location Reference)
2. ✓ Table name is properly quoted if it contains special characters
3. ✓ Values match exact case from samples
4. ✓ NULL vs empty string matches the analysis
5. ✓ Only the requested columns are returned
6. ✓ GROUP BY lists every non-aggregated SELECT column

### Red Flags to Avoid:
- ✗ Guessing column names
- ✗ Adding unrequested columns
- ✗ Wrong case for values
- ✗ Mixing aggregated and non-aggregated columns without GROUP BY
- ✗ Unescaped special characters
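
The GROUP BY red flag is easy to miss in SQLite because the engine does not raise an error when GROUP BY is missing; it collapses the result to one row with an arbitrary value for the bare column. A sketch on a hypothetical `sales` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two regions.
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 5), ("west", 7)],
)

# Correct: one row per region.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Missing GROUP BY: no error, but a single row holding the grand total
# next to a region value picked from an arbitrary row.
ungrouped = conn.execute("SELECT region, SUM(amount) FROM sales").fetchall()

print(grouped, ungrouped)
```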

## Remember
- **Evidence first**, then validation references
- **Exact matches** for columns and values
- **Simple and correct** beats complex
- **Check everything** before outputting