# SQL Generation Instructions

## CRITICAL: Evidence is Your Primary Guide

**When evidence is provided, it overrides everything else:**
1. Parse evidence for column mappings: `X refers to ColumnName`
2. Use exact formulas: `calculation = FORMULA`
3. Apply value mappings: `'value' is the ColumnName`
4. Follow evidence literally, even if it seems suboptimal

## Output Requirements

Generate **ONLY** the SQL query:
- No markdown, comments, or explanations
- Single SQL statement ending with semicolon
- Clean, executable SQL
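
The requirements above can be checked mechanically. The following is a minimal sketch, assuming Python's standard `sqlite3` module is available; `is_clean_single_statement` is a hypothetical helper name, and the string checks are deliberately naive (they would also reject a semicolon or comment marker inside a string literal).

```python
import sqlite3

FENCE = "`" * 3  # markdown code fence marker


def is_clean_single_statement(sql: str) -> bool:
    """Hypothetical checker: True if `sql` looks like one bare SQL statement."""
    text = sql.strip()
    # Reject markdown fences and SQL comments (naive: also rejects these
    # sequences when they appear inside string literals).
    if FENCE in text or "--" in text or "/*" in text:
        return False
    # Exactly one semicolon, and it must be the final character.
    if text.count(";") != 1 or not text.endswith(";"):
        return False
    # complete_statement only checks for a terminating semicolon and closed
    # string literals; it does not validate the SQL syntax.
    return sqlite3.complete_statement(text)


print(is_clean_single_statement("SELECT name FROM players;"))  # True
print(is_clean_single_statement("SELECT name FROM players"))   # False
print(is_clean_single_statement("SELECT 1; SELECT 2;"))        # False
```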

## Column Selection Protocol

### Rule 1: Return EXACTLY What's Asked
- "List the X" → SELECT X
- "What is the Y?" → SELECT Y
- "Show me X and Y" → SELECT X, Y
- **NEVER** add ID columns unless explicitly requested
- **NEVER** add extra context columns

### Rule 2: Question Word Mapping
- "Which" + entity → return identifying column(s) only
- "How many" → COUNT(*) only (or COUNT(DISTINCT column) when the question counts unique values)
- "What percentage" → calculation result only
- "List all" → all relevant columns of the entity

## Template Usage

If templates are provided in the analysis:
1. Match your query pattern to a template
2. Fill placeholders with actual values
3. Adjust for specific requirements

Example template usage:
```
Template: SELECT {columns} FROM {table} WHERE {condition} ORDER BY {sort} LIMIT {n}
Question: Top 3 shortest players
Result: SELECT firstName, lastName FROM players WHERE height > 0 ORDER BY height ASC LIMIT 3;
```
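
Filling a template amounts to plain string substitution. The sketch below assumes the template placeholders map directly onto Python `str.format` fields; the slot values are the ones from the example, not a real schema, and a trailing semicolon is added so the result meets the output requirements.

```python
# Template from the example above, with a trailing semicolon appended.
TEMPLATE = (
    "SELECT {columns} FROM {table} WHERE {condition} "
    "ORDER BY {sort} LIMIT {n};"
)

# Slot values for "Top 3 shortest players" (illustrative names only).
slots = {
    "columns": "firstName, lastName",
    "table": "players",
    "condition": "height > 0",
    "sort": "height ASC",
    "n": 3,
}

query = TEMPLATE.format(**slots)
print(query)
```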

## Evidence Parsing Rules

### Column Mappings
- `X refers to ColumnName` → Use ColumnName for X
- `X is the ColumnName` → Use ColumnName for X
- `ColumnName = value` → WHERE ColumnName = value
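
As an illustration of the first two mapping phrasings, here is a rough sketch of extracting them from an evidence string. The regular expression and the function name `parse_column_mappings` are assumptions made for this example; real evidence may use qualified names such as `T1.column` or other phrasings that this pattern does not cover.

```python
import re

# Hypothetical pattern: "<term> refers to <Column>" or "<term> is the <Column>".
# Terms are assumed not to contain ';' or ',' (used here as clause separators).
MAPPING = re.compile(
    r"(?P<term>[^;,]+?)\s+(?:refers to|is the)\s+(?P<column>\w+)"
)


def parse_column_mappings(evidence: str) -> dict:
    """Return {term: column} for each mapping phrase found in `evidence`."""
    return {
        m["term"].strip(" '\""): m["column"]
        for m in MAPPING.finditer(evidence)
    }


evidence = "full name refers to FullName; 'F' is the Gender"
print(parse_column_mappings(evidence))
# {'full name': 'FullName', 'F': 'Gender'}
```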

### Calculation Formulas
- `calculation = FORMULA` → Use FORMULA exactly
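- `DIVIDE(X, Y)` → X / Y; write CAST(X AS REAL) / Y when both operands are integers, because integer division truncates
- `percentage` → Usually part * 100.0 / whole (multiply by 100.0 before dividing to avoid integer truncation)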
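
The reason for the CAST is easiest to see by running the expressions. This is a small sketch against an in-memory SQLite database, assuming SQLite semantics (other engines may treat integer division differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both operands are integers, so SQLite truncates the quotient.
int_div = conn.execute("SELECT 1 / 4").fetchone()[0]

# Casting one operand to REAL forces floating-point division.
real_div = conn.execute("SELECT CAST(1 AS REAL) / 4").fetchone()[0]

# Multiplying by 100.0 first has the same effect for percentages.
pct = conn.execute("SELECT 1 * 100.0 / 4").fetchone()[0]

print(int_div, real_div, pct)  # 0 0.25 25.0
```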

### Special Indicators
- `the oldest` → MIN(Year) or ORDER BY Year ASC LIMIT 1
- `the most` → MAX(column) or ORDER BY column DESC LIMIT 1
- `average` → AVG()
- `total` → SUM()

## Aggregation Patterns

### Simple Aggregations
```sql
COUNT(*) -- for "how many"
SUM(column) -- for "total"
AVG(column) -- for "average"
MAX(column) -- for "maximum"
MIN(column) -- for "minimum"
```

### Percentage Calculations
```sql
-- Pattern 1: Percentage of total
(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM table_name))

-- Pattern 2: Conditional percentage
(SUM(CASE WHEN condition THEN 1 ELSE 0 END) * 100.0 / COUNT(*))

-- Pattern 3: Direct division
CAST(numerator AS REAL) * 100 / denominator
```
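
To show that Patterns 1 and 2 agree, the sketch below runs both against a tiny in-memory SQLite table. The `orders` table and its `status` values are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (status TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [("shipped",), ("shipped",), ("pending",), ("shipped",)],
)

# Pattern 1: filtered count divided by the total from a subquery.
pattern1 = conn.execute(
    "SELECT COUNT(*) * 100.0 / (SELECT COUNT(*) FROM orders) "
    "FROM orders WHERE status = 'shipped'"
).fetchone()[0]

# Pattern 2: conditional sum divided by the row count in one pass.
pattern2 = conn.execute(
    "SELECT SUM(CASE WHEN status = 'shipped' THEN 1 ELSE 0 END) * 100.0 "
    "/ COUNT(*) FROM orders"
).fetchone()[0]

print(pattern1, pattern2)  # 75.0 75.0
```

Pattern 2 scans the table once and needs no subquery, which is why it is usually the simpler choice when the condition and the total come from the same rows.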

### Top-N Queries
```sql
-- For measurements, filter invalid values first
WHERE column > 0  -- excludes NULLs and zero placeholders; IS NOT NULL would keep zeros
ORDER BY column [ASC|DESC]
LIMIT n
```
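
The difference between the two filters shows up as soon as the data contains a zero placeholder. The following sketch uses an invented `players` table in an in-memory SQLite database, where `0` stands for an unrecorded height.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, height INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    # Bob's 0 and Di's NULL both mean "height unknown" in this sample data.
    [("Ann", 170), ("Bob", 0), ("Cy", 165), ("Di", None), ("Ed", 180)],
)

TOP_N = "SELECT name FROM players WHERE {flt} ORDER BY height ASC LIMIT 2"

with_gt = [r[0] for r in conn.execute(TOP_N.format(flt="height > 0"))]
with_not_null = [
    r[0] for r in conn.execute(TOP_N.format(flt="height IS NOT NULL"))
]

print(with_gt)        # ['Cy', 'Ann']
print(with_not_null)  # ['Bob', 'Cy']
```

With `IS NOT NULL`, the placeholder row for Bob is returned as the "shortest" player, which is rarely what the question intends.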

## JOIN Patterns

### Standard Join
```sql
SELECT t1.column
FROM table1 t1
JOIN table2 t2 ON t1.key = t2.key
WHERE condition
```

### Multi-table Join Chain
```sql
FROM table1 t1
JOIN table2 t2 ON t1.key = t2.key
JOIN table3 t3 ON t2.key = t3.key
```
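
A join chain can be tried end to end on a small schema. The sketch below assumes three hypothetical tables (`players`, `teams`, `leagues`) linked by foreign keys, and returns only the requested column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.executescript("""
CREATE TABLE leagues (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT, league_id INTEGER);
CREATE TABLE players (name TEXT, team_id INTEGER);
INSERT INTO leagues VALUES (1, 'East'), (2, 'West');
INSERT INTO teams VALUES (10, 'Hawks', 1), (20, 'Owls', 2);
INSERT INTO players VALUES ('Ann', 10), ('Bob', 20), ('Cy', 10);
""")

# players -> teams -> leagues: each JOIN links to the previous table's key.
rows = conn.execute("""
    SELECT p.name
    FROM players p
    JOIN teams t ON p.team_id = t.id
    JOIN leagues l ON t.league_id = l.id
    WHERE l.name = 'East'
    ORDER BY p.name
""").fetchall()

names = [r[0] for r in rows]
print(names)  # ['Ann', 'Cy']
```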

## Common Pitfalls to Avoid

1. **Column Over-selection**: Return ONLY requested columns
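2. **NULL vs Zero**: For measurements, use `> 0` rather than `IS NOT NULL`; it excludes both NULLs and zero placeholders
3. **Missing CAST**: Use CAST(x AS REAL) when dividing integers; otherwise the result is truncated
4. **Wrong Aggregation Scope**: GROUP BY the entity the question asks about, and filter aggregated values with HAVING rather than WHERE
5. **Ignoring Evidence**: Always prioritize evidence mappings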

## Query Construction Process

1. **Parse Evidence First** - Extract all mappings and formulas
2. **Identify Requested Columns** - What exactly should be returned?
3. **Match to Template** - Find appropriate pattern if provided
4. **Apply Evidence Mappings** - Replace placeholders with actual values
5. **Verify Column Selection** - Ensure ONLY requested columns returned

## Remember

- **Evidence is law** - Follow it exactly
- **Precision over complexity** - Simple, correct queries
- **Templates are guides** - Adapt as needed
- **Return only what's asked** - No extra columns