# SQL Generation Instructions - Evidence-First Execution

## THE GOLDEN RULE
**Evidence is LAW. Question defines OUTPUT. Nothing else matters.**

## Evidence Compliance (MANDATORY)

### Rule 1: Evidence ALWAYS Overrides Intuition
When evidence says something, USE IT EXACTLY:
```sql
-- Evidence: "more than 3 awards refers to count(award) > 3"
-- Question: "coaches who received more than 1 award"
-- WRONG: COUNT(award) > 1  (following question)
-- CORRECT: COUNT(award) > 3  (following evidence)
```
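
A worked version of this rule may make it more concrete. The sketch below assumes SQLite and a hypothetical table `demo_awards_coaches` with made-up rows; none of these names come from a real schema.

```sql
-- Hypothetical sample data; table and column names are illustrative only.
CREATE TABLE demo_awards_coaches (coachID TEXT, award TEXT);
INSERT INTO demo_awards_coaches VALUES
  ('c1', 'A'), ('c1', 'B'),
  ('c2', 'A'), ('c2', 'B'), ('c2', 'C'), ('c2', 'D');

-- Threshold taken from the evidence: count(award) > 3
SELECT coachID
FROM demo_awards_coaches
GROUP BY coachID
HAVING COUNT(award) > 3;
-- Returns only c2 (4 awards); c1 (2 awards) is excluded.
```

With the question's threshold (`> 1`) both coaches would be returned, which is the mismatch this rule is meant to prevent.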

### Rule 2: Evidence Defines Column Mappings
```sql
-- Evidence: "full name refers to f_name, l_name"
-- WRONG: f_name || ' ' || l_name AS full_name
-- CORRECT: f_name, l_name
```

### Rule 3: Evidence Formulas Are Exact
```sql
-- Evidence: "percentage = Divide(Count(X), Count(Y)) * 100"
-- Translate it operand for operand: CAST(COUNT(X) AS REAL) * 100 / COUNT(Y)
-- The CAST only prevents integer division; never change the operands or operations
```
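
The reason for the `CAST` can be seen in a small sketch. It assumes SQLite, where dividing two integers truncates, and uses a hypothetical table `demo_formula` with made-up values.

```sql
-- Hypothetical sample data: x is non-NULL in 1 row, y is non-NULL in 3 rows.
CREATE TABLE demo_formula (x INTEGER, y INTEGER);
INSERT INTO demo_formula VALUES (1, 1), (NULL, 2), (NULL, 3);

-- Integer arithmetic truncates the result.
SELECT COUNT(x) * 100 / COUNT(y) FROM demo_formula;
-- 33

-- Casting one operand to REAL keeps the fractional part.
SELECT CAST(COUNT(x) AS REAL) * 100 / COUNT(y) FROM demo_formula;
-- approximately 33.33
```

Both queries use the same operands as the evidence formula; only the numeric type differs.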

## Output Type Rules (STRICT)

### Determine Return Type From Question Keywords

#### "What is the percentage..."
```sql
-- ALWAYS return a single percentage value
SELECT CAST(COUNT(CASE WHEN condition THEN 1 END) AS REAL) * 100 / COUNT(*)
```

#### "How many..."
```sql
-- ALWAYS return COUNT
SELECT COUNT(*)  -- or COUNT(DISTINCT ...) if uniqueness implied
```
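
As a rough illustration of when the two forms differ, the sketch below assumes SQLite and a hypothetical `demo_orders` table with made-up rows.

```sql
-- Hypothetical sample data: customer 10 placed two orders.
CREATE TABLE demo_orders (order_id INTEGER, customer_id INTEGER);
INSERT INTO demo_orders VALUES (1, 10), (2, 10), (3, 20);

-- "How many orders were placed?" counts rows.
SELECT COUNT(*) FROM demo_orders;
-- 3

-- "How many customers placed an order?" implies uniqueness.
SELECT COUNT(DISTINCT customer_id) FROM demo_orders;
-- 2
```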

#### "List..." or "What are..."
```sql
-- Return actual values, NOT counts
SELECT column_values  -- NOT COUNT(*)
```

#### "Describe..." or "Give me X and Y"
```sql
-- Return EXACTLY the columns mentioned
-- If evidence says "full name = f_name, l_name"
SELECT f_name, l_name  -- NOT concatenated
```

## Column Selection Precision

### NEVER Add Helpful Extras
```sql
-- Question: "List the itunes_url"
-- WRONG: SELECT title, itunes_url  (adding title)
-- CORRECT: SELECT itunes_url
```

### Respect Column Format Requests
```sql
-- "full names" with evidence "f_name, l_name"
SELECT f_name, l_name  -- Two columns

-- "full name" without specific evidence
SELECT f_name || ' ' || l_name AS full_name  -- One column
```
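
The difference in result shape can be sketched as follows, assuming SQLite and a hypothetical `demo_people` table with a single made-up row.

```sql
-- Hypothetical sample data.
CREATE TABLE demo_people (f_name TEXT, l_name TEXT);
INSERT INTO demo_people VALUES ('Ada', 'Lovelace');

-- Evidence maps "full name" to f_name, l_name: two columns.
SELECT f_name, l_name FROM demo_people;
-- One row with two columns: 'Ada', 'Lovelace'

-- No evidence mapping: one concatenated column.
SELECT f_name || ' ' || l_name AS full_name FROM demo_people;
-- One row with one column: 'Ada Lovelace'
```

The two results hold the same information but differ in column count.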

### Use DISTINCT Appropriately
- Use DISTINCT when listing unique entities
- When counting, use plain COUNT; use COUNT(DISTINCT ...) only when uniqueness is implied
- Use GROUP BY when the evidence itself specifies a GROUP BY
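
A minimal sketch of the listing case, assuming SQLite and a hypothetical `demo_appearances` table in which one player appears in two seasons:

```sql
-- Hypothetical sample data: p1 has two rows, p2 has one.
CREATE TABLE demo_appearances (playerID TEXT, year INTEGER);
INSERT INTO demo_appearances VALUES ('p1', 1997), ('p1', 1998), ('p2', 1997);

-- Without DISTINCT, p1 is listed once per season (3 rows in total).
SELECT playerID FROM demo_appearances;

-- With DISTINCT, each player is listed once (2 rows: p1 and p2).
SELECT DISTINCT playerID FROM demo_appearances;
```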

## Aggregation Context Rules

### Per-Entity Aggregation
```sql
-- "throughout his career" or "overall" → SUM/AVG across all records
SELECT playerID, SUM(points)
FROM table_name
GROUP BY playerID
```
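
To make the per-entity pattern concrete, here is a small sketch. It assumes SQLite and a hypothetical `demo_player_seasons` table with made-up season totals.

```sql
-- Hypothetical sample data: one row per player per season.
CREATE TABLE demo_player_seasons (playerID TEXT, year INTEGER, points INTEGER);
INSERT INTO demo_player_seasons VALUES
  ('p1', 1997, 10), ('p1', 1998, 15),
  ('p2', 1997, 7);

-- "throughout his career": no year filter, aggregate every season per player.
SELECT playerID, SUM(points) AS career_points, AVG(points) AS avg_points
FROM demo_player_seasons
GROUP BY playerID
ORDER BY playerID;
-- p1 | 25 | 12.5
-- p2 | 7  | 7.0
```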

### Single Record Selection
```sql
-- "the most" or "the highest" → ORDER BY ... DESC LIMIT 1
-- "the least" or "the lowest" → ORDER BY ... ASC LIMIT 1
SELECT ... ORDER BY metric DESC LIMIT 1
```
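
The aggregation level changes which single record wins, as the following sketch suggests. It assumes SQLite and a hypothetical `demo_scoring` table with made-up values.

```sql
-- Hypothetical sample data: one row per player per season.
CREATE TABLE demo_scoring (playerID TEXT, year INTEGER, points INTEGER);
INSERT INTO demo_scoring VALUES
  ('p1', 1997, 10), ('p1', 1998, 15),
  ('p2', 1997, 20);

-- "the most points in a season": rank individual rows.
SELECT playerID FROM demo_scoring
ORDER BY points DESC LIMIT 1;
-- p2 (20 points in 1997)

-- "the most points overall": aggregate per player first, then rank.
SELECT playerID FROM demo_scoring
GROUP BY playerID
ORDER BY SUM(points) DESC LIMIT 1;
-- p1 (25 points in total)
```

Note that `LIMIT 1` returns a single row even when several rows tie for first place.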

### Temporal Aggregation
```sql
-- "in 1997" → WHERE year = 1997
-- "from 1990 to 2000" → WHERE year BETWEEN 1990 AND 2000
-- "throughout allstar appearances" → No year filter, aggregate all
```

## Date Format Handling

### Detect Format From Data
```sql
-- MM/DD/YYYY format: "8/29/2013"
WHERE date LIKE '8/%/2013'  -- for August 2013

-- YYYY-MM-DD format: "2013-08-29"
WHERE date LIKE '2013-08-%'  -- for August 2013

-- With timestamp: "8/29/2013 14:24"
WHERE date LIKE '8/29/2013%'  -- for specific date
```
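
One pitfall with these patterns is a trailing timestamp, since `LIKE` must match the whole string. The sketch below assumes SQLite and a hypothetical `demo_trips` table with made-up values stored as text.

```sql
-- Hypothetical sample data: M/D/YYYY dates followed by a time.
CREATE TABLE demo_trips (id INTEGER, start_date TEXT);
INSERT INTO demo_trips VALUES
  (1, '8/29/2013 14:24'),
  (2, '8/5/2013 9:00'),
  (3, '9/1/2013 8:00');

-- The pattern ends at the year, so the time suffix prevents any match.
SELECT COUNT(*) FROM demo_trips WHERE start_date LIKE '8/%/2013';
-- 0

-- A trailing % absorbs the time portion.
SELECT COUNT(*) FROM demo_trips WHERE start_date LIKE '8/%/2013%';
-- 2
```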

### "After" vs "In" Interpretation
```sql
-- "after August 2013" means September 2013 onwards (August itself is excluded)
-- "in August 2013" means only August 2013
-- Be precise about boundaries
```
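
The boundaries above can be sketched with range comparisons. This assumes SQLite, dates stored as `YYYY-MM-DD` text, and a hypothetical `demo_events` table. Plain string comparison is only chronological for that ISO-style format; with `M/D/YYYY` text, for instance, `'9/1/2013'` sorts after `'10/1/2013'`.

```sql
-- Hypothetical sample data in YYYY-MM-DD format.
CREATE TABLE demo_events (event_date TEXT);
INSERT INTO demo_events VALUES
  ('2013-07-30'), ('2013-08-15'), ('2013-08-31'), ('2013-09-01');

-- "in August 2013": half-open range covering only that month.
SELECT COUNT(*) FROM demo_events
WHERE event_date >= '2013-08-01' AND event_date < '2013-09-01';
-- 2

-- "after August 2013": starts on the first day of September.
SELECT COUNT(*) FROM demo_events
WHERE event_date >= '2013-09-01';
-- 1
```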

## Join Rules

### Always Verify Join Paths
```sql
-- Use all intermediate tables when needed
FROM table1
JOIN table2 ON table1.id = table2.table1_id
JOIN table3 ON table2.id = table3.table2_id
```

### Temporal Joins Need Year Matching
```sql
-- For sports/temporal data, match on the year as well as the team ID
FROM table1 AS t1
JOIN table2 AS t2 ON t1.tmID = t2.tmID AND t1.year = t2.year
```
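
Omitting the year condition tends to multiply rows, as this sketch shows. It assumes SQLite and two hypothetical tables, `demo_teams` and `demo_coaches`, with made-up values.

```sql
-- Hypothetical sample data: the same team appears once per season.
CREATE TABLE demo_teams (tmID TEXT, year INTEGER, won INTEGER);
CREATE TABLE demo_coaches (coachID TEXT, tmID TEXT, year INTEGER);
INSERT INTO demo_teams VALUES ('LAL', 1997, 56), ('LAL', 1998, 31);
INSERT INTO demo_coaches VALUES ('c1', 'LAL', 1997);

-- Joining on tmID alone pairs the 1997 coach with every season of the team.
SELECT c.coachID, t.year, t.won
FROM demo_coaches AS c
JOIN demo_teams AS t ON c.tmID = t.tmID
ORDER BY t.year;
-- c1 | 1997 | 56
-- c1 | 1998 | 31   (wrong season)

-- Adding the year condition keeps only the matching season.
SELECT c.coachID, t.year, t.won
FROM demo_coaches AS c
JOIN demo_teams AS t ON c.tmID = t.tmID AND c.year = t.year;
-- c1 | 1997 | 56
```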

## Common Patterns to Remember

### Percentage of Subset
```sql
-- "percentage of X that are Y"
SELECT CAST(COUNT(CASE WHEN Y THEN 1 END) AS REAL) * 100 / COUNT(*)
```
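
A filled-in instance of this pattern, assuming SQLite and a hypothetical `demo_businesses` table in which one of four rows satisfies the condition:

```sql
-- Hypothetical sample data.
CREATE TABLE demo_businesses (business_id INTEGER, active TEXT);
INSERT INTO demo_businesses VALUES
  (1, 'true'), (2, 'false'), (3, 'false'), (4, 'false');

-- "percentage of businesses that are active"
-- CASE without ELSE yields NULL for non-matching rows, which COUNT ignores.
SELECT CAST(COUNT(CASE WHEN active = 'true' THEN 1 END) AS REAL) * 100 / COUNT(*) AS pct
FROM demo_businesses;
-- 25.0
```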

### Difference Calculations
```sql
-- "difference between A and B"
SELECT COUNT(CASE WHEN type='A' THEN 1 END) -
       COUNT(CASE WHEN type='B' THEN 1 END)
```
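
The same pattern on concrete data, assuming SQLite and a hypothetical `demo_items` table with three rows of type A and one of type B:

```sql
-- Hypothetical sample data.
CREATE TABLE demo_items (type TEXT);
INSERT INTO demo_items VALUES ('A'), ('A'), ('A'), ('B');

-- Both conditional counts are computed in a single pass over the table.
SELECT COUNT(CASE WHEN type = 'A' THEN 1 END) -
       COUNT(CASE WHEN type = 'B' THEN 1 END) AS diff
FROM demo_items;
-- 2
```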

### Running Totals
```sql
-- "cumulative" or "running total"
SELECT SUM(value) OVER (ORDER BY date)
```
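
A short sketch of the window form follows. It assumes SQLite 3.25 or later (the first release with window functions) and a hypothetical `demo_sales` table with one made-up row per date.

```sql
-- Hypothetical sample data: unique, sortable dates.
CREATE TABLE demo_sales (sale_date TEXT, value INTEGER);
INSERT INTO demo_sales VALUES
  ('2013-01-01', 5), ('2013-01-02', 3), ('2013-01-03', 2);

-- Each row's total includes all rows up to and including its own date.
SELECT sale_date, SUM(value) OVER (ORDER BY sale_date) AS running_total
FROM demo_sales
ORDER BY sale_date;
-- 2013-01-01 | 5
-- 2013-01-02 | 8
-- 2013-01-03 | 10
```

With the default window frame, rows that share the same `ORDER BY` value receive the same running total, so a tie-breaking column may be needed when dates repeat.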

## Validation Checklist

Before submitting SQL, verify:

1. ✓ **Evidence Check**: Are ALL evidence constraints applied EXACTLY?
2. ✓ **Output Check**: Does SELECT match what the question asks for?
3. ✓ **Column Check**: Are ONLY the requested columns returned, in the requested format?
4. ✓ **Aggregation Check**: Is the aggregation level correct?
5. ✓ **Join Check**: Are all necessary joins included?

## Critical Reminders

- **Evidence > Question** for WHERE conditions
- **Question > Evidence** for SELECT columns (unless evidence specifies)
- **Never concatenate** when separate columns are requested
- **Always calculate percentages** when "percentage" is mentioned
- **Return counts** when "how many" is asked
- **List values** when "list" is requested
- **One result** when "the most/least" is mentioned

## Final Rule

When in doubt:
1. Check evidence for constraints → Apply them EXACTLY
2. Check question for output format → Return EXACTLY that
3. Nothing else matters