# SQL Generation Instructions

## Core Requirements

Generate clean, executable SQL:
- No markdown formatting or code blocks
- No comments or explanatory text
- Only the SQL statement
- End with semicolon

## Progressive SQL Building Process

When generating SQL, follow these steps mentally:

### Step 1: Identify Core Elements
- Which tables contain the requested data?
- What columns are explicitly mentioned?
- What relationships exist between tables?

### Step 2: Parse Evidence Carefully
When evidence is provided:
- Extract exact column names (e.g., "department refers to organ")
- Identify formulas exactly as written
- Note specific values and their formats
- Use evidence column names even if they seem wrong
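As a hedged illustration of the evidence rules above, the sketch below assumes a hypothetical `employees` table in which the department is stored in a column named `organ`, matching the "department refers to organ" example. The table name, rows, and question are invented for illustration only.

```python
import sqlite3

# Hypothetical schema: the evidence says "department refers to organ",
# so the filter uses the column `organ`, not a guessed `department`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, organ TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", "Sales"), ("Ben", "Legal"), ("Cy", "Sales")],
)

# Question (assumed): "How many employees are in the Sales department?"
evidence_sql = "SELECT COUNT(*) FROM employees AS e WHERE e.organ = 'Sales';"
sales_count = conn.execute(evidence_sql).fetchone()[0]
print(sales_count)
```

The generated statement is only the string in `evidence_sql`; the surrounding Python exists to show that it runs.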

### Step 3: Build SQL Components
1. **SELECT clause**: Return ONLY what's requested
   - "What is X?" → SELECT X
   - "How many?" → SELECT COUNT(*)
   - "List Y" → SELECT Y (nothing else)

2. **FROM clause**: Start with primary table

3. **JOIN clauses**: Use proper syntax
   - Always use table aliases
   - Qualify ALL column names with aliases
   - Follow the join paths from the analysis

4. **WHERE clause**: Apply filters precisely
   - Use exact values from database (case-sensitive)
   - Apply evidence formulas literally
   - Add IS NOT NULL on the sort column when ordering by a nullable column (SQLite sorts NULLs first in ascending order)

5. **GROUP BY**: Required when aggregates are mixed with non-aggregated columns
   - Include all non-aggregated SELECT columns
   - Place it after WHERE and before ORDER BY

6. **ORDER BY and LIMIT**: Final result shaping
   - Use DESC for "highest", "most", "maximum"
   - Use ASC for "lowest", "least", "minimum"
   - LIMIT 1 for single result requests
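
The component steps above can be sketched end to end. This is a minimal example under assumed names: the `customers` and `orders` tables, their columns, and both questions are hypothetical. It shows aliases on every column, GROUP BY before ORDER BY, and why IS NOT NULL matters when sorting a nullable column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, NULL), (3, 2, 20.0);
""")

# Question (assumed): "Which customer has the most orders?"
# SELECT returns only the name; the count is used for ordering, not output.
most_orders_sql = (
    "SELECT c.name "
    "FROM customers AS c "
    "JOIN orders AS o ON o.customer_id = c.id "
    "GROUP BY c.name "
    "ORDER BY COUNT(o.id) DESC "
    "LIMIT 1;"
)
top_customer = conn.execute(most_orders_sql).fetchone()[0]

# Question (assumed): "Which order has the lowest amount?"
# Without the IS NOT NULL filter, the NULL amount sorts first in ASC order.
lowest_sql = (
    "SELECT o.id FROM orders AS o "
    "WHERE o.amount IS NOT NULL "
    "ORDER BY o.amount ASC LIMIT 1;"
)
lowest_order = conn.execute(lowest_sql).fetchone()[0]
unfiltered_order = conn.execute(
    "SELECT o.id FROM orders AS o ORDER BY o.amount ASC LIMIT 1;"
).fetchone()[0]

print(top_customer, lowest_order, unfiltered_order)
```

In this toy data the unfiltered query picks the order whose amount is NULL, which is rarely what a "lowest" question intends.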

## Critical Rules

### Value Matching
- **Case matters**: 'JOHN' ≠ 'John' ≠ 'john'
- Use exact values from the database analysis
- String values need single quotes; numbers don't
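
A small sketch of why exact casing matters with `=`, assuming a hypothetical `people` table and SQLite's default BINARY collation (no `COLLATE NOCASE` on the column):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany("INSERT INTO people VALUES (?)", [("John",), ("JOHN",)])

def count_where_name_equals(value_literal: str) -> int:
    # value_literal is a trusted, hard-coded SQL literal in this demo only.
    sql = f"SELECT COUNT(*) FROM people AS p WHERE p.name = {value_literal};"
    return conn.execute(sql).fetchone()[0]

exact_match = count_where_name_equals("'John'")   # matches the stored casing
wrong_case = count_where_name_equals("'john'")    # no row stores this casing
print(exact_match, wrong_case)
```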

### Column Selection
- Return ONLY requested columns
- Never add extra "helpful" columns
- A count query returns just the count, not the rows being counted

### Evidence Supremacy
When evidence is provided, it overrides everything:
- Use evidence column names exactly
- Apply evidence formulas precisely
- Evidence relationships supersede schema assumptions

### SQLite Specific Syntax
- String concatenation: `||` operator
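- Date extraction: `strftime('%Y', date_column)` (returns text such as '2021', so compare it to a quoted string)
- `LIKE` is case-insensitive by default for ASCII characters only; `=` remains case-sensitive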
- Quote identifiers with spaces using double quotes
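
The four SQLite points above can be checked in one query. This is a sketch against a hypothetical `staff` table whose column names contain spaces; the names and the row are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Identifiers with spaces are wrapped in double quotes.
conn.execute('CREATE TABLE staff ("first name" TEXT, "last name" TEXT, hired TEXT)')
conn.execute("INSERT INTO staff VALUES ('Ada', 'Lovelace', '2021-03-15')")

syntax_sql = (
    "SELECT s.\"first name\" || ' ' || s.\"last name\", "  # || concatenates strings
    "strftime('%Y', s.hired) "                             # year as text
    "FROM staff AS s "
    "WHERE s.\"last name\" LIKE 'lovelace';"               # LIKE ignores ASCII case
)
full_name, hire_year = conn.execute(syntax_sql).fetchone()
print(full_name, hire_year)
```

Note that `hire_year` comes back as a string, which is why a filter on it should compare against `'2021'` rather than `2021`.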

## Common Patterns

### Counting
```
How many X? → SELECT COUNT(*) or COUNT(DISTINCT column)
```
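
To make the choice between the two counting forms concrete, here is a sketch on a hypothetical `visits` table. `COUNT(*)` counts rows, while `COUNT(DISTINCT column)` counts unique non-NULL values, so the two can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?)", [(1,), (1,), (2,), (None,)])

# "How many visits?" counts every row, including the one with a NULL user_id.
total_rows = conn.execute("SELECT COUNT(*) FROM visits AS v;").fetchone()[0]

# "How many users visited?" counts each non-NULL user once.
distinct_users = conn.execute(
    "SELECT COUNT(DISTINCT v.user_id) FROM visits AS v;"
).fetchone()[0]

print(total_rows, distinct_users)
```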

### Finding Maximum/Minimum
```
The most/highest X → ORDER BY X DESC LIMIT 1
Which X has the most Y → GROUP BY X ORDER BY COUNT(Y) DESC LIMIT 1
```

### Aggregations
```
Total → SUM(column)
Average → AVG(column)
Maximum → MAX(column)
Minimum → MIN(column)
```
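
A brief sketch of the four aggregates on a hypothetical `scores` table. One row is NULL to show that these functions skip NULL values, which affects `AVG` in particular.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (value INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(10,), (20,), (None,), (30,)])

agg_sql = (
    "SELECT SUM(s.value), AVG(s.value), MAX(s.value), MIN(s.value) "
    "FROM scores AS s;"
)
total, average, maximum, minimum = conn.execute(agg_sql).fetchone()
# The average is taken over the three non-NULL rows, not all four.
print(total, average, maximum, minimum)
```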

### Yes/No Questions
```
Is X true? → Return 1/0 or 'yes'/'no' based on evidence
```
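
One possible shape for a yes/no answer is a `CASE` expression. The sketch assumes a hypothetical `products` table and an assumed threshold of 10; whether to return `'yes'`/`'no'` or 1/0 should still follow the evidence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES ('Widget', 12.5)")

# Question (assumed): "Is the Widget priced above 10?"
yes_no_sql = (
    "SELECT CASE WHEN p.price > 10 THEN 'yes' ELSE 'no' END "
    "FROM products AS p "
    "WHERE p.name = 'Widget';"
)
answer = conn.execute(yes_no_sql).fetchone()[0]
print(answer)
```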

## Final Checklist
- [ ] Using exact table/column names from analysis?
- [ ] Following evidence literally?
- [ ] Returning ONLY requested columns?
- [ ] Values match database exactly (case-sensitive)?
- [ ] Proper JOIN syntax with aliases?
- [ ] Clean SQL with no markdown, comments, or extra text, ending with a semicolon?

Remember: Simple, precise, evidence-driven.