# SQL Generation Instructions

## Critical Output Requirements

**Your SQL must be CLEAN and EXECUTABLE:**
- NO markdown formatting (no ```, no code blocks)
- NO comments in or after SQL
- NO explanatory text
- ONLY the SQL statement
- End with semicolon, nothing more

## Core Principles for SQL Generation

### 1. Column Precision (Highest Priority)

**EXACT COLUMN RULES:**
- "What is X?" → SELECT X (never X, Y)
- "List Y" → SELECT Y (one column)
- "How many?" → SELECT COUNT(*) (not COUNT(*), name)
- "Which Z?" → SELECT Z (just the identifier)
- "Total of W" → SELECT SUM(W) (just the sum)

**NEVER ADD:**
- IDs when not requested
- Names when asking for IDs
- Counts when asking for lists
- Extra context columns
- Debugging columns

### 2. String Matching Strategy

**DEFAULT RULES:**
- Person/company/product names → Use LIKE
- Descriptions/text fields → Use LIKE
- Status/category values → Use LIKE unless the exact stored value is known, then use =
- Codes/IDs → Use = (exact match)
- Enums ('active', 'pending') → Use =

**SQLite LIKE is case-insensitive by default for ASCII characters only; = is case-sensitive**
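
The difference between `LIKE` and `=` can be checked with a small sketch. The `customers` table and its rows are hypothetical; the behaviour shown assumes SQLite's default settings (no `PRAGMA case_sensitive_like`, default `BINARY` collation).

```python
import sqlite3

# Hypothetical table with the same name stored in two different casings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, code TEXT);
    INSERT INTO customers VALUES ('Acme Corp', 'AC1'), ('ACME CORP', 'ac1');
""")

# LIKE ignores ASCII case, so both spellings match.
like_rows = conn.execute(
    "SELECT name FROM customers WHERE name LIKE 'acme corp' ORDER BY name;"
).fetchall()

# = compares byte-for-byte under the default collation, so nothing matches.
eq_rows = conn.execute(
    "SELECT name FROM customers WHERE name = 'acme corp';"
).fetchall()

# Codes are matched exactly: only the row stored as 'AC1' is returned.
code_rows = conn.execute(
    "SELECT code FROM customers WHERE code = 'AC1';"
).fetchall()

print(like_rows, eq_rows, code_rows)
```

This is why names default to `LIKE` while codes and enums use `=`: the first tolerates casing differences in the data, the second does not.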

### 3. Aggregation Patterns

**SIMPLE PATTERNS WIN:**
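- Row with the highest value → ORDER BY column DESC LIMIT 1
- Row with the lowest value → ORDER BY column ASC LIMIT 1
- Only the highest/lowest value itself → MAX(column) / MIN(column)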
- Count → COUNT(*) or COUNT(DISTINCT)
- Average → AVG(column)
- Total → SUM(column)

**AVOID:**
- Nested subqueries when ORDER BY works
- Complex GROUP BY when not needed
- Window functions for simple cases
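
A minimal sketch of the patterns above, using a hypothetical `products` table invented for this example. It contrasts the simple `ORDER BY ... LIMIT 1` form with the nested-subquery form and with a plain `MAX`.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (name TEXT, price REAL);
    INSERT INTO products VALUES ('pen', 2.5), ('book', 12.0), ('lamp', 30.0);
""")

# "Which product is the most expensive?" -> simple ORDER BY ... LIMIT 1.
top_name = conn.execute(
    "SELECT name FROM products ORDER BY price DESC LIMIT 1;"
).fetchall()

# Same answer via a nested subquery: more moving parts for no gain here.
nested = conn.execute(
    "SELECT name FROM products "
    "WHERE price = (SELECT MAX(price) FROM products);"
).fetchall()

# "What is the highest price?" -> the value itself, so MAX is enough.
max_price = conn.execute("SELECT MAX(price) FROM products;").fetchall()

print(top_name, nested, max_price)
```

The two forms are not always interchangeable: when several rows tie for the maximum, `LIMIT 1` returns one of them while the subquery form returns all of them. SQLite also sorts NULLs first in ascending order, so an `ORDER BY column ASC LIMIT 1` query may need a `WHERE column IS NOT NULL` filter.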

### 4. Join Best Practices

**ALWAYS:**
- Use table aliases (t1, t2, etc.)
- Qualify ALL columns (t1.col not just col)
- Use explicit JOIN syntax
- Put join conditions in ON clause

**EXAMPLE:**
```sql
SELECT t1.name
FROM users t1
JOIN orders t2 ON t1.id = t2.user_id;
```
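
To show why every column should be qualified, the sketch below runs the join above against hypothetical `users` and `orders` tables that both have an `id` column (the schema and data are assumptions for illustration), then repeats it with an unqualified column.

```python
import sqlite3

# Hypothetical schema: both tables have an "id" column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1), (11, 1);
""")

# Fully qualified columns: one output row per matching order.
qualified = conn.execute(
    "SELECT t1.name FROM users t1 JOIN orders t2 ON t1.id = t2.user_id;"
).fetchall()

# Unqualified "id" exists in both tables, so SQLite rejects the query.
try:
    conn.execute(
        "SELECT id FROM users t1 JOIN orders t2 ON t1.id = t2.user_id;"
    )
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(qualified)
print(error)
```

Note that the qualified query returns `Ada` twice, once per order. Whether to add `DISTINCT` depends on what the question asks for.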

### 5. Evidence Interpretation

**WHEN EVIDENCE PROVIDED:**
- Use exact column names from evidence
- Apply formulas exactly as given
- Use literal values (even if they seem wrong)
- Evidence overrides optimization

**EXAMPLES:**
- "refers to code = 'ABC'" → Use code = 'ABC'
- "calculated as X * Y" → Use X * Y
- "where status is 1" → Use status = 1
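
As a sketch of how evidence maps onto a query, assume a hypothetical `sales` table and a made-up evidence string: "revenue is calculated as unit_price * quantity; completed refers to status = 1". The formula and the literal are copied into the SQL unchanged.

```python
import sqlite3

# Hypothetical table and data; the evidence text is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        item TEXT, unit_price REAL, quantity INTEGER, status INTEGER
    );
    INSERT INTO sales VALUES ('pen', 2.0, 10, 1), ('book', 12.0, 1, 0);
""")

# Question: "What is the revenue of completed sales?"
# The evidence formula (unit_price * quantity) and literal (status = 1)
# are used exactly as given, with no renaming or reinterpretation.
rows = conn.execute(
    "SELECT unit_price * quantity FROM sales WHERE status = 1;"
).fetchall()

print(rows)
```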

## SQLite-Specific Functions

**CORRECT SQLITE SYNTAX:**
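- Date extraction: STRFTIME('%Y', date) not YEAR(date); the result is TEXT, so compare with '2021', not 2021
- String concat: col1 || col2 not CONCAT(col1, col2) (CONCAT exists only in SQLite 3.44+)
- String length: LENGTH(str) not LEN(str)
- Null handling: COALESCE(col, default) or IFNULL(col, default), not ISNULL(col, default)
- Limit rows: LIMIT n not TOP n
- Boolean: 1/0 not TRUE/FALSE (the TRUE/FALSE keywords exist only in SQLite 3.23+)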

**DATE PATTERNS:**
- Year: STRFTIME('%Y', date_column)
- Month: STRFTIME('%m', date_column)
- Day: STRFTIME('%d', date_column)
- Date math: DATE(date_column, '+1 day')
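
The functions listed above can be exercised directly with literal values, as in the sketch below (no tables are needed; the sample date is arbitrary). It also shows one pitfall: `STRFTIME` returns text, so comparing its result with an integer literal does not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row exercising the SQLite spellings of common functions.
row = conn.execute("""
    SELECT STRFTIME('%Y', '2021-03-15'),
           STRFTIME('%m', '2021-03-15'),
           DATE('2021-03-15', '+1 day'),
           'ab' || 'cd',
           LENGTH('hello'),
           COALESCE(NULL, 'n/a');
""").fetchone()

# STRFTIME yields TEXT: comparing against the string '2021' is true (1) ...
text_cmp = conn.execute(
    "SELECT STRFTIME('%Y', '2021-03-15') = '2021';"
).fetchone()[0]

# ... while comparing against the integer 2021 is false (0).
int_cmp = conn.execute(
    "SELECT STRFTIME('%Y', '2021-03-15') = 2021;"
).fetchone()[0]

print(row, text_cmp, int_cmp)
```

When a year must be compared numerically, `CAST(STRFTIME('%Y', date_column) AS INTEGER)` is one way to get an integer.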

## Common Failure Patterns

**TOP 5 FAILURES TO AVOID:**

1. **Extra Columns** - Return ONLY requested columns
2. **Wrong String Match** - Default to LIKE for names
3. **Complex Query** - Choose simple approach
4. **Wrong Dialect** - Use SQLite syntax
5. **Missing Aliases** - Always qualify columns in joins

## Pre-Query Checklist

Before generating SQL:
- [ ] Count required columns
- [ ] Identify return type (single value, list, aggregate)
- [ ] Choose string matching strategy
- [ ] Select simplest approach
- [ ] Verify SQLite syntax

## Query Patterns by Question Type

### Single Value Questions
- "What is the name..." → SELECT name
- "How many..." → SELECT COUNT(*)
- "What percentage..." → SELECT calculation
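
For percentage questions, the "calculation" usually needs a cast, because SQLite divides integers with truncation. The sketch below assumes a hypothetical `tasks` table made up for this example and compares a truncating query with a cast-based one.

```python
import sqlite3

# Hypothetical table: 1 of 4 tasks is done, so the expected answer is 25%.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO tasks (status) VALUES ('done'), ('open'), ('open'), ('open');
""")

# Integer division first: 1 / 4 truncates to 0, so the result is 0.
truncated = conn.execute("""
    SELECT SUM(CASE WHEN status = 'done' THEN 1 ELSE 0 END) / COUNT(*) * 100
    FROM tasks;
""").fetchone()[0]

# Casting to REAL before dividing keeps the fraction.
pct = conn.execute("""
    SELECT CAST(SUM(CASE WHEN status = 'done' THEN 1 ELSE 0 END) AS REAL)
           * 100 / COUNT(*)
    FROM tasks;
""").fetchone()[0]

print(truncated, pct)
```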

### List Questions
- "List all..." → SELECT column
- "Show the..." → SELECT only the columns named in the question
- "Which..." → SELECT identifier

### Aggregation Questions
- "Total..." → SELECT SUM(column)
- "Average..." → SELECT AVG(column)
- "Maximum..." → SELECT MAX(column)

### Comparison Questions
- "Highest..." → ORDER BY metric DESC LIMIT 1
- "Most recent..." → ORDER BY date DESC LIMIT 1
- "Top N..." → ORDER BY metric DESC LIMIT N

## Final Reminders

**SUCCESS FORMULA:**
1. Read question carefully
2. Count required columns
3. Choose simplest approach
4. Use LIKE for names and free text, = for codes and enums
5. Return EXACTLY what's asked

**THE GOLDEN RULE:**
If the question asks for one thing, return exactly one column.
No more, no less.

**REMEMBER:**
Clean SQL. Simple queries. Exact columns. SQLite syntax.