# LinearizeLLM Framework Workflow Documentation

## Architecture

The framework follows a modular architecture with the following main components:

```
LinearizeLLM Framework
├── Core Components (src/core/)
│   ├── LinearizeLLMWorkflow - Main orchestration class
│   ├── NonLinearPatternExtractor - Pattern detection
│   ├── CodeConverter - Code generation
│   └── OptimizationExecutor - Execution engine
├── Agents (src/agents/)
│   ├── MIPCoordinator - Pattern reformulation coordination
│   └── PatternReformulationAgents - Specialized reformulation agents
├── Utils (src/utils/)
│   ├── LLMModelManager - Multi-provider LLM support
│   └── APIKeyManager - API key management
└── Scripts (src/scripts/)
    ├── run_linearizellm_data.py - Main execution script
    └── browse_results.py - Results analysis
```

## Workflow Steps

The LinearizeLLM workflow consists of 6 main steps:

### Step 1: LaTeX Model Loading and Parameter Extraction
**Component**: `LinearizeLLMWorkflow.extract_latex_model_from_linearizellm_tex()`

- **Input**: LaTeX file (`.tex`) and parameters file (`parameters.json`)
- **Process**: 
  - Extracts mathematical model from LaTeX format
  - Loads problem parameters (scalars, vectors, matrices)
  - Handles dimensionality information
- **Output**: Structured model and parameter dictionary
- **Files Saved**: `latex_model.tex`, `extracted_parameters.json`
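
The grouping of parameters by dimensionality can be sketched as below. This is a minimal illustration rather than the framework's own loader: the function names `classify_parameters` and `load_parameters` are hypothetical, and the sketch assumes `parameters.json` is a flat JSON object mapping names to numbers, lists, or nested lists.

```python
import json
from pathlib import Path


def classify_parameters(params: dict) -> dict:
    """Group parameters by dimensionality: scalar, vector, or matrix."""
    groups = {"scalars": {}, "vectors": {}, "matrices": {}}
    for name, value in params.items():
        if isinstance(value, list) and value and all(isinstance(row, list) for row in value):
            groups["matrices"][name] = value   # list of lists
        elif isinstance(value, list):
            groups["vectors"][name] = value    # flat (or empty) list
        else:
            groups["scalars"][name] = value    # single number
    return groups


def load_parameters(path: str) -> dict:
    """Read a parameters file (assumed to be a flat JSON object) and classify it."""
    return classify_parameters(json.loads(Path(path).read_text(encoding="utf-8")))
```

Separating the three groups up front makes it easier to pass the right index sets to the later code generation step.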

### Step 2: Nonlinear Pattern Detection
**Component**: `NonLinearPatternExtractor`

- **Input**: LaTeX model text
- **Process**:
  - Uses LLM to identify nonlinear patterns
  - Detects bilinear terms (e.g., `x * y`)
  - Identifies min/max functions (e.g., `min(x, y)`, `max(x, y)`)
  - Finds absolute value expressions (e.g., `|x|`)
  - Locates quotient expressions (e.g., `x/y`)
  - Flags monotone transformations (e.g., `log(x)`, `exp(x)`)
- **Output**: Structured pattern analysis with pattern types and locations
- **Files Saved**: `extracted_patterns.txt`

### Step 3: Pattern-based MIP Linearization
**Component**: `MIPCoordinator` + `PatternReformulationAgents`

- **Input**: LaTeX model + detected patterns + parameters
- **Process**:
  - **Bilinear Patterns**: Linearizes products of variables
    - Introduces a new variable for each bilinear term
    - Adds linear constraints that tie the new variable to the original factors
  - **Min/Max Patterns**: Linearizes `min` and `max` expressions
    - Applies a technique suited to the expression
    - Adds constraints to enforce mathematical equivalence
  - **Absolute Value**: Linearizes `|x|` expressions
    - Uses a method that handles both the positive and negative cases
    - Adds constraints for the absolute value representation
  - **Quotient Patterns**: Linearizes ratio expressions
    - Introduces new variables for ratios
    - Adds constraints to maintain mathematical equivalence
  - **Monotone Transformations**: Linearizes monotone functions
    - Covers logarithmic, exponential, and similar functions
    - Adds constraints to maintain mathematical equivalence
- **Output**: Linearized model with reformulation markings
- **Files Saved**: `linearized_model.tex`

### Step 4: Code Generation
**Component**: `CodeConverter`

- **Input**: Linearized model + parameters + problem ID
- **Process**:
  - Generates Gurobi Python code
  - Incorporates extracted parameters
  - Creates variable declarations
  - Implements constraints and objective function
  - Adds solver configuration and execution logic
- **Output**: Executable Gurobi Python script
- **Files Saved**: `gurobi_code.py`

### Step 5: Code Validation
**Component**: `CodeConverter.validate_code()`

- **Input**: Generated code + original model + parameters
- **Process**:
  - Validates code structure and syntax
  - Checks parameter integration
  - Verifies constraint representation
  - Ensures objective function correctness
- **Output**: Validation report with issues and recommendations
- **Files Saved**: `code_validation.txt`
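
A cheap local check for structure and syntax can complement the validation described above. The sketch below is an assumption about how such a pre-check might look, not the behavior of `CodeConverter.validate_code()`; the helper name `check_generated_code` is hypothetical. It parses the source without executing it and confirms that `gurobipy` is imported.

```python
import ast


def check_generated_code(source: str) -> list[str]:
    """Return a list of issues found in generated code (empty if none)."""
    try:
        tree = ast.parse(source)  # parses only; nothing is executed
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    # Collect the top-level package of every import statement.
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])

    issues = []
    if "gurobipy" not in imported:
        issues.append("gurobipy is never imported")
    return issues
```

Running a check like this first avoids spending an LLM call on code that cannot even be parsed.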

### Step 6: Optimization Execution
**Component**: `OptimizationExecutor`

- **Input**: Generated Gurobi code
- **Process**:
  - Executes the optimization model
  - Captures solver output and results
  - Handles errors and timeouts
  - Extracts solution values and statistics
- **Output**: Optimization results (objective value, variables, status)
- **Files Saved**: `optimization_results.json`
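
Running generated code in a child process is one way to capture output and enforce a timeout. The following is a minimal sketch under that assumption; `run_generated_script` and the shape of the returned dictionary are hypothetical and do not describe `OptimizationExecutor` itself.

```python
import subprocess
import sys


def run_generated_script(path: str, timeout_s: float = 60.0) -> dict:
    """Run a script in a child interpreter and capture its outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, path],   # same interpreter as the caller
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"success": False, "error": f"timed out after {timeout_s} s",
                "stdout": "", "stderr": ""}
    return {
        "success": proc.returncode == 0,
        "error": None if proc.returncode == 0 else f"exit code {proc.returncode}",
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Isolating the solver run this way means a crash or hang in the generated script cannot take down the calling workflow.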

## Data Flow

```
LaTeX File (.tex) + Parameters (.json)
           ↓
    [Step 1] Model Extraction
           ↓
    Structured Model + Parameters
           ↓
    [Step 2] Pattern Detection
           ↓
    Detected Nonlinear Patterns
           ↓
    [Step 3] Linearization (if needed)
           ↓
    Linearized Model
           ↓
    [Step 4] Code Generation
           ↓
    Gurobi Python Code
           ↓
    [Step 5] Code Validation
           ↓
    [Step 6] Optimization Execution
           ↓
    Optimization Results
```

## Supported Nonlinear Patterns

### 1. Bilinear Terms
- **Pattern**: `x * y` where x and y are variables
- **Linearization**: Appropriate linearization techniques
- **New Variables**: `w = x * y`
- **Constraints**: Linearization constraints

### 2. Min/Max Functions
- **Pattern**: `min(x, y)`, `max(x, y)`
- **Linearization**: Appropriate linearization techniques
- **New Variables**: As required by chosen technique
- **Constraints**: Linearization constraints

### 3. Absolute Value
- **Pattern**: `|x|`
- **Linearization**: Appropriate linearization techniques
- **New Variables**: As required by chosen technique
- **Constraints**: Linearization constraints

### 4. Quotient Expressions
- **Pattern**: `x/y`
- **Linearization**: Appropriate linearization techniques
- **New Variables**: `r = x/y`
- **Constraints**: Linearization constraints

### 5. Monotone Transformations
- **Pattern**: `log(x)`, `exp(x)`, etc. 
- **Linearization**: Appropriate linearization techniques
- **New Variables**: As required by chosen technique
- **Constraints**: Linearization constraints
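
One property that makes monotone transformations tractable is that a strictly increasing function applied to the objective does not move the minimizer, so `min log(f(x))` can be replaced by `min f(x)` when `f(x) > 0`. The sketch below checks this on a sampled grid. The objective `f` is a made-up example, and the source does not confirm that the framework uses this particular argument.

```python
import math


def argmin(values):
    """Index of the smallest element."""
    return min(range(len(values)), key=values.__getitem__)


# Hypothetical positive objective f(x) = |x - 2| + 1, sampled on [0, 5].
grid = [i / 10 for i in range(0, 51)]
f = [abs(x - 2) + 1 for x in grid]
log_f = [math.log(v) for v in f]

# log is strictly increasing, so both objectives are minimized at the same point.
assert argmin(f) == argmin(log_f)
```

The same reasoning applies to constraints: for `x > 0`, `log(x) >= c` holds exactly when `x >= exp(c)`, which is linear in `x`.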

## LLM Integration

The framework supports multiple LLM providers:

### Supported Providers
- **OpenAI**: GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5-turbo, o3
- **Google**: Gemini-2.5-Pro, Gemini-2.5-Flash, Gemini-1.5-Pro, Gemini-1.5-Flash

### Usage in Components
- **Pattern Detection**: Uses LLM to identify nonlinear patterns
- **Reformulation**: Uses specialized agents for each pattern type
- **Code Generation**: Uses LLM to generate Gurobi code
- **Validation**: Uses LLM to validate generated code

## File Structure

### Input Structure
```
data/LinearizeLLM_data/instances_linearizellm/
├── problem_1/
│   ├── problem_1.tex
│   └── parameters.json
├── problem_2/
│   ├── problem_2.tex
│   └── parameters.json
└── ...
```

### Output Structure
```
data/results/
├── problem_problem_1/
│   └── run_20241201_143022/
│       ├── models/
│       │   ├── latex_model.tex
│       │   ├── linearized_model.tex
│       │   └── extracted_parameters.json
│       ├── code/
│       │   └── gurobi_code.py
│       ├── steps/
│       │   ├── extracted_patterns.txt
│       │   └── code_validation.txt
│       └── optimization_results.json
└── ...
```
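
Because run directories carry a timestamp in their name, the most recent run for a problem can be found by sorting the names. The helper below is a small sketch assuming the `run_YYYYMMDD_HHMMSS` naming shown above; `latest_run_dir` is a hypothetical function, not part of `browse_results.py`.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(results_root: str, problem_dir: str) -> Optional[Path]:
    """Return the newest run_* directory for a problem, or None if there is none."""
    base = Path(results_root) / problem_dir
    if not base.is_dir():
        return None
    # run_YYYYMMDD_HHMMSS names sort chronologically as plain strings.
    runs = sorted(p for p in base.glob("run_*") if p.is_dir())
    return runs[-1] if runs else None
```

For example, `latest_run_dir("data/results", "problem_problem_1")` would return the path of the latest run folder for that problem, if one exists.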

## Usage Examples

### Command Line Usage

```bash
# The script lives in src/scripts/ (see Architecture); adjust the path to your working directory.
# Process a specific problem
python run_linearizellm_data.py --file blend_problem.tex

# Process with specific LLM model
python run_linearizellm_data.py --file blend_problem.tex --model gemini-2.5-flash

# Process with custom configuration
python run_linearizellm_data.py --file blend_problem.tex --model-config config.json

# Process multiple problems
python run_linearizellm_data.py --files "blend_problem.tex,diet_problem.tex"

# Process first N problems
python run_linearizellm_data.py --first 5
```

### Programmatic Usage

```python
from src.core.agent_pipeline import LinearizeLLMWorkflow

# Initialize workflow
workflow = LinearizeLLMWorkflow(
    tex_path="problem.tex",
    problem_id="my_problem",
    llm_model="gpt-4o",
    save_results=True
)

# Execute workflow
results = workflow.run(verbose=True)

# Access results
print(f"Success: {results['optimization_results']['success']}")
print(f"Objective Value: {results['optimization_results']['optimization_results']['objective_value']}")
```
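
When reading results, it can help to guard against missing keys. The sketch below assumes the nested dictionary layout used in the example above and further assumes that some keys may be absent after a failed run; `summarize` is a hypothetical helper.

```python
def summarize(results: dict) -> str:
    """One-line summary of a workflow result dictionary."""
    outer = results.get("optimization_results") or {}
    inner = outer.get("optimization_results") or {}
    if not outer.get("success"):
        return "failed"
    return f"objective = {inner.get('objective_value')}"
```

Calling `summarize(results)` on the dictionary returned by `workflow.run()` then yields a short status string without raising `KeyError` on incomplete runs.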

## Error Handling

The framework handles errors at each stage of the workflow:

1. **API Key Management**: Automatic detection and prompting for API keys
2. **Model Validation**: Checks for valid LLM configurations
3. **Pattern Processing**: Graceful handling of invalid patterns
4. **Code Generation**: Validation of generated code structure
5. **Optimization Execution**: Error capture and logging

## Performance Considerations

- **LLM Calls**: Each step may require multiple LLM API calls
- **Pattern Processing**: Linearization complexity depends on pattern count
- **Code Generation**: Large models may require chunking
- **Optimization**: Execution time depends on problem complexity

## Troubleshooting

### Common Issues
1. **Missing API Keys**: Set the provider's API key as an environment variable or pass it with `--api-key`
2. **Invalid Patterns**: Check pattern detection output
3. **Code Generation Failures**: Review validation results
4. **Optimization Errors**: Check solver logs and constraints
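
A quick pre-flight check for missing API keys might look like the sketch below. The environment variable names are assumptions based on common provider conventions and should be confirmed against `APIKeyManager`; `missing_api_keys` is a hypothetical helper.

```python
import os

# Assumed variable names; confirm them against APIKeyManager before relying on this.
PROVIDER_ENV_VARS = {"openai": "OPENAI_API_KEY", "google": "GOOGLE_API_KEY"}


def missing_api_keys(providers, environ=os.environ):
    """Return the environment variable names that are unset or empty."""
    return [
        PROVIDER_ENV_VARS[p]
        for p in providers
        if not environ.get(PROVIDER_ENV_VARS[p])
    ]
```

Running `missing_api_keys(["openai", "google"])` before starting a batch reports the problem up front instead of partway through a run.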

### Debug Mode
Enable verbose output to see detailed step-by-step progress:
```python
workflow.run(verbose=True)
```

## Future Enhancements

- Support for additional nonlinear patterns
- Integration with other optimization solvers
- Advanced code optimization techniques
- Parallel processing for multiple problems
- Web-based interface for problem submission 