# Configuration for lu_factorization task - Optimized Gemini Flash 2.5
# Achieved 1.64x AlgoTune Score with these settings

# General settings
max_iterations: 100
checkpoint_interval: 10
log_level: "INFO"
random_seed: 42
diff_based_evolution: true  # Best for Gemini models
max_code_length: 20000  # Increased from 10000 for deeper exploration

# LLM Configuration
llm:
  api_base: "https://openrouter.ai/api/v1"
  models:
    - name: "google/gemini-2.5-flash"
      weight: 0.8
    - name: "google/gemini-2.5-pro"
      weight: 0.2
  
  temperature: 0.4  # Optimal (better than 0.2, 0.6, 0.8)
  max_tokens: 128000  # Increased from 16000 for much richer context
  timeout: 150
  retries: 3

# Prompt Configuration - Optimal settings
prompt:
  system_message: |
    SETTING:
    You're an autonomous programmer tasked with solving a specific problem. You are to use the commands defined below to accomplish this task. Every message you send incurs a cost—you will be informed of your usage and remaining budget by the system.
    You will be evaluated based on the best-performing piece of code you produce, even if the final code doesn't work or compile (as long as it worked at some point and achieved a score, you will be eligible).
    Apart from the default Python packages, you have access to the following additional packages:
     - cryptography
     - cvxpy
     - cython
     - dace
     - dask
     - diffrax
     - ecos
     - faiss-cpu
     - hdbscan
     - highspy
     - jax
     - networkx
     - numba
     - numpy
     - ortools
     - pandas
     - pot
     - psutil
     - pulp
     - pyomo
     - python-sat
     - pythran
     - scikit-learn
     - scipy
     - sympy
     - torch
    Your primary objective is to optimize the `solve` function to run as as fast as possible, while returning the optimal solution.
    You will receive better scores the quicker your solution runs, and you will be penalized for exceeding the time limit or returning non-optimal solutions.

    Below you find the description of the task you will have to solve. Read it carefully and understand what the problem is and what your solver should do.

    You are an expert programmer specializing in matrix_operations algorithms. Your task is to improve the lu_factorization algorithm implementation with baseline comparison.

    The problem description is:
    LUFactorization Task:

    Given a square matrix A, the task is to compute its LU factorization.
    The LU factorization decomposes A as:

        A = P · L · U

    where P is a permutation matrix, L is a lower triangular matrix with ones on the diagonal, and U is an upper triangular matrix.

    Focus on improving the solve method to correctly handle the input format and produce valid solutions efficiently. Your solution will be compared against the reference AlgoTune baseline implementation to measure speedup and correctness.

    PERFORMANCE OPTIMIZATION OPPORTUNITIES:
    You have access to high-performance libraries that can provide significant speedups:
    
    • **JAX** - JIT compilation for numerical computations
    
    • **Numba** - Alternative JIT compilation, often simpler to use
    
    • **scipy optimizations** - Direct BLAS/LAPACK access and specialized algorithms
      Many scipy functions have optimized implementations worth exploring
    
    • **Vectorization** - Look for opportunities to replace loops with array operations
    
    EXPLORATION STRATEGY:
    1. Profile to identify bottlenecks first
    2. Consider multiple optimization approaches for the same problem
    3. Try both library-specific optimizations and algorithmic improvements
    4. Test different numerical libraries to find the best fit
    
  num_top_programs: 5     # Increased from 3-5 for richer learning context
  num_diverse_programs: 5  # Increased from 2 for more diverse exploration
  include_artifacts: true  # +20.7% improvement

# Database Configuration
database:
  population_size: 1000
  archive_size: 100
  num_islands: 4
  
  # Selection parameters - Optimal ratios
  elite_selection_ratio: 0.1   # 10% elite
  exploration_ratio: 0.3       # 30% exploration
  exploitation_ratio: 0.6      # 60% exploitation
  
  # NO feature_dimensions - let it use defaults based on evaluator metrics
  feature_bins: 10
  
  # Migration parameters
  migration_interval: 20
  migration_rate: 0.1  # Better than 0.2

# Evaluator Configuration
evaluator:
  timeout: 200
  max_retries: 3
  
  # Cascade evaluation
  cascade_evaluation: true
  cascade_thresholds: [0.5, 0.8]
  
  # Parallel evaluations
  parallel_evaluations: 4

# AlgoTune task-specific configuration
algotune:
  num_trials: 5
  data_size: 25
  timeout: 300
  num_runs: 3
  warmup_runs: 1
