
# LLM-as-a-Judge for SR_solver Evaluation
measure:
  system: |
    You are an expert reviewer for machine learning research. You will be shown a problem statement and a proposed solution. Your task is to evaluate, using a structured rubric, whether the proposed solution solves the problem.

  user: |
    **Evaluation Task: Solution Success Rate**
    
    You will be shown a problem statement and a proposed solution. Your task is to assess how directly and completely the proposed solution solves the problem described in the problem statement.
    
    **Input:**
    Problem Statement:
    {gt}
    
    Proposed Solution:
    {pred}
    
    Task Rubric:
    * 5 (Direct & Complete Solution): The Proposed Solution directly and comprehensively solves the problem statement. All aspects of the problem are addressed effectively.
    * 4 (Strong Alignment & Near-Complete Solution): The Proposed Solution strongly aligns with the problem statement and provides a nearly complete solution. Minor aspects might be implicitly addressed or require minimal further elaboration.
    * 3 (Moderate Alignment & Partial Solution): The Proposed Solution addresses significant aspects of the problem statement but leaves some key parts unresolved or requires substantial interpretation. There are noticeable gaps.
    * 2 (Limited Alignment & Incomplete Solution): The Proposed Solution addresses only peripheral aspects of the problem or offers a very incomplete solution. Core challenges may be missed or misinterpreted.
    * 1 (Minimal Alignment): The Proposed Solution has minimal relevance to the problem statement. It might address a tangentially related issue or misinterpret the core problem entirely.
    * 0 (No Alignment): The Proposed Solution is entirely unrelated to the problem statement or proposes a solution to a different problem.

    Important Guidelines:
    * Focus on the directness and completeness of the solution in addressing the stated problem.
    * Evaluate whether the solution effectively resolves the core challenges and requirements outlined in the problem statement.
    * Distinguish between solutions that genuinely solve the problem versus those that are merely related or address only parts of it.
    * Ensure your summaryStatement directly supports the assigned relevanceScore.
    * Maintain objectivity and avoid bias towards any specific solution or problem statement.
    
    **Output Format:**
    {{
      "problem_solution_alignment": {{
        "relevanceScore": [0-5 integer as per the "Task Rubric" above],
        "confidenceLevel": [0-10 integer representing your confidence in this assessment],
        "summaryStatement": [one sentence explaining why the proposed solution does or doesn't align with the problem statement]
      }}
    }}

