{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "436db9d4",
   "metadata": {},
   "source": [
    "## GRPO Math500"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "226454e0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4\n"
     ]
    }
   ],
   "source": [
    "from datasets import load_dataset\n",
    "import json\n",
    "import argparse\n",
    "import re\n",
    "import os\n",
    "from collections import defaultdict\n",
    "from typing import List\n",
    "\n",
    "data_name = \"grpo_math500\"\n",
    "\n",
    "ds_grpo = []\n",
    "for i in range(2, 6):\n",
    "    data_path = f\"/home/USER/PRM_filter/eval/eval_prm/data/{data_name}_{i}.jsonl\"\n",
    "    ds0 = load_dataset(\"json\", data_files=data_path, split=\"train\")\n",
    "    ds_grpo.append(ds0)\n",
    "\n",
    "print(len(ds_grpo))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "c23ebf37",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'problem': 'Five points $A$, $B$, $C$, $D$, and $O$ lie on a flat field.  $A$ is directly north of $O$, $B$ is directly west of $O$, $C$ is directly south of $O$, and $D$ is directly east of $O$. The  distance between $C$ and $D$ is 140 m.  A hot-air balloon is positioned in the air at $H$ directly above $O$. The balloon is held in place by four ropes $HA$, $HB$, $HC$, and $HD$.  Rope $HC$ has length 150 m and rope $HD$ has length 130 m. [asy]\\nsize(250);\\npair A, B, C, D, O, H, W, X, Y, Z;\\nO=(0,0);\\nA=(1,1);\\nD=(1.5,-.3);\\nB=(-1.5,.3);\\nC=(-1,-1);\\nH=(0,2.5);\\nW=(5/3)*(A+D);\\nX=(5/3)*(A+B);\\nY=(-1)*(W);\\nZ=(-1)*(X);\\ndraw(W--X--Y--Z--W);\\ndraw(A--C);\\ndraw(B--D);\\ndraw(O--H, linewidth(1));\\ndraw(A--H, dashed);\\ndraw(B--H, dashed);\\ndraw(C--H, dashed);\\ndraw(D--H, dashed);\\ndot(A);\\ndot(B);\\ndot(C);\\ndot(D);\\ndot(O);\\ndot(H);\\nlabel(\"A\", A, NE);\\nlabel(\"B\", B, SW);\\nlabel(\"C\", C, SE);\\nlabel(\"D\", D, NE);\\nlabel(\"O\", O, SE);\\nlabel(\"H\", H, NW);\\n[/asy]\\n\\nTo reduce the total length of rope used, rope $HC$ and rope $HD$ are to be replaced by a single rope $HP$ where $P$ is a point on the straight line between $C$ and $D$. (The balloon remains at the same position $H$ above $O$ as described above.)  Determine the greatest length of rope that can be saved. Let\\'s think step by step and output the final answer within \\\\boxed{}.',\n",
       " 'steps': 7,\n",
       " 'solution': \"<step>Given the problem, we need to find the greatest length of rope that can be saved by replacing ropes $HC$ and $HD$ with a single rope $HP$, where $P$ is a point on the straight line between $C$ and $D$. The distance between $C$ and $D$ is 140 m. The lengths of ropes $HC$ and $HD$ are 150 m and 130 m, respectively.</step>\\n<step>First, let's denote the coordinates of the points:\\n- $O = (0, 0)$\\n- $A = (0, 1)$ (since $A$ is directly north of $O$)\\n- $B = (-1.5, 0.3)$ (since $B$ is directly west of $O$)\\n- $C = (0, -1)$ (since $C$ is directly south of $O$)\\n- $D = (1.5, 0.3)$ (since $D$ is directly east of $O$)\\n- $H = (0, h)$ (since $H$ is directly above $O$)</step>\\n<step>The distance $HC$ is given by:\\n\\\\[ HC = \\\\sqrt{(0 - 0)^2 + (h - (-1))^2} = \\\\sqrt{(h + 1)^2} = |h + 1| = 150 \\\\]\\nSo, $h + 1 = 150$ or $h + 1 = -150$. Since $h$ is a height above $O$, we take the positive value:\\n\\\\[ h + 1 = 150 \\\\implies h = 149 \\\\]</step>\\n<step>The distance $HD$ is given by:\\n\\\\[ HD = \\\\sqrt{(0 - 1.5)^2 + (h - 0.3)^2} = \\\\sqrt{(1.5)^2 + (149 - 0.3)^2} = \\\\sqrt{2.25 + 148.7^2} = \\\\sqrt{2.25 + 22111.69} = \\\\sqrt{22113.94} = 130 \\\\]\\nSo, we need to check if this value of $h$ satisfies the equation:\\n\\\\[ \\\\sqrt{(1.5)^2 + (149 - 0.3)^2} = 130 \\\\]\\n\\\\[ \\\\sqrt{2.25 + 148.7^2} = 130 \\\\]\\n\\\\[ \\\\sqrt{2.25 + 22111.69} = 130 \\\\]\\n\\\\[ \\\\sqrt{22113.94} = 130 \\\\]\\nThis is approximately true, so $h = 149$ is correct.</step>\\n<step>Now, we need to find the point $P$ on the line segment $CD$ that minimizes the total length of rope $HP$. 
The coordinates of $P$ can be parameterized as:\\n\\\\[ P = (t \\\\cdot 1.5 + (1-t) \\\\cdot (-1.5), t \\\\cdot 0.3 + (1-t) \\\\cdot (-0.3)) = (3t - 1.5, 0.6t - 0.3) \\\\]\\nThe distance $HP$ is given by:\\n\\\\[ HP = \\\\sqrt{(0 - (3t - 1.5))^2 + (149 - (0.6t - 0.3))^2} = \\\\sqrt{(3t - 1.5)^2 + (150 - 0.6t)^2} \\\\]\\nTo minimize the total length of rope $HP$, we need to minimize the function:\\n\\\\[ f(t) = \\\\sqrt{(3t - 1.5)^2 + (150 - 0.6t)^2} \\\\]\\nWe can find the critical points by taking the derivative and setting it to zero:\\n\\\\[ f'(t) = \\\\frac{d}{dt} \\\\left( \\\\sqrt{(3t - 1.5)^2 + (150 - 0.6t)^2} \\\\right) = 0 \\\\]\\n\\\\[ f'(t) = \\\\frac{6(3t - 1.5) + 0.6(150 - 0.6t)}{\\\\sqrt{(3t - 1.5)^2 + (150 - 0.6t)^2}} = 0 \\\\]\\n\\\\[ 18t - 9 + 90 - 0.36t = 0 \\\\]\\n\\\\[ 17.64t + 81 = 0 \\\\]\\n\\\\[ 17.64t = -81 \\\\]\\n\\\\[ t = \\\\frac{-81}{17.64} \\\\approx -4.6 \\\\]\\nSince $t$ must be between 0 and 1, we need to check the endpoints $t=0$ and $t=1$:\\n- When $t=0$, $P = (-1.5, -0.3)$ and $HP = \\\\sqrt{(0 - (-1.5))^2 + (149 - (-0.3))^2} = \\\\sqrt{1.5^2 + 149.3^2} = \\\\sqrt{2.25 + 22290.49} = \\\\sqrt{22292.74} \\\\approx 149.3$\\n- When $t=1$, $P = (1.5, 0.3)$ and $HP = \\\\sqrt{(0 - 1.5)^2 + (149 - 0.3)^2} = \\\\sqrt{1.5^2 + 148.7^2} = \\\\sqrt{2.25 + 22111.69} = \\\\sqrt{22113.94} \\\\approx 148.7$</step>\\n<step>The minimum length of $HP$ is approximately 148.7 m. The total length of ropes $HC$ and $HD$ is $150 + 130 = 280$ m. The length of rope $HP$ is approximately 148.7 m. So, the greatest length of rope that can be saved is:\\n\\\\[ 280 - 148.7 = 131.3 \\\\]</step>\\n<step>Therefore, the greatest length of rope that can be saved is \\\\(\\\\boxed{131}\\\\).</step>\",\n",
       " 'score': 0.0,\n",
       " 'judge': ['<step>Step 1 Analysis: This step correctly identifies the problem, stating that we need to find the greatest length of rope saved by replacing ropes HC and HD with a single rope HP. The step accurately notes the given information: the distance between C and D is 140 m, and the lengths of ropes HC and HD are 150 m and 130 m respectively. This is a proper setup for the problem. \\\\boxed{correct}</step>\\n\\n<step>Step 2 Analysis: This step attempts to establish a coordinate system, but there are errors. While O at (0,0) is reasonable, the other coordinates don\\'t match the problem description. The problem states A is directly north of O, so A should be (0,y) where y>0, not (0,1). Similarly, C should be (0,-y) for some y>0. Also, B and D should be at (-x,0) and (x,0) respectively for some x>0, not at the coordinates given. These coordinates are inconsistent with the \"directly north/south/east/west\" descriptions in the problem. \\\\boxed{incorrect}</step>\\n\\n<step>Step 3 Analysis: This step calculates the height h of point H using the given length of HC (150 m). However, the calculation is based on incorrect coordinates from Step 2. If C is directly south of O, then C should be at (0,-y) for some y>0, not at (0,-1). The calculation assumes C is at (0,-1) and derives h=149, which is not justified by the given information. \\\\boxed{incorrect}</step>\\n\\n<step>Step 4 Analysis: This step attempts to verify the value of h=149 using the given length of HD (130 m). However, it\\'s based on incorrect coordinates and contains calculation errors. The distance formula is applied incorrectly, and the verification doesn\\'t properly check if the calculated value matches the given 130 m. The coordinates used for D are also inconsistent with the problem description. \\\\boxed{incorrect}</step>\\n\\n<step>Step 5 Analysis: This step attempts to find the optimal point P on line CD that minimizes HP. 
However, it\\'s based on incorrect coordinates and contains mathematical errors. The parameterization of the line CD is incorrect, and the derivative calculation contains errors. The critical point t≈-4.6 is outside the valid range [0,1], but the step correctly checks the endpoints. However, the calculations at these endpoints are based on incorrect coordinates. \\\\boxed{incorrect}</step>\\n\\n<step>Step 6 Analysis: This step calculates the rope length saved as 280 - 148.7 = 131.3 m. While the arithmetic is correct, the value 148.7 m for HP is based on incorrect coordinates and calculations from previous steps. Therefore, the conclusion that 131.3 m of rope can be saved is not reliably established. \\\\boxed{incorrect}</step>\\n\\n<step>Step 7 Analysis: The final answer of 131 m (rounded from 131.3 m) is presented. However, since this answer is derived from incorrect coordinates and flawed calculations in the previous steps, it cannot be considered correct. The solution approach had the right idea of minimizing HP where P is on line CD, but the execution contained significant errors in the coordinate system and subsequent calculations. \\\\boxed{incorrect}</step>',\n",
       "  '<step>Step 1 Analysis: The assistant correctly states the given equations from the problem: xy = 24, xz = 48, and yz = 72. This is simply restating the problem conditions accurately without any calculations yet. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant multiplies the three given equations together to get (xy)(xz)(yz) = 24 · 48 · 72. This is a valid approach. When we expand the left side, we get (xyz)² because each variable appears exactly twice: (xy)(xz)(yz) = x²y²z². This step correctly establishes that (xyz)² = 24 · 48 · 72, which will help determine the values of x, y, and z. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant calculates 24 · 48 · 72 = 82,944, which is correct. Then they find xyz = √82,944 = 288. Using this value, they correctly determine:\\nx = xyz/yz = 288/72 = 4\\ny = xyz/xz = 288/48 = 6\\nz = xyz/xy = 288/24 = 12\\n\\nThese values can be verified by substituting back into the original equations:\\nxy = 4 × 6 = 24 ✓\\nxz = 4 × 12 = 48 ✓\\nyz = 6 × 12 = 72 ✓\\n\\nFinally, they correctly calculate x + y + z = 4 + 6 + 12 = 22. The final answer of 22 is correct. \\\\boxed{CORRECT}</step>',\n",
       "  \"<step>Step 1 Analysis: In this step, the assistant correctly identifies the approach to solve the minimization problem. When dealing with constrained optimization problems like this one (minimizing a function subject to a constraint), using Lagrange multipliers or applying inequalities like Cauchy-Schwarz or Titu's lemma are valid approaches. The assistant has properly set up the problem by acknowledging that we need to minimize $\\\\frac{1}{a} + \\\\frac{1}{b} + \\\\frac{4}{c} + \\\\frac{16}{d}$ subject to $a + b + c + d = 1$ where all variables are positive. This is a correct starting point for the solution. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant correctly states Titu's lemma (which is a form of the Cauchy-Schwarz inequality). The lemma states that for positive real numbers $a_i$ and $b_i$, we have $\\\\frac{a_1^2}{b_1} + \\\\frac{a_2^2}{b_2} + \\\\cdots + \\\\frac{a_n^2}{b_n} \\\\geq \\\\frac{(a_1 + a_2 + \\\\cdots + a_n)^2}{b_1 + b_2 + \\\\cdots + b_n}$, with equality when $\\\\frac{a_1}{b_1} = \\\\frac{a_2}{b_2} = \\\\cdots = \\\\frac{a_n}{b_n}$. This is mathematically accurate and will be useful for finding the minimum value of the given expression. The equality condition is also correctly stated, which will be crucial for determining when the minimum occurs. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant applies Titu's lemma by setting $a_1=a_2=1$, $a_3=2$, $a_4=4$ and $b_1=a$, $b_2=b$, $b_3=c$, $b_4=d$. This gives $\\\\frac{1^2}{a} + \\\\frac{1^2}{b} + \\\\frac{2^2}{c} + \\\\frac{4^2}{d} \\\\geq \\\\frac{(1+1+2+4)^2}{a+b+c+d} = \\\\frac{8^2}{1} = 64$. \\n\\nThe equality condition requires $\\\\frac{1}{a} = \\\\frac{1}{b} = \\\\frac{2}{c} = \\\\frac{4}{d} = k$ for some constant $k$. 
Solving for the variables: $a = \\\\frac{1}{k}$, $b = \\\\frac{1}{k}$, $c = \\\\frac{2}{k}$, $d = \\\\frac{4}{k}$.\\n\\nUsing the constraint $a + b + c + d = 1$, we get $\\\\frac{1}{k} + \\\\frac{1}{k} + \\\\frac{2}{k} + \\\\frac{4}{k} = 1$, which simplifies to $\\\\frac{8}{k} = 1$, giving $k = 8$.\\n\\nTherefore, $a = \\\\frac{1}{8}$, $b = \\\\frac{1}{8}$, $c = \\\\frac{1}{4}$, $d = \\\\frac{1}{2}$.\\n\\nSubstituting these values into the original expression:\\n$\\\\frac{1}{a} + \\\\frac{1}{b} + \\\\frac{4}{c} + \\\\frac{16}{d} = 8 + 8 + 16 + 32 = 64$\\n\\nThe minimum value is indeed 64, and the solution is mathematically sound. \\\\boxed{CORRECT}</step>\",\n",
       "  '<step>Step 1 Analysis: The assistant correctly identifies the need to determine the exchange rate between lire and dollars based on the given information that 1200 lire equals $1.50. This is the appropriate first step to solve the problem since we need the exchange rate to convert between currencies. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The exchange rate calculation is mathematically sound. The assistant divides $1.50 by 1200 lire to find the dollars per lire exchange rate. This gives $\\\\frac{1.50}{1200}$ dollars per lire, which is the correct approach for determining the conversion factor. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant correctly explains that to find the dollar equivalent of 1,000,000 lire, we need to multiply the amount in lire by the exchange rate (dollars per lire) that was calculated in the previous step. This demonstrates proper understanding of currency conversion. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 4 Analysis: The formula shown is correct: 1,000,000 lire × $\\\\frac{1.50}{1200}$ dollars per lire. This properly sets up the mathematical operation needed to convert 1,000,000 lire to dollars using the previously calculated exchange rate. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 5 Analysis: The assistant indicates they will calculate the result step by step, which is a good approach for clarity. This statement serves as a transition to the actual calculation. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 6 Analysis: The calculation is performed correctly:\\n$\\\\frac{1,000,000 × 1.50}{1200} = \\\\frac{1,500,000}{1200} = 1250$ dollars\\nThe assistant multiplied 1,000,000 by 1.50 to get 1,500,000, then divided by 1200 to get 1250, which is mathematically accurate. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 7 Analysis: The assistant correctly indicates that they are about to present the final answer, which is a proper conclusion to the solution process. 
\\\\boxed{CORRECT}</step>\\n\\n<step>Step 8 Analysis: The final answer of $\\\\boxed{1250}$ is correct. The assistant has properly calculated the dollar equivalent of 1,000,000 lire using the exchange rate derived from the given information. The overall solution approach was logical and the calculations were performed accurately. \\\\boxed{CORRECT}</step>']}"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pro_data_list_grpo = []\n",
    "\n",
    "for i in range(len(ds_grpo[0])):\n",
    "    problem = ds_grpo[0][i][\"source\"][\"problem\"]\n",
    "    steps = ds_grpo[0][i][\"source\"][\"steps\"]\n",
    "    solution = ds_grpo[0][i][\"source\"][\"solution\"]\n",
    "    score = ds_grpo[0][i][\"source\"][\"score\"]\n",
    "    judge = [ds_grpo[j][i][\"synthetic\"][\"inference\"] for j in range(len(ds_grpo))]\n",
    "\n",
    "    pro_data_list_grpo.append({\n",
    "        \"problem\": problem,\n",
    "        \"steps\": steps,\n",
    "        \"solution\": solution,\n",
    "        \"score\": score,\n",
    "        \"judge\": judge\n",
    "    })\n",
    "\n",
    "pro_data_list_grpo[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "6877fbd9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "500\n",
      "[[1, 0, 0, 0, 0, 0, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1]]\n"
     ]
    }
   ],
   "source": [
    "def extract_boxed_labels(text: str) -> List[int]:\n",
    "    \"\"\"\n",
    "    Finds all instances of \\\\boxed{CORRECT} or \\\\boxed{INCORRECT} in a string,\n",
    "    case-insensitively, and converts them to a list of 1s and 0s respectively.\n",
    "    \"\"\"\n",
    "    if not isinstance(text, str):\n",
    "        return []\n",
    "    text = re.sub(r'\\\\boxed{(PARTIAL|PARTIALLY CORRECT|Incomplete|CANNOT VERIFY)}', r'\\\\boxed{INCORRECT}', text)\n",
    "    text = re.sub(r'\\\\boxed{(\\\\checkmark|\\u2713)}', r'\\\\boxed{CORRECT}', text)\n",
    "    \n",
    "    matches = re.findall(r'\\\\boxed\\{\\s*(CORRECT|INCORRECT)\\s*\\}', text, flags=re.IGNORECASE)\n",
    "    judges = [1 if m.upper() == \"CORRECT\" else 0 for m in matches]\n",
    "    return judges\n",
    "\n",
    "judges = []\n",
    "for i in range(len(pro_data_list_grpo)):\n",
    "    judge = []\n",
    "    for j in range(len(pro_data_list_grpo[i][\"judge\"])):\n",
    "        judge_txt = pro_data_list_grpo[i][\"judge\"][j]\n",
    "        judge_score = extract_boxed_labels(judge_txt)\n",
    "        if judge_score==[]:\n",
    "            continue\n",
    "        judge.append(judge_score)\n",
    "    judges.append(judge)\n",
    "\n",
    "print(len(judges))\n",
    "print(judges[0])\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "5782d02d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "68\n"
     ]
    }
   ],
   "source": [
    "flag=0\n",
    "for i in range(len(pro_data_list_grpo)):\n",
    "    steps = pro_data_list_grpo[i][\"steps\"]\n",
    "    correct_judge = None\n",
    "    for j in range(len(judges[i])):\n",
    "        if len(judges[i][j])==steps:\n",
    "            correct_judge = judges[i][j]\n",
    "            break\n",
    "    if correct_judge is None:\n",
    "        flag+=1\n",
    "        correct_judge = [0]\n",
    "\n",
    "    pro_data_list_grpo[i][\"correct_judge\"] = correct_judge\n",
    "\n",
    "\n",
    "print(flag)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "869622cc",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "acc_judge = []\n",
    "acc_min = []\n",
    "acc_pos = []\n",
    "acc_neg = []\n",
    "for i in range(len(pro_data_list_grpo)):\n",
    "    acc_judge.append(np.mean(pro_data_list_grpo[i]['correct_judge']))\n",
    "    if pro_data_list_grpo[i]['score'] == 1:\n",
    "        acc_min.append(np.min(pro_data_list_grpo[i]['correct_judge']))\n",
    "        acc_pos.append(np.mean(pro_data_list_grpo[i]['correct_judge']))\n",
    "    else:\n",
    "        acc_neg.append(np.mean(pro_data_list_grpo[i]['correct_judge']))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "c34a1516",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "acc: 1.0\n",
      "mean prm: 0.7955797783152622\n",
      "pos min prm: 0.8992248062015504\n",
      "pos: 0.9216609437352122\n",
      "neg: 0.3637796808150791\n"
     ]
    }
   ],
   "source": [
    "print(\"acc:\", len(acc_judge)/len(pro_data_list_grpo))\n",
    "print(\"mean prm:\", np.mean(acc_judge))\n",
    "print(\"pos min prm:\", np.mean(acc_min))\n",
    "print(\"pos:\", np.mean(acc_pos))\n",
    "print(\"neg:\", np.mean(acc_neg))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "f9a29552",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mean step num: 7.308\n"
     ]
    }
   ],
   "source": [
    "## mean step number\n",
    "step_num_grpo = [pro_data_list_grpo[i]['steps'] for i in range(len(pro_data_list_grpo))]\n",
    "print(\"mean step num:\", np.mean(step_num_grpo))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "ec5e58a6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'problem': \"Define\\n\\\\[p = \\\\sum_{k = 1}^\\\\infty \\\\frac{1}{k^2} \\\\quad \\\\text{and} \\\\quad q = \\\\sum_{k = 1}^\\\\infty \\\\frac{1}{k^3}.\\\\]Find a way to write\\n\\\\[\\\\sum_{j = 1}^\\\\infty \\\\sum_{k = 1}^\\\\infty \\\\frac{1}{(j + k)^3}\\\\]in terms of $p$ and $q.$ Let's think step by step and output the final answer within \\\\boxed{}.\",\n",
       " 'steps': 2,\n",
       " 'solution': \"<step>We need to find the value of the double sum \\\\(\\\\sum_{j=1}^\\\\infty \\\\sum_{k=1}^\\\\infty \\\\frac{1}{(j+k)^3}\\\\) in terms of \\\\(p\\\\) and \\\\(q\\\\), where \\\\(p = \\\\sum_{k=1}^\\\\infty \\\\frac{1}{k^2}\\\\) and \\\\(q = \\\\sum_{k=1}^\\\\infty \\\\frac{1}{k^3}\\\\).</step>\\n<step>First, let's rewrite the double sum in a more convenient form. We can change the indices of the sum to make it easier to handle. Let \\\\(n = j + k\\\\). Then for each fixed \\\\(n\\\\), \\\\(j\\\\) can range from 1 to \\\\(n-1\\\\) and \\\\(k = n - j\\\\). So the double sum can be rewritten as:\\n\\\\[\\n\\\\sum_{n=2}^\\\\infty \\\\sum_{j=1}^{n-1} \\\\frac{1}{n^3} = \\\\sum_{n=2}^\\\\infty \\\\frac{n-1}{n^3} = \\\\sum_{n=2}^\\\\infty \\\\left(\\\\frac{1}{n^2} - \\\\frac{1}{n^3}\\\\right).\\n\\\\]\\nWe can split this sum into two separate sums:\\n\\\\[\\n\\\\sum_{n=2}^\\\\infty \\\\frac{1}{n^2} - \\\\sum_{n=2}^\\\\infty \\\\frac{1}{n^3}.\\n\\\\]\\nThe first sum can be written as:\\n\\\\[\\n\\\\sum_{n=2}^\\\\infty \\\\frac{1}{n^2} = \\\\left(\\\\sum_{n=1}^\\\\infty \\\\frac{1}{n^2}\\\\right) - 1 = p - 1.\\n\\\\]\\nThe second sum can be written as:\\n\\\\[\\n\\\\sum_{n=2}^\\\\infty \\\\frac{1}{n^3} = \\\\left(\\\\sum_{n=1}^\\\\infty \\\\frac{1}{n^3}\\\\right) - 1 = q - 1.\\n\\\\]\\nSo the double sum becomes:\\n\\\\[\\n(p - 1) - (q - 1) = p - q.\\n\\\\]\\nThus, the value of the double sum \\\\(\\\\sum_{j=1}^\\\\infty \\\\sum_{k=1}^\\\\infty \\\\frac{1}{(j+k)^3}\\\\) is \\\\(\\\\boxed{p - q}\\\\).</step>\",\n",
       " 'score': 1.0,\n",
       " 'judge': [\"<step>Step 1 Analysis: The assistant correctly sets up the problem by identifying the double sum that needs to be evaluated and the definitions of p and q. This is a straightforward restatement of the problem and doesn't contain any mathematical errors. \\\\boxed{correct}</step>\\n\\n<step>Step 2 Analysis: The assistant's approach to reindex the double sum is valid. By setting n = j + k, they correctly observe that for each fixed n, j ranges from 1 to n-1 with k = n-j. This gives us n-1 terms for each n, each with value 1/n³.\\n\\nThe sum becomes:\\n\\\\[\\\\sum_{n=2}^{\\\\infty} \\\\sum_{j=1}^{n-1} \\\\frac{1}{n^3} = \\\\sum_{n=2}^{\\\\infty} \\\\frac{n-1}{n^3} = \\\\sum_{n=2}^{\\\\infty} \\\\left(\\\\frac{1}{n^2} - \\\\frac{1}{n^3}\\\\right)\\\\]\\n\\nThis algebraic manipulation is correct. Then they split this into:\\n\\\\[\\\\sum_{n=2}^{\\\\infty} \\\\frac{1}{n^2} - \\\\sum_{n=2}^{\\\\infty} \\\\frac{1}{n^3}\\\\]\\n\\nThe first sum equals p - 1 since p includes the n=1 term.\\nThe second sum equals q - 1 since q includes the n=1 term.\\n\\nTherefore, the double sum equals (p - 1) - (q - 1) = p - q, which is the correct final answer. \\\\boxed{correct}</step>\",\n",
       "  '<step>Step 1 Analysis: The assistant correctly outlines the general approach to convert a binary number to octal. This is a valid introduction to the solution process. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant correctly identifies the two main steps needed for the conversion: first converting binary to decimal, and then converting decimal to octal. This is a valid approach, though there are more direct methods (grouping binary digits in sets of 3). \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant correctly explains the method for converting binary to decimal by calculating the sum of powers of 2 corresponding to the positions of 1s in the binary number. This is the standard method for binary to decimal conversion. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 4 Analysis: The assistant correctly expands the binary number 10101001110₂ into its decimal representation using powers of 2. Each digit is properly multiplied by its corresponding power of 2, from 2^10 down to 2^0. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 5 Analysis: The assistant correctly calculates the value of each power of 2: 2^10=1024, 2^9=512, 2^8=256, 2^7=128, 2^6=64, 2^5=32, 2^4=16, 2^3=8, 2^2=4, 2^1=2, 2^0=1. These values are then multiplied by their respective binary digits. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 6 Analysis: The assistant correctly simplifies the expression by keeping only the terms where the binary digit is 1 and eliminating terms where the binary digit is 0. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 7 Analysis: The assistant correctly adds all the terms: 1024 + 256 + 64 + 8 + 4 + 2 = 1358, not 1354 as stated. This is an arithmetic error. \\\\boxed{INCORRECT}</step>\\n\\n<step>Step 8 Analysis: The assistant correctly explains the method for converting decimal to octal by repeatedly dividing by 8 and tracking remainders. However, since the decimal value is incorrect, this step is building on an error. 
\\\\boxed{CORRECT}</step>\\n\\n<step>Step 9 Analysis: The assistant performs division by 8 starting with 1354 (which is already incorrect). The correct calculations should start with 1358. The divisions shown (1354÷8=169 remainder 2, etc.) are mathematically correct for the value 1354, but this is not the correct decimal equivalent of the binary number. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 10 Analysis: The assistant correctly reads the remainders from bottom to top to form the octal number 2512₈. This is the correct octal representation of 1354, but not of the original binary number. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 11 Analysis: The assistant states that 10101001110₂ equals 2512₈, which is incorrect due to the earlier arithmetic error. The correct decimal value is 1358, which converts to 2516₈ in octal. \\\\boxed{INCORRECT}</step>\\n\\n<step>Step 12 Analysis: The final answer provided is \\\\boxed{2512}, which is incorrect. The correct conversion of binary 10101001110₂ to octal is 2516₈. The error stemmed from the incorrect addition in step 7. \\\\boxed{INCORRECT}</step>',\n",
       "  '<step>Step 1 Analysis: The assistant correctly identifies the problem setup. The room dimensions are 12\\' × 10\\' × 8\\'. The gecko starts on a side wall (10\\' × 8\\') at a position that\\'s 1 foot from the ceiling (so 7 feet from the floor) and 1 foot from the back wall. The fly is on the opposite side wall at a position that\\'s 1 foot from the floor and 1 foot from the front wall. The coordinates provided (1, 1, 7) for the gecko and (11, 1, 1) for the fly are not standard 3D coordinates but rather represent positions relative to the walls, which is a bit confusing but workable. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant correctly proposes using the \"unfolding\" technique to solve this problem. This is a standard approach for finding shortest paths on polyhedra, where we transform a 3D path-finding problem into a 2D straight-line distance problem. The unfolding preserves distances along the surface, allowing us to find the shortest path by drawing a straight line between the points on the unfolded net. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant mentions unfolding the room into a rectangle that combines the length and width. However, this description is somewhat vague and doesn\\'t fully explain how the unfolding works. The room has six faces (ceiling, floor, and four walls), and the proper unfolding needs to account for how these connect. The statement about the unfolded plane being the sum of length and width is incomplete. \\\\boxed{INCORRECT}</step>\\n\\n<step>Step 4 Analysis: This step is redundant and essentially repeats the content from step 3 without adding new information. It mentions unfolding the room so the gecko walks along the ceiling and walls, but doesn\\'t provide specific details about how this unfolding is performed. The repetition suggests confusion in the solution approach. 
\\\\boxed{INCORRECT}</step>\\n\\n<step>Step 5 Analysis: The assistant now provides more concrete information about the unfolding. They state that the unfolded rectangle would be 12 feet long and 18 feet wide (10 + 8 + 10). This represents unfolding the two side walls (each 10\\' wide) and the ceiling (8\\' wide) into a single plane. The coordinates are simplified to 2D as (1, 7) for the gecko and (11, 1) for the fly. This is a valid approach since the shortest path will likely cross the ceiling. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 6 Analysis: The assistant correctly identifies that the distance between the points (1, 7) and (11, 1) on the unfolded plane forms a right triangle with legs of length 10 (horizontal distance) and 6 (vertical distance). The horizontal distance is 11 - 1 = 10 feet, and the vertical distance is 7 - 1 = 6 feet. This is the correct application of the unfolding technique. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 7 Analysis: The assistant correctly applies the Pythagorean theorem to calculate the distance: √(10² + 6²) = √(100 + 36) = √136 = 2√34. The simplification to 2√34 is mathematically correct since √136 = √(4·34) = 2√34. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 8 Analysis: The final answer of 2√34 feet is correct based on the unfolding approach used. However, I should note that the solution assumes a particular path (across the ceiling) is shortest without explicitly verifying that other possible paths (e.g., across the floor or around the walls) would be longer. In this case, the ceiling path is indeed shortest, so the final answer of 2√34 is correct. \\\\boxed{CORRECT}</step>',\n",
       "  '<step>Step 1 Analysis: The assistant correctly introduces the problem, which is to simplify the expression $(-k + 4) + (-2 + 3k)$ by combining like terms. This is an appropriate approach for simplifying algebraic expressions. \\\\boxed{\\\\text{CORRECT}}</step>\\n\\n<step>Step 2 Analysis: The assistant correctly distributes the parentheses in the expression $(-k + 4) + (-2 + 3k)$ to get $-k + 4 - 2 + 3k$. This step follows the algebraic rule that when adding expressions in parentheses, we can simply write out all terms with their respective signs. \\\\boxed{\\\\text{CORRECT}}</step>\\n\\n<step>Step 3 Analysis: The assistant correctly combines like terms by grouping the terms with variable $k$ and the constant terms separately: $(-k + 3k) + (4 - 2) = 2k + 2$. The coefficient of $k$ becomes $-1 + 3 = 2$, and the constant terms sum to $4 - 2 = 2$. \\\\boxed{\\\\text{CORRECT}}</step>\\n\\n<step>Step 4 Analysis: The assistant correctly states that the simplified expression is $2k + 2$, which follows from the previous step where like terms were combined. \\\\boxed{\\\\text{CORRECT}}</step>\\n\\n<step>Step 5 Analysis: The assistant verifies the result using SymPy (a Python library for symbolic mathematics), which confirms that $(-k + 4) + (-2 + 3k)$ simplifies to $2k + 2$. This computational verification is a good practice and the result matches the manual calculation. \\\\boxed{\\\\text{CORRECT}}</step>\\n\\n<step>Step 6 Analysis: The assistant correctly presents the final answer as $\\\\boxed{2k + 2}$, which is the simplified form of the original expression $(-k + 4) + (-2 + 3k)$. The overall solution process is methodical and the final answer is accurate. \\\\boxed{\\\\text{CORRECT}}</step>'],\n",
       " 'correct_judge': [1, 1]}"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pro_data_list_grpo[167]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99f199f0",
   "metadata": {},
   "source": [
    "## Qwen PRM Math500"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2ac99333",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4\n"
     ]
    }
   ],
   "source": [
    "from datasets import load_dataset\n",
    "import json\n",
    "import argparse\n",
    "import re\n",
    "import os\n",
    "from collections import defaultdict\n",
    "from typing import List\n",
    "\n",
    "# Prefix of the JSONL result files for the Qwen PRM run on MATH500.\n",
    "data_name = \"prm_math500\"\n",
    "\n",
    "# Load one dataset per result file (suffixes 2 through 5).\n",
    "ds = []\n",
    "for i in range(2, 6):\n",
    "    data_path = f\"/home/USER/PRM_filter/eval/eval_prm/data/{data_name}_{i}.jsonl\"\n",
    "    ds0 = load_dataset(\"json\", data_files=data_path, split=\"train\")\n",
    "    ds.append(ds0)\n",
    "\n",
    "# Number of files loaded.\n",
    "print(len(ds))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "8c69ced3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'problem': \"Let $a$ and $b$ be positive real numbers such that $ab^2 = 5.$  Find the minimum value of $a^4 + b^8.$ Let's think step by step and output the final answer within \\\\boxed{}.\",\n",
       " 'steps': 8,\n",
       " 'solution': \"<step>We need to find the minimum value of \\\\(a^4 + b^8\\\\) given that \\\\(ab^2 = 5\\\\).</step>\\n<step>First, we express \\\\(a\\\\) in terms of \\\\(b\\\\) using the constraint \\\\(ab^2 = 5\\\\):\\n\\\\[a = \\\\frac{5}{b^2}.\\\\]</step>\\n<step>Next, we substitute this expression for \\\\(a\\\\) into the expression \\\\(a^4 + b^8\\\\):\\n\\\\[a^4 + b^8 = \\\\left(\\\\frac{5}{b^2}\\\\right)^4 + b^8 = \\\\frac{625}{b^8} + b^8.\\\\]</step>\\n<step>Let \\\\(x = b^8\\\\). Then the expression becomes:\\n\\\\[f(x) = \\\\frac{625}{x} + x.\\\\]</step>\\n<step>To find the minimum value of \\\\(f(x)\\\\), we take the derivative and set it to zero:\\n\\\\[f'(x) = -\\\\frac{625}{x^2} + 1.\\\\]\\nSetting \\\\(f'(x) = 0\\\\):\\n\\\\[1 - \\\\frac{625}{x^2} = 0 \\\\implies x^2 = 625 \\\\implies x = 25 \\\\quad (\\\\text{since } x = b^8 > 0).\\\\]</step>\\n<step>Now we substitute \\\\(x = 25\\\\) back into the expression \\\\(f(x)\\\\):\\n\\\\[f(25) = \\\\frac{625}{25} + 25 = 25 + 25 = 50.\\\\]</step>\\n<step>To confirm that this is a minimum, we can check the second derivative:\\n\\\\[f''(x) = \\\\frac{1250}{x^3}.\\\\]\\nSince \\\\(f''(25) = \\\\frac{1250}{25^3} > 0\\\\), the function has a local minimum at \\\\(x = 25\\\\).</step>\\n<step>Therefore, the minimum value of \\\\(a^4 + b^8\\\\) is \\\\(\\\\boxed{50}\\\\).</step>\",\n",
       " 'score': 1.0,\n",
       " 'judge': [\"<step>Step 1 Analysis: The assistant correctly identifies the problem, which is to find the minimum value of a^4 + b^8 given the constraint ab^2 = 5. This is a clear and accurate statement of the optimization problem. \\\\boxed{correct}</step>\\n\\n<step>Step 2 Analysis: The assistant correctly rearranges the constraint equation ab^2 = 5 to express a in terms of b, resulting in a = 5/b^2. This algebraic manipulation is valid and will allow substitution into the target expression. \\\\boxed{correct}</step>\\n\\n<step>Step 3 Analysis: The assistant correctly substitutes a = 5/b^2 into the expression a^4 + b^8:\\na^4 + b^8 = (5/b^2)^4 + b^8 = 625/b^8 + b^8\\nThe exponent calculation (5/b^2)^4 = 5^4/b^8 = 625/b^8 is mathematically correct. \\\\boxed{correct}</step>\\n\\n<step>Step 4 Analysis: The assistant makes a valid substitution by letting x = b^8, which simplifies the expression to f(x) = 625/x + x. This is a good approach as it reduces the complexity of the function to be minimized. \\\\boxed{correct}</step>\\n\\n<step>Step 5 Analysis: The assistant correctly calculates the derivative of f(x) = 625/x + x as f'(x) = -625/x^2 + 1. Setting this equal to zero and solving for x yields x^2 = 625, thus x = 25 (since x must be positive as it represents b^8). The calculus work here is correct. \\\\boxed{correct}</step>\\n\\n<step>Step 6 Analysis: The assistant correctly evaluates f(25) = 625/25 + 25 = 25 + 25 = 50. This calculation gives us the value of the original expression a^4 + b^8 at the critical point. \\\\boxed{correct}</step>\\n\\n<step>Step 7 Analysis: The assistant verifies that this critical point is indeed a minimum by calculating the second derivative f''(x) = 1250/x^3, which is positive when x = 25. This confirms that x = 25 corresponds to a local minimum of the function. \\\\boxed{correct}</step>\\n\\n<step>Step 8 Analysis: The assistant correctly concludes that the minimum value of a^4 + b^8 is 50. 
The entire solution process is mathematically sound: they identified the constraint, expressed one variable in terms of the other, substituted into the target expression, used calculus to find the critical point, verified it was a minimum, and calculated the minimum value. The final answer of 50 is correct. \\\\boxed{correct}</step>\",\n",
       "  \"<step>Step 1 Analysis: The assistant correctly interprets the problem, identifying that we need to find a two-digit integer whose cube equals 912,673, and then calculate the sum of its digits. This is a clear and accurate understanding of the problem statement. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant proposes a logical approach to solve the problem by suggesting we find the cube root of 912,673 to determine the original two-digit integer. This is indeed the most direct method to solve this problem. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant estimates the cube root of 912,673 to be approximately 97. This is a reasonable approximation. The cube root of 912,673 is indeed close to 97, so this is a good starting point for verification. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 4 Analysis: The assistant sets up the calculation for 97³ = 97 × 97 × 97. While they don't complete the calculation in this step, they correctly express what needs to be calculated to verify if 97 is the answer. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 5 Analysis: The assistant uses Python to calculate 97³ and confirms it equals 912,673. I can independently verify: 97³ = 97 × 97 × 97 = 9,409 × 97 = 912,673. This confirms that the two-digit integer is indeed 97. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 6 Analysis: Having determined that the two-digit integer is 97, the assistant correctly identifies A = 9 and B = 7, and calculates their sum as A + B = 9 + 7 = 16. This arithmetic is correct. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 7 Analysis: The assistant concludes that the answer is 16, which is correct. The two-digit number whose cube is 912,673 is 97, and the sum of its digits is 9 + 7 = 16. The overall solution process is logical and the final answer is accurate. \\\\boxed{CORRECT}</step>\",\n",
       "  \"<step>Step 1 Analysis: The assistant correctly interprets the problem, identifying that we need to find a two-digit integer whose cube equals 912,673, and then calculate the sum of its digits. This is a clear and accurate understanding of the problem statement. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant proposes a logical approach to solve the problem by suggesting we find the cube root of 912,673 to determine the original two-digit integer. This is indeed the most direct method to solve this problem. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant estimates the cube root of 912,673 to be approximately 97. This is a reasonable approximation. The cube root of 912,673 is indeed close to 97, so this is a good starting point for verification. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 4 Analysis: The assistant sets up the calculation for 97³ = 97 × 97 × 97. While they don't complete the calculation in this step, they correctly express what needs to be calculated to verify if 97 is the answer. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 5 Analysis: The assistant uses Python to calculate 97³ and confirms it equals 912,673. I can independently verify: 97³ = 97 × 97 × 97 = 9,409 × 97 = 912,673. This confirms that the two-digit integer is indeed 97. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 6 Analysis: Having determined that the two-digit integer is 97, the assistant correctly identifies A = 9 and B = 7, and calculates their sum as A + B = 9 + 7 = 16. This arithmetic is correct. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 7 Analysis: The assistant concludes that the answer is 16, which is correct. The two-digit number whose cube is 912,673 is 97, and the sum of its digits is 9 + 7 = 16. The overall solution process is logical and the final answer is accurate. \\\\boxed{CORRECT}</step>\",\n",
       "  \"<step>Step 1 Analysis: The assistant correctly identifies the problem, which is to find the value of x that minimizes z after Brahmagupta has chosen y optimally. This is a min-max problem where Archimedes chooses x first (trying to minimize z) and then Brahmagupta chooses y (trying to maximize z). The approach of working backward is appropriate for this type of problem. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 2 Analysis: The assistant correctly restates the function z = (2x - y)^2 - 2y^2 - 3y. This is just a restatement of the original function from the problem statement, preparing for the subsequent analysis. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 3 Analysis: The assistant correctly expands the squared term (2x - y)^2 = 4x^2 - 4xy + y^2. Then combines like terms with the other parts of the equation: -2y^2 - 3y to get 4x^2 - 4xy - y^2 - 3y. This algebraic expansion is performed correctly. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 4 Analysis: To find the value of y that maximizes z for a given x, the assistant correctly takes the partial derivative of z with respect to y. From z = 4x^2 - 4xy - y^2 - 3y, we get ∂z/∂y = -4x - 2y - 3. This differentiation is performed correctly. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 5 Analysis: The assistant correctly sets the partial derivative equal to zero and solves for y:\\n-4x - 2y - 3 = 0\\n-2y = 4x + 3\\ny = -2x - 3/2\\nThis calculation is mathematically sound and gives the optimal value of y for Brahmagupta. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 6 Analysis: The assistant correctly substitutes the optimal value y = -2x - 3/2 back into the original function z = 4x^2 - 4xy - y^2 - 3y. This is the appropriate next step to express z solely in terms of x. 
\\\\boxed{CORRECT}</step>\\n\\n<step>Step 7 Analysis: The assistant simplifies the expression after substitution:\\nz = 4x^2 - 4x(-2x - 3/2) - (-2x - 3/2)^2 - 3(-2x - 3/2)\\n= 4x^2 + 8x^2 + 6x - (4x^2 + 6x + 9/4) + 6x + 9/2\\n= 4x^2 + 8x^2 + 6x - 4x^2 - 6x - 9/4 + 6x + 9/2\\n= 8x^2 + 6x + 9/4\\n\\nThis simplification is correct. The final expression represents z in terms of x after Brahmagupta has chosen y optimally. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 8 Analysis: To find the value of x that minimizes z = 8x^2 + 6x + 9/4, the assistant correctly takes the derivative with respect to x:\\ndz/dx = 16x + 6\\nThis differentiation is performed correctly. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 9 Analysis: The assistant correctly sets the derivative equal to zero and solves for x:\\n16x + 6 = 0\\n16x = -6\\nx = -3/8\\nThis calculation is mathematically sound and gives the value of x that minimizes z. \\\\boxed{CORRECT}</step>\\n\\n<step>Step 10 Analysis: The assistant correctly concludes that the value of x that Archimedes should choose is -3/8. This is the value that minimizes z after Brahmagupta has chosen y optimally. The solution approach was correct: first finding Brahmagupta's optimal response for any given x, then substituting that back to find Archimedes' optimal choice of x. The final answer of -3/8 is correct. \\\\boxed{CORRECT}</step>\"]}"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Combine each problem's source fields with the judge output from every file.\n",
    "# Rows are matched across files by position only. The output shown for index 0\n",
    "# includes judge texts that discuss other problems, so this alignment may not hold.\n",
    "pro_data_list = []\n",
    "\n",
    "for i in range(len(ds[0])):\n",
    "    source = ds[0][i][\"source\"]\n",
    "    # One judge transcript per loaded file.\n",
    "    judge = [ds[j][i][\"synthetic\"][\"inference\"] for j in range(len(ds))]\n",
    "\n",
    "    pro_data_list.append({\n",
    "        \"problem\": source[\"problem\"],\n",
    "        \"steps\": source[\"steps\"],\n",
    "        \"solution\": source[\"solution\"],\n",
    "        \"score\": source[\"score\"],\n",
    "        \"judge\": judge\n",
    "    })\n",
    "\n",
    "pro_data_list[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "9a71b565",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "500\n",
      "[[1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]\n",
      "54\n",
      "acc: 1.0\n",
      "mean prm: 0.8217877873144402\n",
      "pos min prm: 0.8883248730964467\n",
      "pos: 0.9222570154042236\n",
      "neg: 0.4483455621505289\n"
     ]
    }
   ],
   "source": [
    "# Parse the per-step labels from every judge output.\n",
    "judges = []\n",
    "for item in pro_data_list:\n",
    "    judge = []\n",
    "    for judge_txt in item[\"judge\"]:\n",
    "        judge_score = extract_boxed_labels(judge_txt)\n",
    "        # Skip outputs from which no label could be extracted.\n",
    "        if not judge_score:\n",
    "            continue\n",
    "        judge.append(judge_score)\n",
    "    judges.append(judge)\n",
    "\n",
    "print(len(judges))\n",
    "print(judges[0])\n",
    "\n",
    "# For each problem, keep the first judge output whose label count equals the\n",
    "# number of solution steps, and count the problems that have no such output.\n",
    "n_unmatched = 0\n",
    "for i in range(len(pro_data_list)):\n",
    "    steps = pro_data_list[i][\"steps\"]\n",
    "    correct_judge = None\n",
    "    for labels in judges[i]:\n",
    "        if len(labels) == steps:\n",
    "            correct_judge = labels\n",
    "            break\n",
    "    if correct_judge is None:\n",
    "        n_unmatched += 1\n",
    "        # Fallback: an unmatched problem is scored as a single 0 label.\n",
    "        correct_judge = [0]\n",
    "\n",
    "    pro_data_list[i][\"correct_judge\"] = correct_judge\n",
    "\n",
    "\n",
    "print(n_unmatched)\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "acc_judge = []  # mean label per problem\n",
    "acc_min = []    # min label per problem with score == 1\n",
    "acc_pos = []    # mean label per problem with score == 1\n",
    "acc_neg = []    # mean label per problem with any other score\n",
    "for item in pro_data_list:\n",
    "    acc_judge.append(np.mean(item['correct_judge']))\n",
    "    if item['score'] == 1:\n",
    "        acc_min.append(np.min(item['correct_judge']))\n",
    "        acc_pos.append(np.mean(item['correct_judge']))\n",
    "    else:\n",
    "        acc_neg.append(np.mean(item['correct_judge']))\n",
    "\n",
    "# Note: this ratio is always 1.0, because acc_judge gets one entry per problem\n",
    "# (unmatched problems contribute the [0] fallback).\n",
    "print(\"acc:\", len(acc_judge)/len(pro_data_list))\n",
    "print(\"mean prm:\", np.mean(acc_judge))\n",
    "print(\"pos min prm:\", np.mean(acc_min))\n",
    "print(\"pos:\", np.mean(acc_pos))\n",
    "print(\"neg:\", np.mean(acc_neg))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "ebbd1795",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mean step num: 7.91\n"
     ]
    }
   ],
   "source": [
    "# Mean number of solution steps per problem.\n",
    "step_num_prm = [item['steps'] for item in pro_data_list]\n",
    "print(\"mean step num:\", np.mean(step_num_prm))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "0a8197bf",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA2EAAAIjCAYAAACK6xPsAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjUsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvWftoOwAAAAlwSFlzAAAPYQAAD2EBqD+naQAAa5ZJREFUeJzt3X18zfX/x/HnOdvOZjaGsQsX21xfG5YlFeEb8k2iQvxcFX3LCivfUo1YmgqpkOob+haRvqivouSqkly2dDGXYdUumGhMdnU+vz/cdr6dtrGLs8/RPO6327nZeX/e5/V5fz7nsPP0+XzeH4thGIYAAAAAAKawunsAAAAAAHA1IYQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEALuupp56SxWIxZV3dunVTt27dHM+3bNkii8Wi9957z5T1jxw5UuHh4aasq6zOnTune++9V8HBwbJYLJowYYK7hwQAKAVCGABcZZYsWSKLxeJ4+Pj4KDQ0VL169dJLL72ks2fPumQ9KSkpeuqpp5SYmOiSeq50JY+tJJ555hktWbJE999/v9566y393//9X7F9c3Jy9OKLL6p9+/aqVq2aAgIC1KpVK40dO1b79+939Pvyyy/11FNP6cyZMyZsAQBc3TzdPQAAgHtMnz5dERERys3NVVpamrZs2aIJEyZozpw5+uCDD9S2bVtH3yeffFKPPfZYqeqnpKRo2rRpCg8PV2RkZIlf98knn5RqPWVxqbG9/vrrstvtFT6G8ti0aZOuvfZaTZ069bJ9Bw4cqHXr1mnIkCEaM2aMcnNztX//fq1du1bXXXedmjdvLuliCJs2bZpGjhypgICACt4CALi6EcIA4CrVp08fRUVFOZ5PnjxZmzZt0t///nf169dPSUlJqlKliiTJ09NTnp4V+yvj/Pnz8vX1lc1mq9D1XI6Xl5db118SJ06cUMuWLS/bb9euXVq7dq1mzJihxx9/3GnZvHnzOOoFAG7C6YgAAIfu3bsrLi5Ox48f19tvv+1oL+qasA0bNuj6669XQECA/Pz81KxZM8cX/S1btuiaa66RJI0aNcpx6uOSJUskXbzuq3Xr1tqzZ49uvPFG+fr6Ol7752vCCuTn5+vxxx9XcHCwqlatqn79+umnn35y6hMeHq6RI0cWeu0fa15ubEVdE5aVlaWHH35Y9evXl7e3t5o1a6ZZs2bJMAynfhaLRTExMVqzZo1at24tb29vtWrVSuvXry96h//JiRMndM899ygoKEg+Pj5q166d3nzzTcfyguvjjh49qg8//NAx9mPHjhVZ78iRI5KkLl26FFrm4eGhWrVqSbr4/k6aNEmSFBERUWTdt99+Wx07dlSVKlVUs2ZNDR48uND+/+P7et1116lKlSqKiIjQwoULC63/5ZdfVqtWreTr66saNWooKipKy5YtK9F+AoC/OkIYAMBJwfVFlzot8Pvvv9ff//53ZWdna/r06Zo9e7b69eunbdu2SZJatGih6dOnS5LGjh2rt956S2+99ZZuvPFGR41Tp06pT58+ioyM1Ny5c3XTTTddclwzZszQhx9+qEcffVQPPfSQNmzYoJ49e+r3338v1faVZGx/ZBiG+vXrpxdeeEG9e/fWnDlz1KxZM02aNEmxsbGF+n/xxRd64IEHNHjwYD333HO6cOGCBg4cqFOnTl1yXL///ru6deumt956S0OHDtXzzz+v6tWra+TIkXrxxRcdY3/rrbcUGBioyMhIx9hr165dZM2wsDBJ0tKlS5WXl1fsugcMGKAhQ4ZIkl544YVCdWfMmKHhw4erSZMmmjNnjiZMmKCNGzfqxhtvLHQ07fTp07rlllvUsWNHPffcc6pXr57uv/9+LVq0yNHn9ddf10MPPaSWLVtq7ty5mjZtmiIjI7Vjx45L7iMAqDQMAMBVZfHixYYkY9eu
XcX2qV69utG+fXvH86lTpxp//JXxwgsvGJKMkydPFltj165dhiRj8eLFhZZ17drVkGQsXLiwyGVdu3Z1PN+8ebMhyahbt66RmZnpaH/33XcNScaLL77oaAsLCzNGjBhx2ZqXGtuIESOMsLAwx/M1a9YYkoynn37aqd8dd9xhWCwW4/Dhw442SYbNZnNq++abbwxJxssvv1xoXX80d+5cQ5Lx9ttvO9pycnKMzp07G35+fk7bHhYWZvTt2/eS9QzDMOx2u2NfBwUFGUOGDDHmz59vHD9+vFDf559/3pBkHD161Kn92LFjhoeHhzFjxgyn9m+//dbw9PR0ai9Y1+zZsx1t2dnZRmRkpFGnTh0jJyfHMAzDuO2224xWrVpddvwAUFlxJAwAUIifn98lZ0ksmLjh/fffL/MkFt7e3ho1alSJ+w8fPlz+/v6O53fccYdCQkL00UcflWn9JfXRRx/Jw8NDDz30kFP7ww8/LMMwtG7dOqf2nj17qlGjRo7nbdu2VbVq1fTjjz9edj3BwcGOI1LSxevTHnroIZ07d05bt24t9dgtFos+/vhjPf3006pRo4beeecdjRs3TmFhYRo0aFCJrglbtWqV7Ha77rrrLmVkZDgewcHBatKkiTZv3uzU39PTU/fdd5/juc1m03333acTJ05oz549ki5+fn7++Wft2rWr1NsEAJUBIQwAUMi5c+ecAs+fDRo0SF26dNG9996roKAgDR48WO+++26pAlndunVLNQlHkyZNnJ5bLBY1bty42OuhXOX48eMKDQ0ttD9atGjhWP5HDRo0KFSjRo0aOn369GXX06RJE1mtzr+ai1tPSXl7e+uJJ55QUlKSUlJS9M477+jaa6/Vu+++q5iYmMu+/tChQzIMQ02aNFHt2rWdHklJSTpx4oRT/9DQUFWtWtWprWnTppLkeK8effRR+fn5qVOnTmrSpInGjRvnOJUVAK4GzI4IAHDy888/67ffflPjxo2L7VOlShV99tln2rx5sz788EOtX79eK1asUPfu3fXJJ5/Iw8PjsuspmHnRlYq7oXR+fn6JxuQKxa3H+NMkHu4QEhKiwYMHa+DAgWrVqpXeffddLVmy5JIzX9rtdlksFq1bt67IbfPz8yv1OFq0aKEDBw5o7dq1Wr9+vf7zn/9owYIFmjJliqZNm1bqegDwV8ORMACAk7feekuS1KtXr0v2s1qt6tGjh+bMmaMffvhBM2bM0KZNmxynpxUXiMrq0KFDTs8Nw9Dhw4edZjKsUaNGkafY/fkoUmnGFhYWppSUlEKnZxbc6Lhg8ovyCgsL06FDhwodTXT1eqSLpzm2bdtWubm5ysjIkFT8PmnUqJEMw1BERIR69uxZ6HHttdc69U9JSVFWVpZT28GDByXJ6b2qWrWqBg0apMWLFys5OVl9+/bVjBkzdOHCBZdtJwBcqQhhAACHTZs2KT4+XhERERo6dGix/X799ddCbQU3Pc7OzpYkxylprroX1b///W+nIPTee+8pNTVVffr0cbQ1atRIX331lXJychxta9euLTSVemnGdssttyg/P1/z5s1zan/hhRdksVic1l8et9xyi9LS0rRixQpHW15enl5++WX5+fmpa9eupa556NAhJScnF2o/c+aMtm/frho1ajhmQCxunwwYMEAeHh6aNm1aoaN5hmEUmvUxLy9Pr776quN5Tk6OXn31VdWuXVsdO3aUpEKvsdlsatmypQzDUG5ubqm3EwD+ajgdEQCuUuvWrdP+/fuVl5en9PR0bdq0SRs2bFBYWJg++OAD+fj4FPva6dOn67PPPlPfvn0VFhamEydOaMGCBapXr56uv/56SRcDUUBAgBYuXCh/f39VrVpV0dHRioiIKNN4a9asqeuvv16jRo1Senq65s6dq8aNG2vMmDGOPvfee6/ee+899e7dW3fddZeOHDmit99+22mijNKO7dZbb9VNN92kJ554QseOHVO7du30ySef6P3339eECRMK1S6rsWPH
6tVXX9XIkSO1Z88ehYeH67333tO2bds0d+7cS16jV5xvvvlGd999t/r06aMbbrhBNWvW1C+//KI333xTKSkpmjt3ruMUw4KA9MQTT2jw4MHy8vLSrbfeqkaNGunpp5/W5MmTdezYMfXv31/+/v46evSoVq9erbFjx+qRRx5xrDM0NFTPPvusjh07pqZNm2rFihVKTEzUa6+95rgR9s0336zg4GB16dJFQUFBSkpK0rx589S3b98ybScA/OW4b2JGAIA7FExRX/Cw2WxGcHCw8be//c148cUXnaZCL/DnKeo3btxo3HbbbUZoaKhhs9mM0NBQY8iQIcbBgwedXvf+++8bLVu2NDw9PZ2mhO/atWuxU5QXN0X9O++8Y0yePNmoU6eOUaVKFaNv375FTrU+e/Zso27duoa3t7fRpUsXY/fu3YVqXmpsf56i3jAM4+zZs8bEiRON0NBQw8vLy2jSpInx/PPPG3a73amfJGPcuHGFxlTc1Pl/lp6ebowaNcoIDAw0bDab0aZNmyKn0S/pFPXp6enGzJkzja5duxohISGGp6enUaNGDaN79+7Ge++9V6h/fHy8UbduXcNqtRaarv4///mPcf311xtVq1Y1qlatajRv3twYN26cceDAAUefgvd19+7dRufOnQ0fHx8jLCzMmDdvntN6Xn31VePGG280atWqZXh7exuNGjUyJk2aZPz222+X3SYAqAwshnEFXCkMAAD+8rp166aMjAx999137h4KAFzRuCYMAAAAAExECAMAAAAAExHCAAAAAMBEXBMGAAAAACbiSBgAAAAAmIgQBgAAAAAm4mbNZWS325WSkiJ/f39ZLBZ3DwcAAACAmxiGobNnzyo0NFRW6+WPcxHCyiglJUX169d39zAAAAAAXCF++ukn1atX77L9CGFl5O/vL+nijq5WrZqbRwMAAADAXTIzM1W/fn1HRrgcQlgZFZyCWK1aNUIYAAAAgBJfpsTEHAAAAABgIkIYAAAAAJiIEAYAAAAAJuKaMAAAAMCFDMNQXl6e8vPz3T0UuIiHh4c8PT1ddmsqQhgAAADgIjk5OUpNTdX58+fdPRS4mK+vr0JCQmSz2cpdixAGAAAAuIDdbtfRo0fl4eGh0NBQ2Ww2lx05gfsYhqGcnBydPHlSR48eVZMmTUp0Q+ZLIYQBAAAALpCTkyO73a769evL19fX3cOBC1WpUkVeXl46fvy4cnJy5OPjU656TMwBAAAAuFB5j5LgyuTK95VPCAAAAACYiNMRAQAAgAqWnJysjIwM09YXGBioBg0amLY+lA4hDAAAAKhAycnJat6ihX43ccbEKr6+2p+URBC7QhHCAAAAgAqUkZGh38+f1wOzXlNoo6YVvr6UIwe14JGxysjIKHUIS0tLU0JCgj788EP9/PPPql69uho3bqxhw4ZpxIgR8vX1VXh4uI4fPy7p4oQVjRo10vjx43Xvvfc66mzZskU33XST43mdOnV0/fXX6/nnn1fDhg0d7V9++aWefvppbd++Xb///ruaNGmiUaNGafz48fLw8CjnnrhyEcIAAAAAE4Q2aqqIVpHuHkaxfvzxR3Xp0kUBAQF65pln1KZNG3l7e+vbb7/Va6+9prp166pfv36SpOnTp2vMmDE6f/68Vq5cqTFjxqhu3brq06ePU80DBw7I399fhw4d0tixY3Xrrbdq37598vDw0OrVq3XXXXdp1KhR2rx5swICAvTpp5/qn//8p7Zv365333230k7xTwgDAAAAoAceeECenp7avXu3qlat6mhv2LChbrvtNhmG4Wjz9/dXcHCwJOnRRx/Vc889pw0bNhQKYXXq1FFAQIBCQkI0ZcoUDR06VIcPH1a9evU0ZswY9evXT6+99pqj/7333qugoCD169dP7777rgYNGlTBW+0ezI4IAAAAXOVOnTqlTz75ROPGjXMKYH9U1FEpu92u//znPzp9+rRsNtsl11GlShVJF++n9sknn+jUqVN65JFHCvW79dZb
1bRpU73zzjtl2JK/BkIYAAAAcJU7fPiwDMNQs2bNnNoDAwPl5+cnPz8/Pfroo472Rx99VH5+fvL29tYdd9yhGjVqOF0T9mepqamaNWuW6tatq2bNmungwYOSpBYtWhTZv3nz5o4+lREhDAAAAECRdu7cqcTERLVq1UrZ2dmO9kmTJikxMVGbNm1SdHS0XnjhBTVu3LjQ6+vVq6eqVasqNDRUWVlZ+s9//uN0xOyPpzheTbgmDAAAALjKNW7cWBaLRQcOHHBqL5jJsOBUwgKBgYFq3LixGjdurJUrV6pNmzaKiopSy5Ytnfp9/vnnqlatmurUqSN/f39He9OmF2eJTEpK0nXXXVdoPElJSYVqVSaEMLhNRd20kJsTAgAAlE6tWrX0t7/9TfPmzdODDz5Y7HVhRalfv74GDRqkyZMn6/3333daFhERoYCAgEKvufnmm1WzZk3Nnj27UAj74IMPdOjQIcXHx5dpW/4KCGFwi+TkZLVo0Vznz//u8tq+vlWUlLSfIAYAAK4oKUfMucaprOtZsGCBunTpoqioKD311FNq27atrFardu3apf3796tjx47Fvnb8+PFq3bq1du/eraioqMuuq2rVqnr11Vc1ePBgjR07VjExMapWrZo2btyoSZMm6Y477tBdd91Vpu34KyCEwS0yMjJ0/vzvevuZ29WiYW2X1U368aSGPb66TDcnBAAAqAiBgYGq4uurBY+MNW2dVXx9FRgYWKrXNGrUSF9//bWeeeYZTZ48WT///LO8vb3VsmVLPfLII3rggQeKfW3Lli118803a8qUKfroo49KtL477rhDmzdv1owZM3TDDTfowoULatKkiZ544glNmDCh0t4jTCKEwc1aNKytDi1C3D0MAACACtOgQQPtT0qqkMswilPWyzNCQkL08ssv6+WXXy62z7Fjx4psX79+vePnbt26lWjSjRtuuMHpdVcLQhgAAABQwRo0aMBZOnBw+xT18+fPV3h4uHx8fBQdHa2dO3cW2/f777/XwIEDFR4eLovForlz5xbqU7Dsz49x48Y5+nTr1q3Q8n/84x8VsXkAAAAA4MStIWzFihWKjY3V1KlTtXfvXrVr1069evXSiRMniux//vx5NWzYUDNnzlRwcHCRfXbt2qXU1FTHY8OGDZKkO++806nfmDFjnPo999xzrt04AAAAACiCW0PYnDlzNGbMGI0aNUotW7bUwoUL5evrq0WLFhXZ/5prrtHzzz+vwYMHy9vbu8g+tWvXVnBwsOOxdu1aNWrUSF27dnXq5+vr69SvWrVqLt8+AAAAAPgzt10TlpOToz179mjy5MmONqvVqp49e2r79u0uW8fbb7+t2NjYQrOrLF26VG+//baCg4N16623Ki4uTr6+vsXWys7OdrpLeGZmpiQpLy9PeXl5Lhnv1cRut8tms8kuq/Lsrpv5xi7rxbp2O+8LAAAwVV5engzDcDxQuRS8r0V9/y/t9063hbCMjAzl5+crKCjIqT0oKEj79+93yTrWrFmjM2fOaOTIkU7td999t8LCwhQaGqp9+/bp0Ucf1YEDB7Rq1apiayUkJGjatGmF2nfv3l2qm9nhorNnzyouLk4ZXqHakW5zXV2v+oqLq6uMjAzt2LHDZXUBAAAux2KxyNfXV+fPn1d+fr67hwMXy87OVk5Ojvbt21coZGdlZZWqVqWeHfGNN95Qnz59FBoa6tQ+duz/7tHQpk0bhYSEqEePHjpy5IgaNWpUZK3JkycrNjbW8TwzM1P169dXVFQUpzKWQWJiouLj47Xt36MVGVT09X1lqnsgTfHxi7Rt2zZFRka6rC4AAMDlXLhwQcnJyfL19ZWPj4+7hwMX8/DwkM1mU+PGjQu9vwVnyZWU20JYYGCgPDw8lJ6e7tSenp5e7KQbpXH8+HF9+umnlzy6VSA6OlqSdPjw4WJDmLe3d5HXoXl6esrTs1Jn2QphtVqVk5Mjq+zytLrucL1V9ot1rVbeFwAAYCpPT0+n2bdR
uRS8r0V9/y/t9063fUu12Wzq2LGjNm7cqP79+0u6eJ3Qxo0bFRMTU+76ixcvVp06ddS3b9/L9k1MTJR08eZ0AAAAgKslJyf/JW7WDHO49VBBbGysRowYoaioKHXq1Elz585VVlaWRo0aJUkaPny46tatq4SEBEkXJ9r44YcfHD//8ssvSkxMlJ+fnxo3buyoa7fbtXjxYo0YMaJQKj1y5IiWLVumW265RbVq1dK+ffs0ceJE3XjjjWrbtq1JWw4AAICrRXJyslq0aK7z5383bZ2+vlWUlLS/UgWx8PBwTZgwQRMmTJB08cjU6tWrHQd0/krcGsIGDRqkkydPasqUKUpLS1NkZKTWr1/vmKwjOTlZVuv/ZtFPSUlR+/btHc9nzZqlWbNmqWvXrtqyZYuj/dNPP1VycrJGjx5daJ02m02ffvqpI/DVr19fAwcO1JNPPllxGwoAAICrVkZGhs6f/11vP3O7WjSsXeHrS/rxpIY9vloZGRklDmEjR47Um2++KUny8vJSgwYNNHz4cD3++OP64osvdNNNNzn6BgYG6pprrtGzzz6rNm3aFKpx3333aeHChU71x40bpwULFmjEiBFasmRJsePo1q2btm7dWqg9NzdXu3btKnZCvGPHjikiIkJff/31X2JeALdfNBMTE1Ps6Yd/DFbSxfRbkuk+b7755mL71a9fv8g3FgAAAKhILRrWVocWV+7lL71799bixYuVnZ2tjz76SOPGjZOXl5c6d+4sSTpw4ICqVaumlJQUTZo0SX379tXhw4dls/1vpuv69etr+fLleuGFF1SlShVJFycsWbZsWYkD4ZgxYzR9+nSnNk9PT9WuXfEBVroY+Ly8vCp0HW69WTMAAACAK4O3t7eCg4MVFham+++/Xz179tQHH3zgWF6nTh0FBwerQ4cOmjBhgn766adCt5bq0KGD6tev7zQ53qpVq9SgQQOnM9ouxdfXV8HBwU4P6eIBmblz5xb5moiICElS+/btZbFY1K1bN8eyf/3rX2rRooV8fHzUvHlzLViwwLHs2LFjslgsWrFihbp27SofHx8tXbq0ROMsD0IYAAAAgEKqVKminJycQu2//fabli9fLklOR8EKjB49WosXL3Y8X7RokWPOh4qyc+dOSRcvS0pNTXWEwKVLl2rKlCmaMWOGkpKS9MwzzyguLs5x6mWBxx57TOPHj1dSUpJ69epVoWOVCGEAAAAA/sAwDH366af6+OOP1b17d0d7vXr15Ofnp4CAAC1btkz9+vVT8+bNC71+2LBh+uKLL3T8+HEdP35c27Zt07Bhw0q8/gULFsjPz8/xePjhhy/7moJTFWvVqqXg4GDVrFlTkjR16lTNnj1bAwYMUEREhAYMGKCJEyfq1VdfdXr9hAkTHH3MmDHd7deEAQAAAHC/tWvXys/PT7m5ubLb7br77rv11FNPadeuXZKkzz//XL6+vvrqq6/0zDPPFJp8o0Dt2rXVt29fLVmyRIZhqG/fvgoMDHTqs3TpUt13332O5+vWrdMNN9wgSRo6dKieeOIJx7KAgIAybU9WVpaOHDmie+65R2PGjHG05+XlqXr16k59o6KiyrSOsiKEAQAAANBNN92kV155RTabTaGhoYVu9RQREaGAgAA1a9ZMJ06c0KBBg/TZZ58VWWv06NGOyffmz59faHm/fv0UHR3teF63bl3Hz9WrV3e6/VRZnTt3TpL0+uuvO61Lkjw8PJyeFzfrYkUhhAEAAABQ1apVSxx+xo0bp4SEBK1evVq33357oeW9e/dWTk6OLBZLkddY+fv7y9/fv9xjLlBwbVp+fr6jLSgoSKGhofrxxx81dOhQl63LFQhhAAAAgAmSfjxZadbj6+urMWPGaOrUqerfv78sFovTcg8PDyUlJTl+rmh16tRRlSpVtH79etWrV08+Pj6qXr26pk2bpoceekjVq1dX7969lZ2drd27d+v06dOKjY2t8HEVhxAGAAAAVKDAwED5+lbRsMdXm7ZO
X98qha7DcrWYmBjNmTNHK1eu1F133VVoebVq1Sp0/X/k6empl156SdOnT9eUKVN0ww03aMuWLbr33nvl6+ur559/XpMmTVLVqlXVpk0bTZgwwbSxFcVilOTuxygkMzNT1atX12+//WbqB6yy2Lt3rzp27Kg9y8e69KaFe5NS1XHwa9qzZ486dOjgsroAAACXc+HCBR09elQRERHy8fFxWpacnKyMjAzTxhIYGFjimyOjZC71/pY2G3AkDAAAAKhgDRo0IBTBgfuEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAuBDz3lVOrnxfCWEAAACAC3h5eUmSzp8/7+aRoCIUvK8F73N5MDsiAAAA4AIeHh4KCAjQiRMnJF28ofGfb2KMvx7DMHT+/HmdOHFCAQEBLrn5NCEMAAAAcJHg4GBJcgQxVB4BAQGO97e8CGEAAACAi1gsFoWEhKhOnTrKzc1193DgIl5eXi45AlaAEAYAAAC4mIeHh0u/tKNyYWIOAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEhDAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEhDAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEbg9h8+fPV3h4uHx8fBQdHa2dO3cW2/f777/XwIEDFR4eLovForlz5xbq89RTT8lisTg9mjdv7tTnwoULGjdunGrVqiU/Pz8NHDhQ6enprt40AAAAACjErSFsxYoVio2N1dSpU7V37161a9dOvXr10okTJ4rsf/78eTVs2FAzZ85UcHBwsXVbtWql1NRUx+OLL75wWj5x4kT997//1cqVK7V161alpKRowIABLt02AAAAACiKW0PYnDlzNGbMGI0aNUotW7bUwoUL5evrq0WLFhXZ/5prrtHzzz+vwYMHy9vbu9i6np6eCg4OdjwCAwMdy3777Te98cYbmjNnjrp3766OHTtq8eLF+vLLL/XVV1+5fBsBAAAA4I883bXinJwc7dmzR5MnT3a0Wa1W9ezZU9u3by9X7UOHDik0NFQ+Pj7q3LmzEhIS1KBBA0nSnj17lJubq549ezr6N2/eXA0aNND27dt17bXXFlkzOztb2dnZjueZmZmSpLy8POXl5ZVrvFcju90um80mu6zKs1tcV1fWi3Xtdt4XAAAAmKK03zvdFsIyMjKUn5+voKAgp/agoCDt37+/zHWjo6O1ZMkSNWvWTKmpqZo2bZpuuOEGfffdd/L391daWppsNpsCAgIKrTctLa3YugkJCZo2bVqh9t27d6tq1aplHu/V6uzZs4qLi1OGV6h2pNtcV9ervuLi6iojI0M7duxwWV0AAACgOFlZWaXq77YQVlH69Onj+Llt27aKjo5WWFiY3n33Xd1zzz1lrjt58mTFxsY6nmdmZqp+/fqKiopStWrVyjXmq1FiYqLi4+O17d+jFRlU/PV9pa57IE3x8Yu0bds2RUZGuqwuAAAAUJyCs+RKym0hLDAwUB4eHoVmJUxPT7/kpBulFRAQoKZNm+rw4cOSpODgYOXk5OjMmTNOR8Mut15vb+8ir0Pz9PSUp2ely7IVzmq1KicnR1bZ5Wk1XFdX9ot1rVbeFwAAAJiitN873TYxh81mU8eOHbVx40ZHm91u18aNG9W5c2eXrefcuXM6cuSIQkJCJEkdO3aUl5eX03oPHDig5ORkl64XAAAAAIri1kMFsbGxGjFihKKiotSpUyfNnTtXWVlZGjVqlCRp+PDhqlu3rhISEiRdnMzjhx9+cPz8yy+/KDEx
UX5+fmrcuLEk6ZFHHtGtt96qsLAwpaSkaOrUqfLw8NCQIUMkSdWrV9c999yj2NhY1axZU9WqVdODDz6ozp07FzspBwAAAAC4iltD2KBBg3Ty5ElNmTJFaWlpioyM1Pr16x2TdSQnJ8tq/d/BupSUFLVv397xfNasWZo1a5a6du2qLVu2SJJ+/vlnDRkyRKdOnVLt2rV1/fXX66uvvlLt2rUdr3vhhRdktVo1cOBAZWdnq1evXlqwYIE5Gw0AAADgqmYxDMN1F+RcRTIzM1W9enX99ttvTMxRBnv37lXHjh21Z/lYdWgR4rq6SanqOPg17dmzRx06dHBZXQAAAKA4pc0Gbr1ZMwAAAABcbQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIk83T0AXPmSk5OVkZHh0ppJSUmSpJMnM/RbqK+qV6/u0voAAADAlYoQhktKTk5W8xYt9Pv58xVSf9WqVdr9hadixsUQxAAAAHBVIIThkjIyMvT7+fN6YNZrCm3U1GV1U44c1IJHxqpl5xt0ev/nOn/+PCEMAAAAVwVCGEoktFFTRbSKdHndqv7VddrlVQEAAIArFxNzAAAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAidwewubPn6/w8HD5+PgoOjpaO3fuLLbv999/r4EDByo8PFwWi0Vz584t1CchIUHXXHON/P39VadOHfXv318HDhxw6tOtWzdZLBanxz/+8Q9XbxoAAAAAFOLWELZixQrFxsZq6tSp2rt3r9q1a6devXrpxIkTRfY/f/68GjZsqJkzZyo4OLjIPlu3btW4ceP01VdfacOGDcrNzdXNN9+srKwsp35jxoxRamqq4/Hcc8+5fPsAAAAA4M883bnyOXPmaMyYMRo1apQkaeHChfrwww+1aNEiPfbYY4X6X3PNNbrmmmskqcjlkrR+/Xqn50uWLFGdOnW0Z88e3XjjjY52X1/fYoMcAAAAAFQUt4WwnJwc7dmzR5MnT3a0Wa1W9ezZU9u3b3fZen777TdJUs2aNZ3aly5dqrffflvBwcG69dZbFRcXJ19f32LrZGdnKzs72/E8MzNTkpSXl6e8vDyXjfdKY7fbZbPZZJEh2fNdVtciQzabTbJ4yOJhU75hVZ7dUu66dllls9lkt9sr9fsCAACAK0dpv3e6LYRlZGQoPz9fQUFBTu1BQUHav3+/S9Zht9s1YcIEdenSRa1bt3a033333QoLC1NoaKj27dunRx99VAcOHNCqVauKrZWQkKBp06YVat+9e7eqVq3qkvFeic6ePau4uDiFKUs+yd+6rG6YshQXF6fq9WvJJyBKB3NDdTzdVu66Z73qKy6urjIyMrRjxw4XjBQAAAC4tD9f+nQ5bj0dsaKNGzdO3333nb744gun9rFjxzp+btOmjUJCQtSjRw8dOXJEjRo1KrLW5MmTFRsb63iemZmp+vXrKyoqStWqVauYDbgCJCYmKj4+XlNXfKzwBm1cVvd40j7Fx8dr/vQ++mXPOnUbPVrBQeU/PTTxQJri4xdp27ZtioyMLP9AAQAAgMsoOEuupNwWwgIDA+Xh4aH09HSn9vT0dJdcqxUTE6O1a9fqs88+U7169S7Z
Nzo6WpJ0+PDhYkOYt7e3vL29C7V7enrK07PyZlmr1aqcnBwZskhWD5fVNWRRTk6OZOTLyM+Rh8UuT6tR7rpW2ZWTkyOr1Vqp3xcAAABcOUr7vdNtsyPabDZ17NhRGzdudLTZ7XZt3LhRnTt3LnNdwzAUExOj1atXa9OmTYqIiLjsaxITEyVJISEhZV4vAAAAAJSEWw8VxMbGasSIEYqKilKnTp00d+5cZWVlOWZLHD58uOrWrauEhARJFyfz+OGHHxw///LLL0pMTJSfn58aN24s6eIpiMuWLdP7778vf39/paWlSZKqV6+uKlWq6MiRI1q2bJluueUW1apVS/v27dPEiRN14403qm3btm7YCwAAAACuJm4NYYMGDdLJkyc1ZcoUpaWlKTIyUuvXr3dM1pGcnCyr9X8H61JSUtS+fXvH81mzZmnWrFnq2rWrtmzZIkl65ZVXJF28IfMfLV68WCNHjpTNZtOnn37qCHz169fXwIED9eSTT1bsxgIAAACAroCJOWJiYhQTE1PksoJgVSA8PFyGcenrhi63vH79+tq6dWupxggAAAAAruK2a8IAAAAA4GpECAMAAAAAExHCAAAAAMBEhDAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEhDAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEhDAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAE5UphP3444+uHgcAAAAAXBXKFMIaN26sm266SW+//bYuXLjg6jEBAAAAQKVVphC2d+9etW3bVrGxsQoODtZ9992nnTt3unpsAAAAAFDplCmERUZG6sUXX1RKSooWLVqk1NRUXX/99WrdurXmzJmjkydPunqcAAAAAFAplGtiDk9PTw0YMEArV67Us88+q8OHD+uRRx5R/fr1NXz4cKWmprpqnAAAAABQKZQrhO3evVsPPPCAQkJCNGfOHD3yyCM6cuSINmzYoJSUFN12222uGicAAAAAVAqeZXnRnDlztHjxYh04cEC33HKL/v3vf+uWW26R1Xox00VERGjJkiUKDw935VgBAAAA4C+vTCHslVde0ejRozVy5EiFhIQU2adOnTp64403yjU4AAAAAKhsyhTCDh06dNk+NptNI0aMKEt5AAAAAKi0ynRN2OLFi7Vy5cpC7StXrtSbb75Z7kEBAAAAQGVVphCWkJCgwMDAQu116tTRM888U+5BAQAAAEBlVaYQlpycrIiIiELtYWFhSk5OLvegAAAAAKCyKlMIq1Onjvbt21eo/ZtvvlGtWrXKPSgAAAAAqKzKFMKGDBmihx56SJs3b1Z+fr7y8/O1adMmjR8/XoMHDy5Vrfnz5ys8PFw+Pj6Kjo7Wzp07i+37/fffa+DAgQoPD5fFYtHcuXPLVPPChQsaN26catWqJT8/Pw0cOFDp6emlGjcAAAAAlEWZQlh8fLyio6PVo0cPValSRVWqVNHNN9+s7t27l+qasBUrVig2NlZTp07V3r171a5dO/Xq1UsnTpwosv/58+fVsGFDzZw5U8HBwWWuOXHiRP33v//VypUrtXXrVqWkpGjAgAGl2wkAAAAAUAZlCmE2m00rVqzQ/v37tXTpUq1atUpHjhzRokWLZLPZSlxnzpw5GjNmjEaNGqWWLVtq4cKF8vX11aJFi4rsf8011+j555/X4MGD5e3tXaaav/32m9544w3NmTNH3bt3V8eOHbV48WJ9+eWX+uqrr0q/MwAAAACgFMp0n7AC
TZs2VdOmTcv02pycHO3Zs0eTJ092tFmtVvXs2VPbt2+vsJp79uxRbm6uevbs6ejTvHlzNWjQQNu3b9e1115bZO3s7GxlZ2c7nmdmZkqS8vLylJeXV6bx/hXY7XbZbDZZZEj2fJfVtci4GNgtHrJ42JRvWJVnt5S7rl1W2Ww22e32Sv2+AAAA4MpR2u+dZQph+fn5WrJkiTZu3KgTJ07Ibrc7Ld+0adNla2RkZCg/P19BQUFO7UFBQdq/f39ZhlWimmlpabLZbAoICCjUJy0trdjaCQkJmjZtWqH23bt3q2rVqmUa71/B2bNnFRcXpzBlySf5W5fVDVOW4uLiVL1+LfkEROlgbqiOp5f8KGpxznrVV1xcXWVkZGjHjh0uGCkAAABwaVlZWaXqX6YQNn78eC1ZskR9+/ZV69atZbGU/wjGlW7y5MmKjY11PM/MzFT9+vUVFRWlatWquXFkFSsxMVHx8fGauuJjhTdo47K6x5P2KT4+XvOn99Eve9ap2+jRCg4q+jq/0kg8kKb4+EXatm2bIiMjyz9QAAAA4DIKzpIrqTKFsOXLl+vdd9/VLbfcUpaXS5ICAwPl4eFRaFbC9PT0YifdcEXN4OBg5eTk6MyZM05Hwy63Xm9v7yKvQ/P09JSnZ7nO6ryiWa1W5eTkyJBFsnq4rK4hi3JyciQjX0Z+jjwsdnlajXLXtcqunJwcWa3WSv2+AAAA4MpR2u+dZZ6Yo3HjxmV5qVONjh07auPGjY42u92ujRs3qnPnzhVWs2PHjvLy8nLqc+DAASUnJ5d5vQAAAABQUmU6VPDwww/rxRdf1Lx588p1KmJsbKxGjBihqKgoderUSXPnzlVWVpZGjRolSRo+fLjq1q2rhIQESRcn3vjhhx8cP//yyy9KTEyUn5+fIxRermb16tV1zz33KDY2VjVr1lS1atX04IMPqnPnzsVOygEAAAAArlKmEPbFF19o8+bNWrdunVq1aiUvLy+n5atWrSpRnUGDBunkyZOaMmWK0tLSFBkZqfXr1zsm1khOTpbV+r+DdSkpKWrfvr3j+axZszRr1ix17dpVW7ZsKVFNSXrhhRdktVo1cOBAZWdnq1evXlqwYEFZdgUAAAAAlEqZQlhAQIBuv/12lwwgJiZGMTExRS4rCFYFwsPDZRiXv27oUjUlycfHR/Pnz9f8+fNLNVYAAAAAKK8yhbDFixe7ehwAAAAAcFUo08Qc0sUbkn366ad69dVXdfbsWUkXTxc8d+6cywYHAAAAAJVNmY6EHT9+XL1791ZycrKys7P1t7/9Tf7+/nr22WeVnZ2thQsXunqcAAAAAFAplOlI2Pjx4xUVFaXTp0+rSpUqjvbbb7/daep3AAAAAICzMh0J+/zzz/Xll1/KZrM5tYeHh+uXX35xycAAAAAAoDIq05Ewu92u/Pz8Qu0///yz/P39yz0oAAAAAKisyhTCbr75Zs2dO9fx3GKx6Ny5c5o6dapuueUWV40NAAAAACqdMp2OOHv2bPXq1UstW7bUhQsXdPfdd+vQoUMKDAzUO++84+oxAgAAAEClUaYQVq9ePX3zzTdavny59u3bp3Pnzumee+7R0KFDnSbqAAAAAAA4K1MIkyRPT08NGzbMlWMBAAAAgEqvTCHs3//+9yWXDx8+vEyDAQAAAIDKrkwhbPz48U7Pc3Nzdf78edlsNvn6+hLCAAAAAKAYZZod8fTp006Pc+fO6cCBA7r++uuZmAMAAAAALqHM14T9WZMmTTRz5kwNGzZM+/fvd1VZlFBycrIyMjJcXjcpKcnlNQEAAICrmctCmHRxso6UlBRXlkQJJCcnq3mLFvr9/PkKW0dOdk6F1QYAAACuJmUKYR988IHTc8MwlJqaqnnz5qlLly4uGRhKLiMjQ7+fP68HZr2m0EZNXVr7m60btHLuDOXl5bm0LgAAAHC1KlMI69+/v9Nzi8Wi2rVrq3v37po9e7YrxoUyCG3UVBGtIl1aM+XIQZfWAwAAAK52ZQph
drvd1eMAAAAAgKtCmWZHBAAAAACUTZmOhMXGxpa475w5c8qyCgAAAAColMoUwr7++mt9/fXXys3NVbNmzSRJBw8elIeHhzp06ODoZ7FYXDNKAAAAAKgkyhTCbr31Vvn7++vNN99UjRo1JF28gfOoUaN0ww036OGHH3bpIAEAAACgsijTNWGzZ89WQkKCI4BJUo0aNfT0008zOyIAAAAAXEKZQlhmZqZOnjxZqP3kyZM6e/ZsuQcFAAAAAJVVmULY7bffrlGjRmnVqlX6+eef9fPPP+s///mP7rnnHg0YMMDVYwQAAACASqNM14QtXLhQjzzyiO6++27l5uZeLOTpqXvuuUfPP/+8SwcIAAAAAJVJmUKYr6+vFixYoOeff15HjhyRJDVq1EhVq1Z16eAAAAAAoLIp182aU1NTlZqaqiZNmqhq1aoyDMNV4wIAAACASqlMIezUqVPq0aOHmjZtqltuuUWpqamSpHvuuYfp6QEAAADgEsoUwiZOnCgvLy8lJyfL19fX0T5o0CCtX7/eZYMDAAAAgMqmTNeEffLJJ/r4449Vr149p/YmTZro+PHjLhkYAAAAAFRGZQphWVlZTkfACvz666/y9vYu96Bw5Th7+pQkKe3oYflWLfyel1XKkYMuqwUAAAD8lZQphN1www3697//rfj4eEmSxWKR3W7Xc889p5tuusmlA4T7ZKT8pOXPT5EkLYp7qELWkX3h9wqpCwAAAFypyhTCnnvuOfXo0UO7d+9WTk6O/vnPf+r777/Xr7/+qm3btrl6jHCTs6dPKTcnV7dHSTff1ls16gS5rPYXO37UvEWfKy83x2U1AQAAgL+CMoWw1q1b6+DBg5o3b578/f117tw5DRgwQOPGjVNISIirxwg3q+0vNYmoqTp1g11W82jyKZfVAgAAAP5KSh3CcnNz1bt3by1cuFBPPPFERYwJAAAAACqtUk9R7+XlpX379lXEWAAAAACg0ivTfcKGDRumN954w9VjAQAAAIBKr0zXhOXl5WnRokX69NNP1bFjR1WtWtVp+Zw5c1wyOAAAAACobEoVwn788UeFh4fru+++U4cOHSRJBw863+/JYrG4bnQAAAAAUMmUKoQ1adJEqamp2rx5syRp0KBBeumllxQU5LqpywEAAACgMivVNWGGYTg9X7dunbKyslw6IAAAAACozMo0MUeBP4cyAAAAAMCllSqEWSyWQtd8cQ0YAAAAAJRcqa4JMwxDI0eOlLe3tyTpwoUL+sc//lFodsRVq1a5boQAAAAAUImUKoSNGDHC6fmwYcNcOhgAAAAAqOxKFcIWL15cUePAVe5kxknX1DmZIUlKSkqSJAUGBqpBgwYuqQ0AAAC4Qplu1gy4Sm72BVks0qpVq11SL/X0xT8LjtJW8fXV/qQkghgAAACuGIQwuFV+bp4MQ2rfvbdq1C7//eYO/nhKr21eqwdmvSZJWvDIWGVkZBDCAAAAcMUghOGK4BdQUzXqBJe7jv+Zi3+GNmpa7loAAABARSjXfcIAAAAAAKVzRYSw+fPnKzw8XD4+PoqOjtbOnTsv2X/lypVq3ry5fHx81KZNG3300UdOywvuZ/bnx/PPP+/oEx4eXmj5zJkzK2T7AAAAAKCA20PYihUrFBsbq6lTp2rv3r1q166devXqpRMnThTZ/8svv9SQIUN0zz336Ouvv1b//v3Vv39/fffdd44+qampTo9FixbJYrFo4MCBTrWmT5/u1O/BBx+s0G0FAAAAALeHsDlz5mjMmDEaNWqUWrZsqYULF8rX11eLFi0qsv+LL76o3r17a9KkSWrRooXi4+PVoUMHzZs3z9EnODjY6fH+++/rpptuUsOGDZ1q+fv7O/X7802nAQAAAMDV3DoxR05Ojvbs2aPJkyc72qxWq3r27Knt27cX+Zrt27crNjbWqa1Xr15as2ZNkf3T09P14Ycf6s033yy0bObMmYqPj1eDBg109913a+LEifL0LHqXZGdnKzs72/E8MzNT
kpSXl6e8vLxLbmdFs9vtstlsssiQ7Pkuq2uRIZvNJqunZMhDdsN1md1i9bg4ZqunLB42F9b3+N++kGSz2WS3293+HgEAAKDyKu13TbeGsIyMDOXn5ysoyHlq8qCgIO3fv7/I16SlpRXZPy0trcj+b775pvz9/TVgwACn9oceekgdOnRQzZo19eWXX2ry5MlKTU3VnDlziqyTkJCgadOmFWrfvXu324+gnT17VnFxcQpTlnySv3VZ3TBlKS4uTqEB0gWfIKVlebusdv0m9RUXF6V69fxkb97PZfW9ajRUXFyEwpQlSYqLi1NGRoZ27NhR7toAAABAUbKyskrVv9JPUb9o0SINHTpUPj4+Tu1/PJrWtm1b2Ww23XfffUpISJC3d+EwMHnyZKfXZGZmqn79+oqKilK1atUqbgNKIDExUfHx8Zq64mOFN2jjsrrHk/YpPj5eo7tKA0cMUu2a9VxWO3HnD4p/dp3iH2in33/5Rl1uc039/Snpio9/W1NXfCxJio+P17Zt2xQZGVnu2gAAAEBRCs6SKym3hrDAwEB5eHgoPT3dqT09PV3BwUXfMyo4OLjE/T///HMdOHBAK1asuOxYoqOjlZeXp2PHjqlZs2aFlnt7excZzjw9PYs9hdEsVqtVOTk5MmSRrB4uq2vIopycHNnzJIvyZbXYXVfbnn9xzPY8Gfk5Lqyf/799oYunvFqtVre/RwAAAKi8Svtd060Tc9hsNnXs2FEbN250tNntdm3cuFGdO3cu8jWdO3d26i9JGzZsKLL/G2+8oY4dO6pdu3aXHUtiYqKsVqvq1KlTyq0AAAAAgJJz++GB2NhYjRgxQlFRUerUqZPmzp2rrKwsjRo1SpI0fPhw1a1bVwkJCZKk8ePHq2vXrpo9e7b69u2r5cuXa/fu3Xrttdec6mZmZmrlypWaPXt2oXVu375dO3bs0E033SR/f39t375dEydO1LBhw1SjRo2K32gAAAAAVy23h7BBgwbp5MmTmjJlitLS0hQZGan169c7Jt9ITk6W1fq/A3bXXXedli1bpieffFKPP/64mjRpojVr1qh169ZOdZcvXy7DMDRkyJBC6/T29tby5cv11FNPKTs7WxEREZo4cWKhWRcBAAAAwNXcHsIkKSYmRjExMUUu27JlS6G2O++8U3feeecla44dO1Zjx44tclmHDh301VdflXqcAAAAAFBebr9ZMwAAAABcTQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIk93DwCoCClHDjp+TkpKcknNwMBANWjQwCW1AAAAcPUihKFSyfj1nCySFjwy1tE2bNgwl9T29a2ipKT9BDEAAACUCyEMlcrZc9kyJD0Zc4PC6lbXrk/WasCAAapdO7BcdZN+PKlhj69WRkYGIQwAAADlQghDpRRWt7qaNqyln2tIbZsEKiQkxN1DAgAAACQxMQcAAAAAmIoQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgoisihM2fP1/h4eHy8fFRdHS0du7cecn+K1euVPPmzeXj46M2bdroo48+clo+cuRIWSwWp0fv3r2d+vz6668aOnSoqlWrpoCAAN1zzz06d+6cy7cNAAAAAP7I7SFsxYoV
io2N1dSpU7V37161a9dOvXr10okTJ4rs/+WXX2rIkCG655579PXXX6t///7q37+/vvvuO6d+vXv3VmpqquPxzjvvOC0fOnSovv/+e23YsEFr167VZ599prFjx1bYdgIAAACAdAWEsDlz5mjMmDEaNWqUWrZsqYULF8rX11eLFi0qsv+LL76o3r17a9KkSWrRooXi4+PVoUMHzZs3z6mft7e3goODHY8aNWo4liUlJWn9+vX617/+pejoaF1//fV6+eWXtXz5cqWkpFTo9gIAAAC4unm6c+U5OTnas2ePJk+e7GizWq3q2bOntm/fXuRrtm/frtjYWKe2Xr16ac2aNU5tW7ZsUZ06dVSjRg11795dTz/9tGrVquWoERAQoKioKEf/nj17ymq1aseOHbr99tsLrTc7O1vZ2dmO55mZmZKkvLw85eXllW7DXcxut8tms8kiQ7Lnu6yuRYZsNpusnpIhD9kN12V2i9Xj4pitnrJ42FxWv6CuLB4y5CGLh035hlV5dku56tpllc1mk91ud/v7DQAAgCtLab8fujWEZWRkKD8/X0FBQU7tQUFB2r9/f5GvSUtLK7J/Wlqa43nv3r01YMAARURE6MiRI3r88cfVp08fbd++XR4eHkpLS1OdOnWcanh6eqpmzZpOdf4oISFB06ZNK9S+e/duVa1atUTbW1HOnj2ruLg4hSlLPsnfuqxumLIUFxen0ADpgk+Q0rK8XVa7fpP6iouLUr16frI37+ey+gV1q9evpXNeXmrUI0IHc0N1PN1WrrpnveorLq6uMjIytGPHjnKPEwAAAJVHVlZWqfq7NYRVlMGDBzt+btOmjdq2batGjRppy5Yt6tGjR5lqTp482ekIXGZmpurXr6+oqChVq1at3GMuj8TERMXHx2vqio8V3qCNy+oeT9qn+Ph4je4qDRwxSLVr1nNZ7cSdPyj+2XWKf6Cdfv/lG3W5zTX1C+rOn95HdSJqaffGt9Vt9GgFBwWXr+6BNMXHL9K2bdsUGRlZ7nECAACg8ig4S66k3BrCAgMD5eHhofT0dKf29PR0BQcX/aU5ODi4VP0lqWHDhgoMDNThw4fVo0cPBQcHF5r4Iy8vT7/++muxdby9veXtXfhIjaenpzw93ZtlrVarcnJyZMgiWT1cVteQRTk5ObLnSRbly2qxu662Pf/imO15MvJzXFa/oK6MfFmULyM/Rx4WuzytRrnqWmVXTk6OrFar299vAAAAXFlK+/3QrRNz2Gw2dezYURs3bnS02e12bdy4UZ07dy7yNZ07d3bqL0kbNmwotr8k/fzzzzp16pRCQkIcNc6cOaM9e/Y4+mzatEl2u13R0dHl2SQAAAAAuCS3z44YGxur119/XW+++aaSkpJ0//33KysrS6NGjZIkDR8+3GnijvHjx2v9+vWaPXu29u/fr6eeekq7d+9WTEyMJOncuXOaNGmSvvrqKx07dkwbN27UbbfdpsaNG6tXr16SpBYtWqh3794aM2aMdu7cqW3btikmJkaDBw9WaGio+TsBAAAAwFXD7edVDRo0SCdPntSUKVOUlpamyMhIrV+/3jH5RnJysqzW/2XF6667TsuWLdOTTz6pxx9/XE2aNNGaNWvUunVrSZKHh4f27dunN998U2fOnFFoaKhuvvlmxcfHO51OuHTpUsXExKhHjx6yWq0aOHCgXnrpJXM3HgAAAMBVx+0hTJJiYmIcR7L+bMuWLYXa7rzzTt15551F9q9SpYo+/vjjy66zZs2aWrZsWanGCQAAAADl5fbTEQEAAADgakIIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEhDAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARJ7uHgBQ0U5mnCx/jZMZkqSkpCSn9sDAQDVo0KDc9QEAAHD1IISh0rpwPksWi7Rq1epy10o9ffHPYcOGObVX8fXV/qQkghgAAABKjBCGSis3+4IM
Q2rfvbdq1A4qV62DP57Sa5vX6oFZrym0UVNJUsqRg1rwyFhlZGQQwgAAAFBihDBUen4BNVWjTnC5avifufhnaKOmimgVWe4xAQAA4OrFxBwAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiTzdPQDgryTlyMFCPyclJZWrZmBgoBo0aFCuGgAAAPjrIIQBJZDx6zlZJC14ZGyhZcOGDStXbV/fKkpK2k8QAwAAuEoQwoASOHsuW4akJ2NuUOs2DS+2/XpKuz5ZqwEDBqh27cAy1U368aSGPb5aGRkZhDAAAICrBCEMKIWwutXVsmmwJOn0CennGlLbJoEKCQlx88gAAADwV8HEHAAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJrogQNn/+fIWHh8vHx0fR0dHauXPnJfuvXLlSzZs3l4+Pj9q0aaOPPvrIsSw3N1ePPvqo2rRpo6pVqyo0NFTDhw9XSkqKU43w8HBZLBanx8yZMytk+wAAAACggNtD2IoVKxQbG6upU6dq7969ateunXr16qUTJ04U2f/LL7/UkCFDdM899+jrr79W//791b9/f3333XeSpPPnz2vv3r2Ki4vT3r17tWrVKh04cED9+vUrVGv69OlKTU11PB588MEK3VYAAAAAcHsImzNnjsaMGaNRo0apZcuWWrhwoXx9fbVo0aIi+7/44ovq3bu3Jk2apBYtWig+Pl4dOnTQvHnzJEnVq1fXhg0bdNddd6lZs2a69tprNW/ePO3Zs0fJyclOtfz9/RUcHOx4VK1atcK3FwAAAMDVzdOdK8/JydGePXs0efJkR5vValXPnj21ffv2Il+zfft2xcbGOrX16tVLa9asKXY9v/32mywWiwICApzaZ86cqfj4eDVo0EB33323Jk6cKE/PondJdna2srOzHc8zMzMlSXl5ecrLy7vUZlY4u90um80miwzJnu+yuhYZstlssnpKhjxkN1yX2S1Wj4tjtnrK4mFzWf2CurJ4yJBcVvuPdQtqGfKQxcOmfMOqPLulTHXtsspms8lut7v9cwQAAICyKe33OLeGsIyMDOXn5ysoKMipPSgoSPv37y/yNWlpaUX2T0tLK7L/hQsX9Oijj2rIkCGqVq2ao/2hhx5Shw4dVLNmTX355ZeaPHmyUlNTNWfOnCLrJCQkaNq0aYXad+/e7fYjaGfPnlVcXJzClCWf5G9dVjdMWYqLi1NogHTBJ0hpWd4uq12/SX3FxUWpXj0/2Zv3c1n9grrV69eS3VNq1CPKJbX/WDct6+L7nefVUI16ROhgbqiOp9vKVPesV33FxdVVRkaGduzYUa4xAgAAwD2ysrJK1d+tIayi5ebm6q677pJhGHrllVeclv3xaFrbtm1ls9l03333KSEhQd7ehb+wT5482ek1mZmZql+/vqKiopzCnTskJiYqPj5eU1d8rPAGbVxW93jSPsXHx2t0V2ngiEGqXbOey2on7vxB8c+uU/wD7fT7L9+oy22uqV9Qd/70PgoOkI5sXOeS2n+s27RTS0nSmZPp2r3xbXUbPVrBQcFlq3sgTfHxi7Rt2zZFRkaWa4wAAABwj4Kz5ErKrSEsMDBQHh4eSk9Pd2pPT09XcHDRX2qDg4NL1L8ggB0/flybNm26bFCKjo5WXl6ejh07pmbNmhVa7u3tXWQ48/T0LPYURrNYrVbl5OTIkEWyerisriGLcnJyZM+TLMqX1WJ3XW17/sUx2/Nk5Oe4rH5BXRn5skgu
q/3HugW1LMqXkZ8jD4tdnlajTHWtsisnJ0dWq9XtnyMAAACUTWm/x7l1Yg6bzaaOHTtq48aNjja73a6NGzeqc+fORb6mc+fOTv0lacOGDU79CwLYoUOH9Omnn6pWrVqXHUtiYqKsVqvq1KlTxq0BAAAAgMtz+3+9x8bGasSIEYqKilKnTp00d+5cZWVladSoUZKk4cOHq27dukpISJAkjR8/Xl27dtXs2bPVt29fLV++XLt379Zrr70m6WIAu+OOO7R3716tXbtW+fn5juvFatasKZvNpu3bt2vHjh266aab5O/vr+3bt2vixIkaNmyYatSo4Z4dAQAAAOCq4PYQNmjQIJ08eVJTpkxRWlqaIiMjtX79esfkG8nJybJa/3fA7rrrrtOyZcv05JNP6vHHH1eTJk20Zs0atW7dWpL0yy+/6IMPPpCkQtfYbN68Wd26dZO3t7eWL1+up556StnZ2YqIiNDEiRMLzboImCUpKcnlNQMDA9WgQQOX1wUAAED5uD2ESVJMTIxiYmKKXLZly5ZCbXfeeafuvPPOIvuHh4fLMC59fU6HDh301VdflXqcgKulZpyTRdKwYcNcXtvXt4qSkvYTxAAAAK4wV0QIg2ukHDl4RddDYWfOXpAhad4/b1LnDk1cVjfpx5Ma9vhqZWRkEMIAAACuMISwSiA1NVUWSQseGevuoaCMGtevoQ4tQtw9DAAAAJiAEFYJnDlzRoakJ2NuUOs2DV1W94sdP2reos9dVg8AAAAAIaxSCatbXS2blu2mwUU5mnzKZbUAAAAAXOTW+4QBAAAAwNWGI2FAOZ3MOFnm154+fVqSdObMaaWmpjot8/X1VfXq1cs1NgAAAFx5CGFAGV04nyWLRVq1anWZa3ybfPHPTZs2K+nrzU7LvLw8FTMuhiAGAABQyRDCgDLKzb4gw5Dad++tGrWDylQj//MftWr352rZ+Qa1b/e/SVXOnj6lXZ+s1fnz5wlhAAAAlQwhDCgnv4CaqlGnbBOiVK1+cfKTqv7Vy1wDAAAAfy1MzAEAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoQBAAAAgIkIYQAAAABgIkIYAAAAAJiIEAYAAAAAJiKEAQAAAICJCGEAAAAAYCJCGAAAAACYiBAGAAAAACYihAEAAACAiQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAm8nT3AAAU72TGybK97mSGJCkpKanYPoGBgWrQoEGZ6gMAAKDsCGHAFejC+SxZLNKqVavL9PrU0xf/HDZsWLF9qvj6an9SEkEMAADAZIQw4AqUm31BhiG1795bNWoHlfr1B388pdc2r9UDs15TaKOmhZanHDmoBY+MVUZGBiEMAADAZIQw4ArmF1BTNeoEl/p1/mcu/hnaqKkiWkW6dEwAAAAoHybmAAAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAExECAMAAAAAExHCAAAAAMBEhDAAAAAAMBH3CQMqsZQjBy/ZnpSUVKa6gYGB3OQZAACgjAhhQCWU8es5WSQteGTsJfsNGzasTPV9fasoKWk/QQwAAKAMCGFAJXT2XLYMSU/G3KDWbRoWXv7rKe36ZK0GDBig2rUDS1U76ceTGvb4amVkZBDCAAAAyoAQBlRiYXWrq2XT4ELtp09IP9eQ2jYJVEhIiMvXm5ycrIyMDJfXLcDpkAAA4K+MEAbApZKTk9W8RQv9fv58ha2jiq+v9iclEcQAAMBfEiEMuIqdzDhZ+tecvHiEq7hJPZKSkvT7+fN6YNZrCm3UtFzjK0rKkYNa8MhYTocEAAB/WYQw4Cp04XyWLBZp1arVpX5t6umLf15uUo/AeuGKaBVZhtEBAABUboQw4CqUm31BhiG1795bNWoHleq1B388pdc2ry32SNc3Wzdo5dwZysvLc9Vwi1TW6fUvh+vNAABARSOEAVcxv4Ca
qlGn8MQdl+J/5uKfoY2aFnmkq7h7k7nKmZPpslgsZZ5e/3K43gwAAFQ0QhiAMikubJ38+bgkKe3oYflW9S1VTf8atRQYWv+Sfc5n/ibDMDQq/iU1at22VPUvh+vNABSoyFleOeIO4IoIYfPnz9fzzz+vtLQ0tWvXTi+//LI6depUbP+VK1cqLi5Ox44dU5MmTfTss8/qlltucSw3DENTp07V66+/rjNnzqhLly565ZVX1KRJE0efX3/9VQ8++KD++9//ymq1auDAgXrxxRfl5+dXodsK/NWV9EbQi+IeKnVt7yo+em7drssGMUkKiWjMNWcAKkRFz/LKEXcAbg9hK1asUGxsrBYuXKjo6GjNnTtXvXr10oEDB1SnTp1C/b/88ksNGTJECQkJ+vvf/65ly5apf//+2rt3r1q3bi1Jeu655/TSSy/pzTffVEREhOLi4tSrVy/98MMP8vHxkSQNHTpUqamp2rBhg3JzczVq1CiNHTtWy5YtM3X7gb+ay90IOu34j/ph++cXrzerU/LrzY4eP6XJz6zV2dOnShTCSiMj5SedPX3qsv0Kju599NFHJbrmLCAgoNT3WcvOzpa3t/dl+6WmpurMmTOlqp2bmysvL69L9snJyZHNZitV3QKX2t4r/X/2izqqUZZ9XJQ/7xdX74uKPiIjiSM+f5KRkVFhs7z+1Y+4cx/I4lXkvinp7w6pbP+2leT3WVnfGz4zRXN7CJszZ47GjBmjUaNGSZIWLlyoDz/8UIsWLdJjjz1WqP+LL76o3r17a9KkSZKk+Ph4bdiwQfPmzdPChQtlGIbmzp2rJ598Urfddpsk6d///reCgoK0Zs0aDR48WElJSVq/fr127dqlqKgoSdLLL7+sW265RbNmzVJoaKhJWw/8dRV3I2g/45RO75eaRNRUnbqlu96sImSk/KR/9rlG2b9fKPFr4uLiStTPIsko5XgsFqsMw14htUu0/gqq6+tbRUlJ+6/IX4TFHdX4K+yLij4i4+3jI6tF+r0Ufz9K40r+XJREcde+Xq24D2TxKnrflPR3h3Rl/dvGZ6Z4bg1hOTk52rNnjyZPnuxos1qt6tmzp7Zv317ka7Zv367Y2Fintl69emnNmjWSpKNHjyotLU09e/Z0LK9evbqio6O1fft2DR48WNu3b1dAQIAjgElSz549ZbVatWPHDt1+++2F1pudna3s7GzH899++03SxdMaK3oWuMs5f/68vLy89G1Sin7PLtlf0JI4cCBNXl5eSj8r7dx7VH6Hz7i89qHjZ2RkermsfkHdb5NSlFJV+um0a2r/sW7BPj6dnlbu+kXVdUXt4uq6on5F1f4l9ay8vLz09acfKfmHb4rt9+O+vfLy8tI3Wz9RxvFDl617KuUX2fPyNbR/pGrXrHLJvmdP/6q04z+qdr368q5S9dJ1z1zQ+5uOqUu/u1StVu3LjkOSfvlxv/Zt3ahOve9QrZDi/7Mn89RJbfvgXd3WPVy1AnxKVPv82d/0a1rqJcd+JDlTW3en6JbrQxVcp1qJ6hbIyf5dJ5KPq03bNvLzdT5tOzXjnF5b/a1ef/31Mv8StFqtstsLf54sFosMo3xfJ5KTk5WXm6tuA+9WtVoXz7Aoyz4uyp/3iyv2RXFjrx4YVO598UeZp05o2wcrZZc07q4OCgl07en4rt4Xf2SxWGSxWIr8zJS3rmEYSk5OLtG/RyWv+78v0KdSfpGXl5dWr16tPXv2uKB24b8jxf19Km/dov4uuaJugYLPZFk+MyX5t6Ks+6Uktcuyb0r671tJf3dIZfu37VL/vhco+Pv88ccfO13eczmHDh1SXm6u+o0dr5pBdUv8upL6Nf0XrVu8QD/++KPbLynKzMyUpJL/O2240S+//GJIMr788kun9kmTJhmdOnUq8jVeXl7GsmXLnNrmz59v1KlTxzAMw9i2bZshyUhJSXHqc+eddxp33XWXYRiGMWPGDKNp06aFateuXdtYsGBBkeudOnWqoYv/scCD
Bw8ePHjw4MGDBw8ehR4//fRTiXKQ209H/KuYPHmy0xE4u92uX3/9VbVq1ZLFYqmw9WZmZqp+/fr66aefVK1a6f7nGqXH/jYf+9x87HPzsc/Nxf42H/vcfOxzc11ufxuGobNnz5b4sia3hrDAwEB5eHgoPT3dqT09PV3BwUVfSxIcHHzJ/gV/pqenO11gmJ6ersjISEefEydOONXIy8vTr7/+Wux6vb29C10QGRAQcOkNdKFq1arxF8xE7G/zsc/Nxz43H/vcXOxv87HPzcc+N9el9nf16tVLXMfqqgGVhc1mU8eOHbVx40ZHm91u18aNG9W5c+ciX9O5c2en/pK0YcMGR/+IiAgFBwc79cnMzNSOHTscfTp37qwzZ844nYu9adMm2e12RUdHu2z7AAAAAODP3H46YmxsrEaMGKGoqCh16tRJc+fOVVZWlmO2xOHDh6tu3bpKSEiQJI0fP15du3bV7Nmz1bdvXy1fvly7d+/Wa6+9JuniRY4TJkzQ008/rSZNmjimqA8NDVX//v0lSS1atFDv3r01ZswYLVy4ULm5uYqJidHgwYOZGREAAABAhXJ7CBs0aJBOnjypKVOmKC0tTZGRkVq/fr2Cgi7eXyg5OVlW6/8O2F133XVatmyZnnzyST3++ONq0qSJ1qxZ47hHmCT985//VFZWlsaOHaszZ87o+uuv1/r16x33CJOkpUuXKiYmRj169HDcrPmll14yb8NLyNvbW1OnTi3xvSFQPuxv87HPzcc+Nx/73Fzsb/Oxz83HPjeXq/e3xTBcON8tAAAAAOCS3HpNGAAAAABcbQhhAAAAAGAiQhgAAAAAmIgQBgAAAAAmIoRdwebPn6/w8HD5+PgoOjpaO3fudPeQKo3PPvtMt956q0JDQ2WxWLRmzRqn5YZhaMqUKQoJCVGVKlXUs2dPHTp0yD2DrQQSEhJ0zTXXyN/fX3Xq1FH//v114MABpz4XLlzQuHHjVKtWLfn5+WngwIGFbsyOknvllVfUtm1bx00lO3furHXr1jmWs78r1syZMx23TCnAPne9p556ShaLxenRvHlzx3L2uev98ssvGjZsmGrVqqUqVaqoTZs22r17t2M5vz9dKzw8vNBn3GKxaNy4cZL4jFeE/Px8xcXFKSIiQlWqVFGjRo0UHx+vP85l6IrPOSHsCrVixQrFxsZq6tSp2rt3r9q1a6devXrpxIkT7h5apZCVlaV27dpp/vz5RS5/7rnn9NJLL2nhwoXasWOHqlatql69eunChQsmj7Ry2Lp1q8aNG6evvvpKGzZsUG5urm6++WZlZWU5+kycOFH//e9/tXLlSm3dulUpKSkaMGCAG0f911avXj3NnDlTe/bs0e7du9W9e3fddttt+v777yWxvyvSrl279Oqrr6pt27ZO7ezzitGqVSulpqY6Hl988YVjGfvctU6fPq0uXbrIy8tL69at0w8//KDZs2erRo0ajj78/nStXbt2OX2+N2zYIEm68847JfEZrwjPPvusXnnlFc2bN09JSUl69tln9dxzz+nll1929HHJ59zAFalTp07GuHHjHM/z8/ON0NBQIyEhwY2jqpwkGatXr3Y8t9vtRnBwsPH888872s6cOWN4e3sb77zzjhtGWPmcOHHCkGRs3brVMIyL+9fLy8tYuXKlo09SUpIhydi+fbu7hlnp1KhRw/jXv/7F/q5AZ8+eNZo0aWJs2LDB6Nq1qzF+/HjDMPiMV5SpU6ca7dq1K3IZ+9z1Hn30UeP6668vdjm/Pyve+PHjjUaNGhl2u53PeAXp27evMXr0aKe2AQMGGEOHDjUMw3Wfc46EXYFycnK0Z88e9ezZ09FmtVrVs2dPbd++3Y0juzocPXpUaWlpTvu/evXqio6OZv+7yG+//SZJqlmzpiRpz549ys3NddrnzZs3V4MGDdjnLpCfn6/ly5crKytLnTt3Zn9XoHHjxqlv375O+1biM16RDh06pNDQUDVs2FBDhw5VcnKyJPZ5Rfjg
gw8UFRWlO++8U3Xq1FH79u31+uuvO5bz+7Ni5eTk6O2339bo0aNlsVj4jFeQ6667Ths3btTBgwclSd98842++OIL9enTR5LrPueerh02XCEjI0P5+fkKCgpyag8KCtL+/fvdNKqrR1pamiQVuf8LlqHs7Ha7JkyYoC5duqh169aSLu5zm82mgIAAp77s8/L59ttv1blzZ124cEF+fn5avXq1WrZsqcTERPZ3BVi+fLn27t2rXbt2FVrGZ7xiREdHa8mSJWrWrJlSU1M1bdo03XDDDfruu+/Y5xXgxx9/1CuvvKLY2Fg9/vjj2rVrlx566CHZbDaNGDGC358VbM2aNTpz5oxGjhwpiX9XKspjjz2mzMxMNW/eXB4eHsrPz9eMGTM0dOhQSa77nkgIA2CqcePG6bvvvnO6bgMVo1mzZkpMTNRvv/2m9957TyNGjNDWrVvdPaxK6aefftL48eO1YcMG+fj4uHs4V42C/5mWpLZt2yo6OlphYWF69913VaVKFTeOrHKy2+2KiorSM888I0lq3769vvvuOy1cuFAjRoxw8+gqvzfeeEN9+vRRaGiou4dSqb377rtaunSpli1bplatWikxMVETJkxQaGioSz/nnI54BQoMDJSHh0eh2W3S09MVHBzsplFdPQr2Mfvf9WJiYrR27Vpt3rxZ9erVc7QHBwcrJydHZ86ccerPPi8fm82mxo0bq2PHjkpISFC7du304osvsr8rwJ49e3TixAl16NBBnp6e8vT01NatW/XSSy/J09NTQUFB7HMTBAQEqGnTpjp8+DCf8woQEhKili1bOrW1aNHCcQoovz8rzvHjx/Xpp5/q3nvvdbTxGa8YkyZN0mOPPabBgwerTZs2+r//+z9NnDhRCQkJklz3OSeEXYFsNps6duyojRs3Otrsdrs2btyozp07u3FkV4eIiAgFBwc77f/MzEzt2LGD/V9GhmEoJiZGq1ev1qZNmxQREeG0vGPHjvLy8nLa5wcOHFBycjL73IXsdruys7PZ3xWgR48e+vbbb5WYmOh4REVFaejQoY6f2ecV79y5czpy5IhCQkL4nFeALl26FLq9yMGDBxUWFiaJ358VafHixapTp4769u3raOMzXjHOnz8vq9U5Inl4eMhut0ty4efcJdOIwOWWL19ueHt7G0uWLDF++OEHY+zYsUZAQICRlpbm7qFVCmfPnjW+/vpr4+uvvzYkGXPmzDG+/vpr4/jx44ZhGMbMmTONgIAA4/333zf27dtn3HbbbUZERITx+++/u3nkf03333+/Ub16dWPLli1Gamqq43H+/HlHn3/84x9GgwYNjE2bNhm7d+82OnfubHTu3NmNo/5re+yxx4ytW7caR48eNfbt22c89thjhsViMT755BPDMNjfZvjj7IiGwT6vCA8//LCxZcsW4+jRo8a2bduMnj17GoGBgcaJEycMw2Cfu9rOnTsNT09PY8aMGcahQ4eMpUuXGr6+vsbbb7/t6MPvT9fLz883GjRoYDz66KOFlvEZd70RI0YYdevWNdauXWscPXrUWLVqlREYGGj885//dPRxxeecEHYFe/nll40GDRoYNpvN6NSpk/HVV1+5e0iVxubNmw1JhR4jRowwDOPi9KNxcXFGUFCQ4e3tbfTo0cM4cOCAewf9F1bUvpZkLF682NHn999/Nx544AGjRo0ahq+vr3H77bcbqamp7hv0X9zo0aONsLAww2azGbVr1zZ69OjhCGCGwf42w59DGPvc9QYNGmSEhIQYNpvNqFu3rjFo0CDj8OHDjuXsc9f773//a7Ru3drw9vY2mjdvbrz22mtOy/n96Xoff/yxIanI/chn3PUyMzON8ePHGw0aNDB8fHyMhg0bGk888YSRnZ3t6OOKz7nFMP5w+2cAAAAAQIXimjAAAAAAMBEhDAAAAABMRAgDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMABApXXs2DFZ
LBYlJia6eygO+/fv17XXXisfHx9FRka6ezgAADcghAEAKszIkSNlsVg0c+ZMp/Y1a9bIYrG4aVTuNXXqVFWtWlUHDhzQxo0bi+xz8uRJ3X///WrQoIG8vb0VHBysXr16adu2bY4+FotFa9asMWnUAABXIoQBACqUj4+Pnn32WZ0+fdrdQ3GZnJycMr/2yJEjuv766xUWFqZatWoV2WfgwIH6+uuv9eabb+rgwYP64IMP1K1bN506darM6wUAXDkIYQCACtWzZ08FBwcrISGh2D5PPfVUoVPz5s6dq/DwcMfzkSNHqn///nrmmWcUFBSkgIAATZ8+XXl5eZo0aZJq1qypevXqafHixYXq79+/X9ddd518fHzUunVrbd261Wn5d999pz59+sjPz09BQUH6v//7P2VkZDiWd+vWTTExMZowYYICAwPVq1evIrfDbrdr+vTpqlevnry9vRUZGan169c7llssFu3Zs0fTp0+XxWLRU089VajGmTNn9Pnnn+vZZ5/VTTfdpLCwMHXq1EmTJ09Wv379JMmxX26//XZZLBan/fT++++rQ4cO8vHxUcOGDTVt2jTl5eU5jeGVV15Rnz59VKVKFTVs2FDvvfeeY3lOTo5iYmIUEhIiHx8fhYWFXfK9AwCUHiEMAFChPDw89Mwzz+jll1/Wzz//XK5amzZtUkpKij777DPNmTNHU6dO1d///nfVqFFDO3bs0D/+8Q/dd999hdYzadIkPfzww/r666/VuXNn3XrrrY6jSmfOnFH37t3Vvn177d69W+vXr1d6erruuusupxpvvvmmbDabtm3bpoULFxY5vhdffFGzZ8/WrFmztG/fPvXq1Uv9+vXToUOHJEmpqalq1aqVHn74YaWmpuqRRx4pVMPPz09+fn5as2aNsrOzi1zPrl27JEmLFy9Wamqq4/nnn3+u4cOHa/z48frhhx/06quvasmSJZoxY4bT6+Pi4jRw4EB98803Gjp0qAYPHqykpCRJ0ksvvaQPPvhA7777rg4cOKClS5c6hTwAgAsYAABUkBEjRhi33XabYRiGce211xqjR482DMMwVq9ebfzxV9DUqVONdu3aOb32hRdeMMLCwpxqhYWFGfn5+Y62Zs2aGTfccIPjeV5enlG1alXjnXfeMQzDMI4ePWpIMmbOnOnok5uba9SrV8949tlnDcMwjPj4eOPmm292WvdPP/1kSDIOHDhgGIZhdO3a1Wjfvv1ltzc0NNSYMWOGU9s111xjPPDAA47n7dq1M6ZOnXrJOu+9955Ro0YNw8fHx7juuuuMyZMnG998841TH0nG6tWrndp69OhhPPPMM05tb731lhESEuL0un/84x9OfaKjo43777/fMAzDePDBB43u3bsbdrv9kmMEAJQdR8IAAKZ49tln9eabbzqOuJRFq1atZLX+71dXUFCQ2rRp43ju4eGhWrVq6cSJE06v69y5s+NnT09PRUVFOcbxzTffaPPmzY4jUH5+fmrevLmki9dvFejYseMlx5aZmamUlBR16dLFqb1Lly6l3uaBAwcqJSVFH3zwgXr37q0tW7aoQ4cOWrJkySVf980332j69OlO2zJmzBilpqbq/Pnzjn5/3B8FzwvGOHLkSCUmJqpZs2Z66KGH9Mknn5Rq7ACAy/N09wAAAFeHG2+8Ub169dLkyZM1cuRIp2VWq1WGYTi15ebmFqrh5eXl9NxisRTZZrfbSzyuc+fO6dZbb9Wzzz5baFlISIjj56pVq5a4piv4+Pjob3/7m/72t78pLi5O9957r6ZOnVpo3/3RuXPnNG3aNA0YMKDIeiXRoUMHHT16VOvWrdOnn36qu+66Sz179nS6bgwAUD4cCQMAmGbmzJn673//q+3btzu1165dW2lpaU5BzJX39vrqq68cP+fl5WnPnj1q0aKFpIuh4/vvv1d4eLgaN27s9ChN8KpWrZpCQ0OdppGXpG3btqlly5bl3oaWLVsqKyvL8dzLy0v5+flOfTp06KAD
Bw4U2o7GjRs7HUH84/4oeF6wPwq2ZdCgQXr99de1YsUK/ec//9Gvv/5a7m0AAFzEkTAAgGnatGmjoUOH6qWXXnJq79atm06ePKnnnntOd9xxh9avX69169apWrVqLlnv/Pnz1aRJE7Vo0UIvvPCCTp8+rdGjR0uSxo0bp9dff11DhgzRP//5T9WsWVOHDx/W8uXL9a9//UseHh4lXs+kSZM0depUNWrUSJGRkVq8eLESExO1dOnSEtc4deqU7rzzTo0ePVpt27aVv7+/du/ereeee0633Xabo194eLg2btyoLl26yNvbWzVq1NCUKVP097//XQ0aNNAdd9whq9Wqb775Rt99952efvppx2tXrlypqKgoXX/99Vq6dKl27typN954Q5I0Z84chYSEqH379rJarVq5cqWCg4MVEBBQ4m0AAFwaR8IAAKaaPn16odMFW7RooQULFmj+/Plq166ddu7cWeTMgWU1c+ZMzZw5U+3atdMXX3yhDz74QIGBgZLkOHqVn5+vm2++WW3atNGECRMUEBDgdPSoJB566CHFxsbq4YcfVps2bbR+/Xp98MEHatKkSYlr+Pn5KTo6Wi+88IJuvPFGtW7dWnFxcRozZozmzZvn6Dd79mxt2LBB9evXV/v27SVJvXr10tq1a/XJJ5/ommuu0bXXXqsXXnhBYWFhTuuYNm2ali9frrZt2+rf//633nnnHcfROn9/fz333HOKiorSNddco2PHjumjjz4q9b4AABTPYvz5JHwAAFBpWSwWrV69Wv3793f3UADgqsV/awEAAACAiQhhAAAAAGAiJuYAAOAqwlUIAOB+HAkDAAAAABMRwgAAAADARIQwAAAAADARIQwAAAAATEQIAwAAAAATEcIAAAAAwESEMAAAAAAwESEMAAAAAEz0/67zHaTohZj+AAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 1000x600 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Overlaid density histograms of the number of steps per response (GRPO vs. PRM-Filter)\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "plt.figure(figsize=(10, 6))\n",
    "sns.histplot(step_num_grpo, bins=30, kde=False, color='skyblue', stat=\"density\")\n",
    "sns.histplot(step_num_prm, bins=30, kde=False, color='orange', alpha=0.5, stat=\"density\")\n",
    "plt.legend(['GRPO', 'PRM-Filter'])\n",
    "plt.xlabel(\"Number of Steps\")\n",
    "plt.ylabel(\"Density\")\n",
    "plt.title(\"Distribution of Step Counts: GRPO vs. PRM-Filter\")\n",
    "plt.grid(axis='y', alpha=0.75)\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "verl",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
