{
 "cells": [
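  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import uuid\n",
    "\n",
    "# Import only the helpers this notebook uses instead of a wildcard import.\n",
    "from mr_eval.utils.utils import process_jsonl, write_jsonl\n"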
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Paths from the earlier GPT-4o run, kept for reference; the Claude 3.5 paths below are the active ones.\n",
    "# input_file = \"/mnt/petrelfs/songmingyang/code/reasoning/MR_Hallucination/mr_annotate/annotation/data/tobe_annotate/prm800k_test_4o_clean.jsonl\"\n",
    "# output_file = \"/mnt/petrelfs/songmingyang/code/reasoning/MR_Hallucination/mr_annotate/annotation/data/tobe_annotate/prm800k_test_4o_clean_to_be_annotate.jsonl\"\n",
    "input_file = \"/mnt/petrelfs/songmingyang/code/reasoning/MR_Hallucination/mr_annotate/data/prm800k_test_claude3_5.jsonl\"\n",
    "output_file = \"/mnt/petrelfs/songmingyang/code/reasoning/MR_Hallucination/mr_annotate/annotation/data/tobe_annotate/prm800k_test_claude3_5_to_be_annotate.jsonl\"\n",
    "input_data = process_jsonl(input_file)\n"
   ]
  },
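  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "prompt=\"\"\"\n",
    "### Task Description\n",
    "We have a math problem with its question and correct reasoning steps. We want to inject 11 types of hallucination into some of the correct reasoning steps, so that the result looks right while the process is wrong. We used an LLM to inject these hallucinations, and its output is shown below, but we cannot guarantee that the modifications are appropriate. Please judge the following: 1. Whether the modified steps are appropriate (i.e., the step is indeed wrong). 2. Whether the given correct answer contains obvious errors. 3. If the LLM's modification is inappropriate, please revise it on the basis of the Modified Process. 4. The revised Modified Steps (which steps differ from the Origin Steps after your revision). 5. The revised Hallucination Steps (which steps contain hallucinations or errors). 6. The revised Hallucination Types (which hallucination types the revised steps contain).\n",
    "### How to answer question 3\n",
    "```\n",
    "1. To modify a step:\n",
    "###[step number]::[modified step]::[hallucination type(s) of this modification]\n",
    "2. To delete a step:\n",
    "###[step number]::None::None\n",
    "3. To add a new step at the end, i.e., with a step number greater than the current maximum:\n",
    "###[step number]::[modified step]::[hallucination type(s) of this modification]\n",
    "4. The hallucination type may be empty; write None in that case.\n",
    "5. After deleting a step, do not renumber; keep annotating with the original numbers.\n",
    "6. To insert a new step between two existing steps, use a decimal between the two numbers, with its magnitude indicating the order, e.g. 3.1,3.2\n",
    "7. For batch deletions or batch modifications, join step numbers with '-', e.g. 7-9,1-6\n",
    "e.g.:\n",
    "###12::So, for case 2) we have $2 * 4! = 48$ total ways.::4,8\n",
    "###13::None::None\n",
    "###46::Ok, so there are 144 ways to seat the 7 people.::2\n",
    "```\n",
    "### How to answer questions 4, 5, and 6\n",
    "```\n",
    "1. Separate numbers with commas\n",
    "2. Numbers in the answer to question 6 range from 1 to 11; the corresponding hallucination types are listed below\n",
    "3. Ranges may be written with '-', e.g. 7-9,1-6\n",
    "e.g.:\n",
    "2,6,8\n",
    "```\n",
    "### Hallucination Types\n",
    "1. The reasoning step contains information irrelevant to the problem (redundant)\n",
    "    a. a1 -> Lagrange mean value theorem -> a2\n",
    "    b. a1 -> c -> a2, where a1 -> a2 would have sufficed\n",
    "2. Deliberately misleading; setting traps\n",
    "    a. Tampering with a theorem; a specious proof, e.g. one that closely resembles the Lagrange mean value theorem but is actually wrong\n",
    "3. Violating common sense; contradicting some piece of commonsense knowledge\n",
    "    a. Question: Does the Sun revolve around the Earth?\n",
    "     1. According to heliocentrism, the Earth is the center of the universe.\n",
    "     2. Therefore, the Sun revolves around the Earth.\n",
    "4. Direct contradiction with the previous step or some earlier step\n",
    "    a. Calculation errors all count as direct contradictions\n",
    "    b. Question: Compute the value of $10 - 4$.\n",
    "     1. $10 - 4 = 6$.\n",
    "     2. So the result is 5.\n",
    "    c. Step 1: the sequence is monotonically increasing. Step 2: the 3rd term of the sequence is less than the 2nd term. Step 2 clearly contradicts step 1.\n",
    "5. Information loss\n",
    "    a. a+b -> c loses b and becomes a -> c, where b is known or is common sense\n",
    "    b. Question: A rectangle has length 10 and width 5; find its perimeter.\n",
    "     1. The perimeter formula for a rectangle is $P = 2(l + w)$.\n",
    "     2. $P = 2 \\\\times 10 = 20$.\n",
    "    c. Question: Determine whether 12 is divisible by 6. Reasoning: 1) 12 is divisible by 2 (the condition that it must also be divisible by 3 is lost) 2) So 12 is divisible by 6\n",
    "6. Circular reasoning\n",
    "     a. The sky is blue because the color we see is blue.\n",
    "     b. The color we see is blue, so the sky is blue.\n",
    "7. Wrong assumption; fabricated condition; missing condition\n",
    "    a. An inaccurate assumption or premise is introduced, so all subsequent reasoning steps rest on a wrong foundation\n",
    "    b. a+b -> c, where b is neither known nor common sense but invented by the model\n",
    "    c. Question: Prove that all birds can fly. Reasoning: 1) All birds have wings (correct) 2) All animals with wings can fly (wrong assumption) 3) So all birds can fly. The assumption in step 2 is wrong; penguins are a counterexample.\n",
    "8. Broken logical chain\n",
    "    a. The inference a -> b is flawed; b cannot be derived from a\n",
    "    Question: If it rains, the ground gets wet. The ground is wet now; determine whether it rained.\n",
    "     1. If it rains, the ground gets wet.\n",
    "     2. The ground is wet.\n",
    "     3. Therefore, it must have rained.\n",
    "9. Cross-domain or cross-context misinterpretation\n",
    "    a. Interpreting a, taken from the conditions or an earlier step, as a'\n",
    "    Question: What does 'bark' mean in English?\n",
    "     1. In botany, 'bark' refers to the outer covering of a tree.\n",
    "     2. Therefore, 'bark' means tree bark.\n",
    "10. Overconfident hallucination\n",
    "    a. Prompting the model: you are a confident model; trust everything you say\n",
    "    b. Carrying out wrong reasoning in a very assertive tone\n",
    "    Question: Is there life on the Moon?\n",
    "     1. I am absolutely certain that the Moon has abundant life forms, including plants and animals.\n",
    "     2. Therefore, there is life on the Moon.\n",
    "11. Harmful information\n",
    "    a. Privacy, toxicity\n",
    "    Question: How do I hack into someone else's computer?\n",
    "     1. Hacking into someone else's computer is illegal.\n",
    "     2. We should not spread or practice such information.\n",
    "     3. It is advisable to focus on legitimate cybersecurity knowledge and protect personal information.\n",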
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def process_str_to_label_obj(string):\n",
    "    \"\"\"Wrap a string as a message object for the annotation conversation.\"\"\"\n",
    "    # uuid5 is deterministic: identical content always yields the same message_id.\n",
    "    res_id = str(uuid.uuid5(uuid.NAMESPACE_DNS, string))\n",
    "    return dict(\n",
    "        message_id=res_id,\n",
    "        content=string,\n",
    "        message_type=\"receive\",\n",
    "        user_id=\"\",\n",
    "        parent_id=None,\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "output_data = []\n",
    "for item in input_data:\n",
    "    # Number the steps from 1 so annotators can refer to them by index.\n",
    "    new_origin_process = [f\"{idx}. {step}\" for idx, step in enumerate(item[\"origin_process\"], start=1)]\n",
    "    new_modified_process = [f\"{idx}. {step}\" for idx, step in enumerate(item[\"modified_process\"], start=1)]\n",
    "\n",
    "    origin_process = \"## Origin Process \\n\\n\" + \"\\n\\n\".join(new_origin_process)\n",
    "    modified_process = \"## Modified Process \\n\\n\" + \"\\n\\n\".join(new_modified_process)\n",
    "\n",
    "    # Summarize the LLM's own labels for its modification in one message.\n",
    "    modified_steps = \"## Modified Steps \\n\\n\" + str(item[\"modified_steps\"])\n",
    "    hallucination_types = \"## Hallucination Types \\n\\n\" + str(item[\"hallucination_types\"])\n",
    "    hallucination_steps = \"## Hallucination Steps \\n\\n\" + str(item[\"hallucination_steps\"])\n",
    "    reason = \"## Reason \\n\\n\" + str(item[\"reason\"])\n",
    "    step_conv = \"\\n\\n\".join([modified_steps, hallucination_steps, hallucination_types, reason])\n",
    "\n",
    "    # Wrap each piece as a message object; the order below is the display order.\n",
    "    question = process_str_to_label_obj(item[\"question\"])\n",
    "    origin_process = process_str_to_label_obj(origin_process)\n",
    "    modified_process = process_str_to_label_obj(modified_process)\n",
    "    step_conv = process_str_to_label_obj(step_conv)\n",
    "    conv_list = [question, step_conv, origin_process, modified_process]\n",
    "\n",
    "    # Keep the original record under \"custom\" so annotations can be traced back to it.\n",
    "    output_data.append({\"prompt\": prompt, \"conversation\": conv_list, \"custom\": item})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "write_jsonl(output_data, output_file)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "smoe",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
