{
 "cells": [
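  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Prompt Testing Pipeline\n",
    "\n",
    "Prompt workflow:\n",
    "\n",
    "text -> jsonline -> prompt_template (str) -> prompt -> function test!!!\n",
    "\n",
    "User prompts are long, fairly complex, and contain many redundant instructions, so they are wrapped in a prompt_template.\n",
    "To keep results accurate and consistent, every prompt must be checked strictly and pass the tests described below before it is added."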
   ]
  },
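  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### text2jsonline\n",
    "The conversion is delegated to DeepSeek. The conversion prompt still needs tuning and testing; the version below has been tested, but the key name still has to be adjusted when the result is pasted into prompt.json.\n",
    "\n",
    "Above are some prompt templates I want to display. Please write a single-line JSON for me, with the key `prompt`, using suitable newline characters so that printing this JSON shows the content above. You may need the following conversions.\n",
    "\n",
    "Escape JSON braces:\n",
    "  - In the JSON data, change { to {{ and } to }}. This escapes them so that a later call to the format method does not treat the {} of the JSON data as placeholders.\n",
    "\n",
    "Keep dynamic variables:\n",
    "  - Variables that need to be substituted (such as {domain}, {description}) keep their single braces.\n",
    "\n",
    "For example, the following is a jsonline that the LLM returned after I sent it this request."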
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Bayesian Optimization is in progress and is currently at iteration {iteration}.\n",
      "You generated the following comments in the previous iterations and your suggestion points were subsequently evaluated and appended to the dataset:\n",
      "{comment_history}\n",
      "\n",
      "New experimental data has been obtained giving us currently a dataset of {iteration} experiments:\n",
      "{trial_data}\n",
      "\n",
      "**Task**:\n",
      " - Reflect on your previous hypotheses and on the entire data.\n",
      " - What do we know from the data?\n",
      " - Continue live commenting on the optimization progress.\n",
      " - What are the subspaces that maximize {target} the most?\n",
      " - What are the subspaces that are detrimental and minimize {target}?\n",
      " - Improve your previous hypotheses for maximizing {target} in the light of the new comment and learnings. Feel free to also discard some hypotheses that have proven to be performing consistently poorly in previous iterations, and/or propose new ones. Additionally for each hypothesis, give a single point {constraint} you want the Bayesian Optimizer to try.\n",
      " - Important: Only provide your response in the exact JSON response format as same as comment in the first round, without any additional syntax or libraries.\n",
      "\n",
      "```json\n",
      "{{\n",
      "  \"comment\": \"...\",\n",
      "  \"hypotheses\": [\n",
      "    {{\n",
      "      \"name\": \"...\",\n",
      "      \"rationale\": \"...\",\n",
      "      \"confidence\": \"...\",\n",
      "      \"points\": [{{...}}]\n",
      "    }}\n",
      "  ]\n",
      "}}\n",
      "```\n"
     ]
    }
   ],
   "source": [
    "jsonline = {\n",
    "    \"prompt\": \"The Bayesian Optimization is in progress and is currently at iteration {iteration}.\\nYou generated the following comments in the previous iterations and your suggestion points were subsequently evaluated and appended to the dataset:\\n{comment_history}\\n\\nNew experimental data has been obtained giving us currently a dataset of {iteration} experiments:\\n{trial_data}\\n\\n**Task**:\\n - Reflect on your previous hypotheses and on the entire data.\\n - What do we know from the data?\\n - Continue live commenting on the optimization progress.\\n - What are the subspaces that maximize {target} the most?\\n - What are the subspaces that are detrimental and minimize {target}?\\n - Improve your previous hypotheses for maximizing {target} in the light of the new comment and learnings. Feel free to also discard some hypotheses that have proven to be performing consistently poorly in previous iterations, and/or propose new ones. Additionally for each hypothesis, give a single point {constraint} you want the Bayesian Optimizer to try.\\n - Important: Only provide your response in the exact JSON response format as same as comment in the first round, without any additional syntax or libraries.\\n\\n```json\\n{{\\n  \\\"comment\\\": \\\"...\\\",\\n  \\\"hypotheses\\\": [\\n    {{\\n      \\\"name\\\": \\\"...\\\",\\n      \\\"rationale\\\": \\\"...\\\",\\n      \\\"confidence\\\": \\\"...\\\",\\n      \\\"points\\\": [{{...}}]\\n    }}\\n  ]\\n}}\\n```\"\n",
    "}\n",
    "print(jsonline['prompt'])"
   ]
  },
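  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of why the JSON braces are doubled, using only Python's built-in `str.format`. The short template and the value `yield` are made up for illustration and are not taken from the project's prompt files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical mini template: {target} is a placeholder, {{ and }} are literal braces.\n",
    "template = 'Maximize {target}.\\n{{\\n  \"comment\": \"...\"\\n}}'\n",
    "print(template.format(target='yield'))\n",
    "\n",
    "# Without the doubling, str.format parses the JSON braces as a replacement field and fails.\n",
    "unescaped = 'Maximize {target}.\\n{\\n  \"comment\": \"...\"\\n}'\n",
    "try:\n",
    "    unescaped.format(target='yield')\n",
    "except (KeyError, ValueError, IndexError) as err:\n",
    "    print('format failed:', type(err).__name__)"
   ]
  },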
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### jsonline2prompt_template\n",
    "\n",
    "1. Add the jsonline to the test_prompts.json file under src/prompts, and check that the format is valid.\n",
    "2. Load it through the PromptManager and print it; the output should match the one above 👆."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Error loading agent_prompts.json: Expecting value: line 1 column 1 (char 0)\n",
      "Error loading llm_prompts.json: Expecting value: line 1 column 1 (char 0)\n",
      "The Bayesian Optimization is in progress and is currently at iteration {iteration}.\n",
      "You generated the following comments in the previous iterations and your suggestion points were subsequently evaluated and appended to the dataset:\n",
      "{comment_history}\n",
      "\n",
      "New experimental data has been obtained giving us currently a dataset of {iteration} experiments:\n",
      "{trial_data}\n",
      "\n",
      "**Task**:\n",
      " - Reflect on your previous hypotheses and on the entire data.\n",
      " - What do we know from the data?\n",
      " - Continue live commenting on the optimization progress.\n",
      " - What are the subspaces that maximize {target} the most?\n",
      " - What are the subspaces that are detrimental and minimize {target}?\n",
      " - Improve your previous hypotheses for maximizing {target} in the light of the new comment and learnings. Feel free to also discard some hypotheses that have proven to be performing consistently poorly in previous iterations, and/or propose new ones. Additionally for each hypothesis, give a single point {constraint} you want the Bayesian Optimizer to try.\n",
      " - Important: Only provide your response in the exact JSON response format as same as comment in the first round, without any additional syntax or libraries.\n",
      "\n",
      "```json\n",
      "{{\n",
      "  \"comment\": \"...\",\n",
      "  \"hypotheses\": [\n",
      "    {{\n",
      "      \"name\": \"...\",\n",
      "      \"rationale\": \"...\",\n",
      "      \"confidence\": \"...\",\n",
      "      \"points\": [{{...}}]\n",
      "    }}\n",
      "  ]\n",
      "}}\n",
      "```\n"
     ]
    }
   ],
   "source": [
    "from src.prompts.base import PromptManager\n",
    "\n",
    "pm = PromptManager()\n",
    "print(pm.get(key=\"optimization_loop\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### prompt_template->prompt\n",
    "\n",
    "1. Write a test meta_dict (a Python dict) that contains every name appearing inside {} in the prompt_template.\n",
    "2. Call the PromptManager format method to check that formatting succeeds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The Bayesian Optimization is in progress and is currently at iteration 4.\n",
      "You generated the following comments in the previous iterations and your suggestion points were subsequently evaluated and appended to the dataset:\n",
      "3242\n",
      "\n",
      "New experimental data has been obtained giving us currently a dataset of 4 experiments:\n",
      "sdfasdf\n",
      "\n",
      "**Task**:\n",
      " - Reflect on your previous hypotheses and on the entire data.\n",
      " - What do we know from the data?\n",
      " - Continue live commenting on the optimization progress.\n",
      " - What are the subspaces that maximize afasdf the most?\n",
      " - What are the subspaces that are detrimental and minimize afasdf?\n",
      " - Improve your previous hypotheses for maximizing afasdf in the light of the new comment and learnings. Feel free to also discard some hypotheses that have proven to be performing consistently poorly in previous iterations, and/or propose new ones. Additionally for each hypothesis, give a single point asdfasdf you want the Bayesian Optimizer to try.\n",
      " - Important: Only provide your response in the exact JSON response format as same as comment in the first round, without any additional syntax or libraries.\n",
      "\n",
      "```json\n",
      "{\n",
      "  \"comment\": \"...\",\n",
      "  \"hypotheses\": [\n",
      "    {\n",
      "      \"name\": \"...\",\n",
      "      \"rationale\": \"...\",\n",
      "      \"confidence\": \"...\",\n",
      "      \"points\": [{...}]\n",
      "    }\n",
      "  ]\n",
      "}\n",
      "```\n"
     ]
    }
   ],
   "source": [
    "meta_dict = {\n",
    "    \"comment_history\": \"3242\",\n",
    "    \"target\": \"afasdf\",\n",
    "    \"trial_data\": \"sdfasdf\",\n",
    "    \"iteration\": 4,\n",
    "    \"constraint\": \"asdfasdf\",\n",
    "}\n",
    "print(pm.format(\"optimization_loop\", **meta_dict))"
   ]
  },
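  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Step 1 above can also be checked mechanically. The sketch below lists the placeholder names of a template with the standard library's `string.Formatter` and reports which ones a meta_dict is missing. The helper `placeholder_names`, the short template, and the values are hypothetical; in this notebook the template would presumably be the string returned by `pm.get`, assuming it is a plain `str` as the printed output suggests."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from string import Formatter\n",
    "\n",
    "\n",
    "def placeholder_names(template):\n",
    "    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);\n",
    "    # field_name is None for pure literal text, including escaped {{ and }}.\n",
    "    return {name for _, name, _, _ in Formatter().parse(template) if name}\n",
    "\n",
    "\n",
    "# Hypothetical template and values, for illustration only.\n",
    "template = 'Iteration {iteration}: maximize {target}.\\n{{\"points\": [{{...}}]}}'\n",
    "meta_dict = {'iteration': 4}\n",
    "\n",
    "# Names used in the template that the meta_dict does not provide.\n",
    "missing = placeholder_names(template) - set(meta_dict)\n",
    "print('missing keys:', sorted(missing))"
   ]
  },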
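  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Function test!!!\n",
    "Test the prompt inside the function that actually uses it:\n",
    "1. Build the correct meta_dict; it must contain every key of the meta_dict above 👆.\n",
    "2. Call the .format method and print the formatted prompt.\n",
    "3. Pass the correctly formatted prompt to the target function and check whether the LLM output matches expectations."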
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "bo",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
