{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "4fe2183c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-07-16T19:08:16.445212Z",
     "start_time": "2023-07-16T19:08:15.869273Z"
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import openai\n",
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "dfa070b6",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-07-16T19:13:18.896675Z",
     "start_time": "2023-07-16T19:13:18.888376Z"
    }
   },
   "outputs": [],
   "source": [
    "prompt1 = '''Example:\n",
    "Paragraph: As a spring breeze wafted into his trench French commander Georges Lamour saw something surreal drift his way - a yellow-green cloud. 'All my trenches are choked,' he cried into the field telephone to headquarters. 'I am falling myself!' Chlorine gas — carried by favourable winds over Flanders Fields from German positions — had been used for the first time. It was April 22, 1915. Scroll down for video Chlorine gas — carried by favourable winds over Flanders Fields from German positions — sowed terror and agony for the first time on April 22, 1915. Above, German Red Cross workers carry bottles of water to help revive troops. German forces launched first attack using gas on April 22, 1915. 150,000 tons of gas were used by German and Allied forces in WW1.\n",
    "\n",
    "Query: Had they been able to peer a bit further across no-man's land they would have seen how [X] troops had dug in, under cover of night, more than 5,000 gas cylinders with tubes pointing their way.\n",
    "Answer: German\n",
    "\n",
    "Explanation: The query fits the context of WW1 and talks about both entities involved in the war. The paragraph mentions German forces using gas. German is the entity replaced by [X] in the query.\n",
    "\n",
    "Instruction: Generate a complex sentence that fits the context of the given paragraph.\n",
    "\n",
    "The generated query must be a statement about the events in the paragraph. Put [X] in place of any one entity mention. The query must not contain any events mentioned in the paragraph. The answer must contain the entity mention that can replace [X].\n",
    "\n",
    "Paragraph: '''\n",
    "\n",
    "# prompt2 = \"\\n Paragraphs must be independent of one another. Number the paragraphs. \""
   ]
  },
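  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48e7b185",
   "metadata": {},
   "outputs": [],
   "source": [
    "def chat_gpt_response(sample):\n",
    "    \"\"\"Append one passage to the few-shot prompt and return the model's reply text.\"\"\"\n",
    "    final_prompt = prompt1 + sample\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model=\"gpt-3.5-turbo\",\n",
    "        messages=[{\"role\": \"user\", \"content\": final_prompt}],\n",
    "        temperature=0.7,\n",
    "        # When this notebook was written, gpt-3.5-turbo had a 4,096-token context\n",
    "        # window shared by prompt and completion, so max_tokens=4096 leaves no room\n",
    "        # for the prompt. 256 is an assumed cap for a one-sentence query plus answer.\n",
    "        max_tokens=256,\n",
    "        top_p=1,\n",
    "        frequency_penalty=0,\n",
    "        presence_penalty=0,\n",
    "        stop=None,\n",
    "    )\n",
    "\n",
    "    # The reply text is the content of the first (and only) choice.\n",
    "    reply = response[\"choices\"][0][\"message\"][\"content\"]\n",
    "    return reply"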
   ]
  },
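  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "a368a35d",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-07-16T19:08:37.196585Z",
     "start_time": "2023-07-16T19:08:37.120871Z"
    }
   },
   "outputs": [],
   "source": [
    "# Load the passages that are appended to the few-shot prompt, one per request.\n",
    "df = pd.read_csv(\"syn_passage_list.csv\")\n",
    "ip_list = list(df['passage_list'])\n",
    "# Working list consumed batch by batch; slicing rebinds ip_list2, so ip_list stays intact.\n",
    "ip_list2 = ip_list"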
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1539a352",
   "metadata": {},
   "outputs": [],
   "source": [
    "count = 0  # index of the shard currently being written\n",
    "\n",
    "batch_size = 100  # Tunable: number of passages written to each shard\n",
    "\n",
    "while len(ip_list2) > 0:\n",
    "    temp_list = ip_list2[:batch_size]\n",
    "\n",
    "    output_list = []\n",
    "\n",
    "    for item in temp_list:\n",
    "        output_list.append(chat_gpt_response(item))\n",
    "\n",
    "    # Save each shard as soon as it finishes so completed work survives a later failure.\n",
    "    data = {'Input': temp_list, 'Output': output_list}\n",
    "    df_ans = pd.DataFrame(data)\n",
    "    df_ans.to_csv(\"output_record_ans_shard_\" + str(count) + \".csv\", index=False)\n",
    "    # Report the same index that appears in the filename, then advance.\n",
    "    print(\"shard number:\", count, \"Done\")\n",
    "    count += 1\n",
    "    ip_list2 = ip_list2[batch_size:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2523219",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:root] *",
   "language": "python",
   "name": "conda-root-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
