{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Writing prompt data augmentation task"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LAION-AI/Open-Assistant/blob/main/notebooks/data-augmentation/writing-prompt/writing_prompt.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The goal of this task was to auto-generate question/answer samples from the WritingPrompts dataset to feed Open-Assistant. To do that, we needed to standardize the way a prompt is written, so we defined a set of prompt templates that make the generation process feasible. Here are the templates we applied:\n",
    "\n",
    "* Base template: every prompt yields this sample.\n",
    "> User: write me a story about: {stripped_prompt} -> Rosey: Sure, here's a story about: {stripped_prompt}:\\n{story}\n",
    "\n",
    "where `stripped_prompt` is the cleaned prompt, produced by regex patterns that remove the parts of a prompt that would not fit the template, and `story` is the actual answer to the prompt.\n",
    "\n",
    "* General constraints: a prompt in which a regex pattern found a constraint also yields this sample.\n",
    "> Base template, {stripped_constraint} -> Rosey: Sure, here's a story about: {stripped_prompt}, {stripped_constraint}:\\n{story}\n",
    "\n",
    "where `stripped_constraint` is the constraint found.\n",
    "\n",
    "* Answer beginning constraints: this constraint is imposed on the way the answer should start.  \n",
    "> Base template, starting with: {beginning} -> Rosey: Sure, here's a story about: {stripped_prompt}, starting with: {beginning}:\\n{story}\n",
    "\n",
    "where `beginning` is the first sentence of a story.\n",
    "\n",
    "* Answer end constraints: this constraint is imposed on the way the answer should end.  \n",
    "> Base template, ending with: {ending} -> Rosey: Sure, here's a story about: {stripped_prompt}, ending with: {ending}:\\n{story}\n",
    "\n",
    "where `ending` is the last sentence of a story.\n",
    "\n",
    "* Answer middle constraints: this constraint is imposed on what the middle of the answer should be about.  \n",
    "> Base template, where the middle of the story is about: {middle} -> Rosey: Sure, here's a story about: {stripped_prompt}, where the middle of the story is about: {middle}:\\n{story}\n",
    "\n",
    "where `middle` is a summary of the story without its first and last sentences, produced by a generative model.\n",
    "\n",
    "To get the samples we used the following pipeline:\n",
    "\n",
    "* **Get data**: download the dataset from Kaggle.\n",
    "* **Pre-processing**: load the source/target (i.e. prompt/story) files of every split (train/valid/test), merge them into one pandas DataFrame, and enrich it with tabular info about the sample tags.\n",
    "* **Triage prompts**: we picked prompts sorted by frequency and built regex patterns for some of them to extract a stripped prompt and the related constraint.\n",
    "* **Split stories**: after removing the beginning and ending sentences of a story, we applied a sliding window over its sentences to get summaries of the story's middle."
   ]
  },
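  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The templates above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used to build the dataset: the names `split_story` and `build_samples` are hypothetical, sentences are split naively on `.`, `!` and `?`, and every constraint is assumed to follow the same `{prompt}{suffix}` pattern. The middle summary, which needs a generative model, is left as an optional argument.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "\n",
    "def split_story(story):\n",
    "    # Naively split a story into (first sentence, middle sentences, last sentence).\n",
    "    sentences = [s for s in re.split(r'(?<=[.!?])\\s+', story.strip()) if s]\n",
    "    if len(sentences) < 3:\n",
    "        return None  # too short to have a distinct beginning, middle and ending\n",
    "    return sentences[0], sentences[1:-1], sentences[-1]\n",
    "\n",
    "\n",
    "def build_samples(stripped_prompt, story, constraint=None, beginning=None, ending=None, middle=None):\n",
    "    # Return one (user, assistant) pair per applicable template.\n",
    "    suffixes = ['']  # the base template always applies\n",
    "    if constraint:\n",
    "        suffixes.append(f', {constraint}')\n",
    "    if beginning:\n",
    "        suffixes.append(f', starting with: {beginning}')\n",
    "    if ending:\n",
    "        suffixes.append(f', ending with: {ending}')\n",
    "    if middle:\n",
    "        suffixes.append(f', where the middle of the story is about: {middle}')\n",
    "    samples = []\n",
    "    for suffix in suffixes:\n",
    "        user = f'User: write me a story about: {stripped_prompt}{suffix}'\n",
    "        rosey = f\"Rosey: Sure, here's a story about: {stripped_prompt}{suffix}:\\n{story}\"\n",
    "        samples.append((user, rosey))\n",
    "    return samples\n",
    "\n",
    "\n",
    "# Made-up example story, tokenized like the dataset (space before punctuation).\n",
    "story = 'The egg cracked . Nobody noticed at first . Then the sky went dark .'\n",
    "beginning, _, ending = split_story(story)\n",
    "for user, rosey in build_samples('the moon hatching', story, beginning=beginning, ending=ending):\n",
    "    print(user)\n",
    "    print(rosey)\n",
    "    print('---')\n",
    "```\n",
    "\n",
    "Keeping every constraint as a suffix of the base template means a new constraint type only needs one more entry in the suffix list."
   ]
  },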
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get data from Kaggle\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper functions\n",
    "import json\n",
    "\n",
    "\n",
    "def save_credentials(d):\n",
    "    with open(\"/root/.kaggle/kaggle.json\", \"w\") as outfile:\n",
    "        json.dump(d, outfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mv: cannot stat '/mnt/home/fabraz/kaggle.json': No such file or directory\n"
     ]
    }
   ],
   "source": [
    "# uncomment the following instructions, in case you want to save a .kaggle.json\n",
    "# d = {}\n",
    "# d['username'] = 'user'\n",
    "# d['key'] = 'key'\n",
    "#!mkdir ~/.kaggle\n",
    "# save_credentials(d)\n",
    "!mv ~/kaggle.json ~/.kaggle/\n",
    "!chmod 600 ~/.kaggle/kaggle.json"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install kaggle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/bin/bash: kaggle: command not found\n"
     ]
    }
   ],
   "source": [
    "!kaggle datasets download -d ratthachat/writing-prompts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Archive:  writing-prompts.zip\n",
      "  inflating: writingPrompts/README   \n",
      "  inflating: writingPrompts/test.wp_source  \n",
      "  inflating: writingPrompts/test.wp_target  \n",
      "  inflating: writingPrompts/train.wp_source  \n",
      "  inflating: writingPrompts/train.wp_target  \n",
      "  inflating: writingPrompts/valid.wp_source  \n",
      "  inflating: writingPrompts/valid.wp_target  \n"
     ]
    }
   ],
   "source": [
    "!unzip writing-prompts.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pre-processing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from IPython.display import display, HTML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper functions\n",
    "import re\n",
    "\n",
    "\n",
    "def load_file(path, names):\n",
    "    with open(path, \"r\") as f:\n",
    "        lines = f.readlines()\n",
    "    return pd.DataFrame(lines, columns=names)\n",
    "\n",
    "\n",
    "def load_data():\n",
    "    tags = {\n",
    "        \"WP\": \"Writing Prompt\",\n",
    "        \"SP\": \"Simple Prompt\",\n",
    "        \"EU\": \"Established Universe\",\n",
    "        \"CW\": \"Constrained Writing\",\n",
    "        \"TT\": \"Theme Thursday\",\n",
    "        \"PM\": \"Prompt Me\",\n",
    "        \"MP\": \"Media Prompt\",\n",
    "        \"IP\": \"Image Prompt\",\n",
    "        \"PI\": \"Prompt Inspired\",\n",
    "        \"OT\": \"Off Topic\",\n",
    "        \"RF\": \"Reality Fiction\",\n",
    "    }\n",
    "\n",
    "    dfConcat = pd.DataFrame()\n",
    "    for split in [\"train\", \"valid\", \"test\"]:\n",
    "        df = load_file(f\"writingPrompts/{split}.wp_source\", [\"prompt\"])\n",
    "        for tag in tags.keys():\n",
    "            df[tag.lower()] = df[\"prompt\"].map(lambda x: check_tag(x, tag.lower()))\n",
    "        df[\"tagCounter\"] = df.iloc[:, [2, -1]].sum(axis=1)\n",
    "        df[\"splitLineIndex\"] = df.index\n",
    "        story = load_file(f\"writingPrompts/{split}.wp_target\", [\"story\"])\n",
    "        df[\"story\"] = story[\"story\"]\n",
    "        df[\"split\"] = split\n",
    "        dfConcat = pd.concat([dfConcat, df])\n",
    "    return dfConcat\n",
    "\n",
    "\n",
    "def check_tag(item, tag):\n",
    "    r = re.compile(r\"[\\(\\{\\[]\\s*[\\w]{2}\\s*[\\]\\}\\)]\\s*\")\n",
    "    m = r.findall(item.lower())\n",
    "    if len(m) > 0:\n",
    "        for group in m:\n",
    "            if tag in group:\n",
    "                return 1\n",
    "    return 0\n",
    "\n",
    "\n",
    "def show_data(df):\n",
    "    html_string = \"<\"\n",
    "    html_string += \"html><\"\n",
    "    html_string += \"head><title>HTML Pandas Dataframe with CSS</title></head\"\n",
    "    html_string += \"><\"\n",
    "    html_string += 'link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/'\n",
    "    html_string += \"><\"\n",
    "    html_string += \"\"\"body>\n",
    "                    {table}\n",
    "                  </body>\n",
    "                </html\n",
    "                \"\"\"\n",
    "    html_string += \">\"\n",
    "    df = df.replace(\"\\<newline\\>|\\< newline \\>|\\<new line\\>\", \"\\n\", regex=True)\n",
    "    df.style.set_properties(**{\"text-align\": \"left\"}).set_table_styles(\n",
    "        [dict(selector=\"th\", props=[(\"text-align\", \"left\")])]\n",
    "    )\n",
    "    html = df.to_html()\n",
    "    html_string = html_string.format(table=html)\n",
    "    html_string = (\n",
    "        html_string.replace(r\"\\n\", \"<br>\")\n",
    "        .replace(\"<td>\", '<td style=\"text-align:left\">')\n",
    "        .replace(\"<th>\", '<th style=\"text-align:left\">')\n",
    "    )\n",
    "    display(HTML(html_string))\n",
    "\n",
    "\n",
    "def get_samples(df, n, constraint=None, show=True):\n",
    "    samples = zip(df[\"prompt\"].iloc[:n, 0].index, df[\"prompt\"].iloc[:n, 0], df[\"story\"].iloc[:n, 0])\n",
    "    df = pd.DataFrame(samples, columns=[\"index\", \"prompt\", \"story\"])\n",
    "    if constraint is not None:\n",
    "        df = df[df[\"prompt\"].str.contains(constraint)]\n",
    "    return df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ WP ] Leonardo DiCaprio in a fit of rage begins to torpedo his own career by deliberately acting poorly and taking on bad films . He finally wins an oscar for starring in Paul Blart : Mall Cop 3 .\n",
      "[ CW ] Kill the writer in first-person narrative .\n"
     ]
    }
   ],
   "source": [
    "!head -n2 writingPrompts/test.wp_source"
   ]
  },
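  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The tag markers shown above (`[ WP ]`, `[ CW ]`) are what the regex in `check_tag` looks for. As a rough sketch, and assuming a hypothetical helper named `split_tags` that is not part of the original notebook, the same pattern (with a capture group added) can also separate the tags from the prompt text:\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "# Same character classes as in check_tag, plus a group around the two tag characters.\n",
    "TAG_PATTERN = re.compile(r'[\\(\\{\\[]\\s*(\\w{2})\\s*[\\]\\}\\)]\\s*')\n",
    "\n",
    "\n",
    "def split_tags(prompt):\n",
    "    # Return (lower-cased tags, prompt text with the tag markers removed).\n",
    "    tags = [t.lower() for t in TAG_PATTERN.findall(prompt)]\n",
    "    text = TAG_PATTERN.sub('', prompt).strip()\n",
    "    return tags, text\n",
    "\n",
    "\n",
    "print(split_tags('[ CW ] Kill the writer in first-person narrative .'))\n",
    "# prints (['cw'], 'Kill the writer in first-person narrative .')\n",
    "```\n",
    "\n",
    "Note that the pattern accepts any two word characters inside brackets, so a short parenthesised word such as `( or )` would also be reported as a tag."
   ]
  },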
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>prompt</th>\n",
       "      <th>wp</th>\n",
       "      <th>sp</th>\n",
       "      <th>eu</th>\n",
       "      <th>cw</th>\n",
       "      <th>tt</th>\n",
       "      <th>pm</th>\n",
       "      <th>mp</th>\n",
       "      <th>ip</th>\n",
       "      <th>pi</th>\n",
       "      <th>ot</th>\n",
       "      <th>rf</th>\n",
       "      <th>tagCounter</th>\n",
       "      <th>splitLineIndex</th>\n",
       "      <th>story</th>\n",
       "      <th>split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[ WP ] You 've finally managed to discover the...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>So many times have I walked on ruins , the rem...</td>\n",
       "      <td>train</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[ WP ] The moon is actually a giant egg , and ...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>-Week 18 aboard the Depth Reaver , Circa 2023-...</td>\n",
       "      <td>train</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[ WP ] You find a rip in time walking through ...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>I was feckin ' sloshed , mate . First time I e...</td>\n",
       "      <td>train</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                              prompt  wp  sp  eu  cw  tt  pm  \\\n",
       "0  [ WP ] You 've finally managed to discover the...   1   0   0   0   0   0   \n",
       "1  [ WP ] The moon is actually a giant egg , and ...   1   0   0   0   0   0   \n",
       "2  [ WP ] You find a rip in time walking through ...   1   0   0   0   0   0   \n",
       "\n",
       "   mp  ip  pi  ot  rf  tagCounter  splitLineIndex  \\\n",
       "0   0   0   0   0   0           0               0   \n",
       "1   0   0   0   0   0           0               1   \n",
       "2   0   0   0   0   0           0               2   \n",
       "\n",
       "                                               story  split  \n",
       "0  So many times have I walked on ruins , the rem...  train  \n",
       "1  -Week 18 aboard the Depth Reaver , Circa 2023-...  train  \n",
       "2  I was feckin ' sloshed , mate . First time I e...  train  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(303358, 16)\n"
     ]
    }
   ],
   "source": [
    "print(ds.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['splitLineIndex', 'prompt', 'story', 'split'], dtype='object')"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds[ds[\"split\"] == \"test\"].iloc[:2, [13, 0, 14, -1]].columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Samples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<html><head><title>HTML Pandas Dataframe with CSS</title></head><link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/><body>\n",
       "                    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th style=\"text-align:left\"></th>\n",
       "      <th style=\"text-align:left\">splitLineIndex</th>\n",
       "      <th style=\"text-align:left\">prompt</th>\n",
       "      <th style=\"text-align:left\">story</th>\n",
       "      <th style=\"text-align:left\">split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">0</th>\n",
       "      <td style=\"text-align:left\">0</td>\n",
       "      <td style=\"text-align:left\">[ WP ] You 've finally managed to discover the secret to immortality . Suddenly , Death appears before you , hands you a business card , and says , `` When you realize living forever sucks , call this number , I 've got a job offer for you . ''<br></td>\n",
       "      <td style=\"text-align:left\">So many times have I walked on ruins , the remainings of places that I loved and got used to.. At first I was scared , each time I could feel my city , my current generation collapse , break into the black hole that thrives within it , I could feel humanity , the way I 'm able to feel my body.. After a few hundred years , the pattern became obvious , no longer the war and damage that would devastate me over and over again in the far past was effecting me so dominantly . <br> It 's funny , but I felt as if after gaining what I desired so long , what I have lived for my entire life , only then , when I achieved immortality I started truly aging . <br> <br> 5 world wars have passed , and now they feel like a simple sickeness that would pass by every so often , I could no longer evaluate the individual human as a being of its own , the importance of mortals is merely the same as the importance of my skin cells ; They are a part of a mechanism so much more advanced , a mechanism that is so dear to my fallen heart a mechanism that I have seen fall and rise so many times , a mechanism that when lost all of which it had , had me loosing my will to live , for the first time in all of my thousands years of existence . <br> <br> Acceptance , something so important . a skill that has proved itself worthy dozens of times , an ability that looks so easy to achieve , a gift , that I was n't able to aquire in all my years , until now . When the ashes on the ground flew into the now empty air upon humanity 's fall , I felt as if all of it 's weight was crushing me . Ignorance took over and I searched years for a hope , a sign of the very same patterns that I used to watch reappear every hundred years , the very core of my will to exist that was now no more that I so strongly wish was . <br> <br> If you have ever wondered if silence can drive people crazy , it can.. 
<br> I ca n't feel my legs , I have walked for days , just to hear the sound of gravel , crushed bones , crushed buildings and crushed civilizations under my steps to keep my sanity.. until I remembered , the day in my far past . The day of my rebirth , I took out of my pocket a small plastic box , with nine buttons and a small glass window . I could n't believe this was our past , I could n't believe how far we have been able to progress and yet , be destroyed by our own violence . <br> I slowly dialed the number I was given , exactly 1729 years ago . <br> <br> I dropped a tear , a tear that was too slow to hit the ground as I got sucked into the darkness that emerged around me . <br> <br> A chill went through my spine as I saw my destiny rise above me , I could see the white teeth under the dark cloack ... <br> <br> `` You have finally arrived '' He projected into my mind , with the most chilling cold and unhuman voice . <br> <br> `` I 'm ready to obey '' I answered . I knew who was sitting infront of me , and it was time for me to obey him , after all these years of playing god , even I came to it . <br> <br> Funny is n't it ? Even by achieving immortality , death , is inescapable .<br></td>\n",
       "      <td style=\"text-align:left\">train</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">1</th>\n",
       "      <td style=\"text-align:left\">1</td>\n",
       "      <td style=\"text-align:left\">[ WP ] The moon is actually a giant egg , and it has just started to hatch .<br></td>\n",
       "      <td style=\"text-align:left\">-Week 18 aboard the Depth Reaver , Circa 2023- <br> <br> I walk about the dull gray halls , the artificial gravity making my steps feel almost as if they were on land . Almost . I glance out a window as I pass it by . There 's the sun , and there 's the moon right there . And , of course , there 's the Earth . I kinda miss it . Then again , space is pretty cool . It 's got some brilliant views , and the wifi is surprisingly good . Even countless miles away from the Earth , I can crush Silver noobs on CS GO . <br> <br> I pass by Dale Malkowitz , the head scientist on board . <br> <br> `` Evening , Dale , '' I say . <br> <br> `` What up , Danny ? '' he replies cordially . <br> <br> `` Nothin ' much . A little bored , I guess . '' <br> <br> He shakes his head in disbelief . `` I really , *really* do n't understand how you can be bored in space . '' <br> <br> `` Well hey , '' I say slightly defensively , `` Aside from the views , it 's kinda ... dull . And empty . And stuff . '' <br> <br> `` Whatever you say , Wittell , '' he says , not unkindly . Then he walks off . A few moments pass , and then I decide to look out the window right by me . As my eyes scan the inky blackness of space ( again ) , I notice something odd about the moon 's surface . It 's slightly ... cracked . <br> <br> `` Hey , Malkowitz ? '' I call out , `` You might wan na check this out ! '' <br> <br> He walks over to me casually , probably expecting nothing . `` What ? '' he asks , `` What do you see ? '' <br> <br> I point at the moon . His brow furrows . `` Huh ... I guess there 's something up with the surface . I 'll have to look into tha- '' <br> <br> Suddenly , the surface cracks a little more . We glance at each other , and then back at the moon , and then at each other again , and then back at the moon again . <br> <br> `` What 's going on ? '' I ask , alarmed . <br> <br> He 's silent for a minute or two , mouth hanging open . 
Then , he calls out : `` Janice ! Terry ! Johnny ! Get over here ! Something 's up with the moon . '' <br> <br> The other crewmates enter , unsure of what to expect . As their eyes lay upon the moon 's surface cracks , they widen . <br> <br> And , by coincidence , more cracks appear at that very moment . And then more . And more . And more . And more ... <br> <br> Little bits of the moon begin to float away , torn free of the rest of the surface . We all stare , speechless . And then ... it happens . It *happens* . <br> <br> The side of the moon facing us is ... torn away by a ... <br> <br> Human ... hand ? <br> <br> And we see ... <br> <br> A giant ... human face ? ! <br> <br> Surprisingly , I can hear my thoughts over my racing heart . *I ca n't help but feel as if I recognize that face ... from the ... * <br> <br> *Internet . * <br> <br> Suddenly , the great face 's lips move . <br> <br> Of course , none of us can actually *hear* it speak , because of the laws of space and whatnot . However , I can read its lips , and it appears to be saying : <br> <br> `` Are you sure about that ? ''<br></td>\n",
       "      <td style=\"text-align:left\">train</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "                  </body>\n",
       "                </html\n",
       "                >"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_data(ds[ds[\"split\"] == \"train\"].iloc[:2][[\"splitLineIndex\", \"prompt\", \"story\", \"split\"]]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Valid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<html><head><title>HTML Pandas Dataframe with CSS</title></head><link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/><body>\n",
       "                    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th style=\"text-align:left\"></th>\n",
       "      <th style=\"text-align:left\">splitLineIndex</th>\n",
       "      <th style=\"text-align:left\">prompt</th>\n",
       "      <th style=\"text-align:left\">story</th>\n",
       "      <th style=\"text-align:left\">split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">0</th>\n",
       "      <td style=\"text-align:left\">0</td>\n",
       "      <td style=\"text-align:left\">[ WP ] Every person in the world undergoes a `` goodness '' test . It 's designed to give a score from 1 to 200 , where 1 is pure evil , and 200 is an angel in human body . Then the world is divided into 200 zones , where people can live among their own kind .<br></td>\n",
       "      <td style=\"text-align:left\">Clancy Marguerian , 154 , private first class of the 150+ army , sits in his foxhole . Tired cold , wet and hungry , the only thing preventing him from laying down his rifle and walking towards the enemy lines in surrender is the knowledge that however bad he has it here , life as a 50-100 POW is surely much worse . He 's fighting to keep his eyes open and his rifle ready when the mortar shells start landing near him . <br> <br> He hunkers lower . <br> <br> After a few minutes under the barrage , Marguerian hears hurried footsteps , a grunt , and a thud as a soldier leaps into the foxhole . The man 's uniform is tan , he must be a 50-100 . <br> <br> The two men snarl and grab at eachother , grappling in the small foxhole . Abruptly , their faces come together . <br> <br> `` Clancy ? '' <br> <br> `` Rob ? '' <br> <br> Rob Hall , 97 , Corporal in the 50-100 army grins , as the situation turns from life or death struggle , to a meeting of two college friends . He lets go of Marguerian 's collar . <br> <br> `` Holy shit Clancy , you 're the last person I expected to see here '' <br> <br> `` Yeah '' <br> <br> `` Shit man , I did n't think I 'd ever see 'Mr . volunteers every saturday morning at the food shelf ' , not after The Reorganization at least '' <br> <br> `` Yeah Rob , it is something is n't it '' <br> <br> `` Man , I 'm sorry I tried to kill you there , hey , I heard you guys were out of food , here , you can share my dinner '' <br> <br> Clancy marvels , even after all this : The Reorganization , the coalitions , the war , Rob is still his old , chatty self . <br> <br> The two men sit , Rob chatting away , Clancy forcing out pleasantries . They pass Rob 's rations between them . <br> <br> <br> `` Clancy my man , I heard a group of terrorist 5 's took have formed some kind of cult , and they 're rallying all the &lt; 50 in their own coalition '' <br> <br> `` Oh yeah ? 
'' <br> <br> `` Yeah , I mean , that sucks and everything , cause those are some scary dudes , but I heard that there 's going to be a truce between our countries in a few days , why do n't we just hang out here , pretty soon we wo n't even be enemies anymore ! '' <br> <br> `` Yeah , Rob , that sounds like a plan '' <br> <br> `` Man , I 'm so glad I found you again , in a few days , this war will be over , and things will be cool between us and , hey , remember Sarah ? I heard she 's a 151 , maybe I 'll look her up , I 'll be sure to visit you too once I can get a pass to sector 150-155 , it 'll probably be tough though , even before the war , you had to do sooo much paperwork to be allowed to visit , I wonder if passes will even be reinstated after the truce ends , hey , did I ever tell you about the time ... '' <br> <br> Rob babbles as he dozes off , grinning up at Clancy . <br> <br> When Clancy is sure that his friend is asleep , he slits Rob 's throat with his bayonet . Clancy climbs out of the foxhole , and stumbles his way back to battalion HQ .<br></td>\n",
       "      <td style=\"text-align:left\">valid</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">1</th>\n",
       "      <td style=\"text-align:left\">1</td>\n",
       "      <td style=\"text-align:left\">[ WP ] Space mining is on the rise . The Space tanker Exxon Valdez 2.0 crash and spill its cargo . Write a news story covering the event .<br></td>\n",
       "      <td style=\"text-align:left\">„… and the little duckling will never be able to walk again. ” <br> <br> The artificial intelligence paused a moment for dramatic effect before continuing with its broadcast with a different voice . <br> <br> “ What a hearth breaking story , Frank . But now to another story that may leave you feel equally dirty . The automated space tanker Exxon Valdez 2.0 collided with an asteroid on its way to the Jupiter moon Ganymede . According to the ship owner the ship is out of control and leaking its content into space. ” <br> <br> “ That ’ s right , Fred . And the content of the ship has it in it , as they say ” , the computer said in first voice again , “ The whole tanker was filled with ‘ biological waste products ’ coming from research and mining stations in the Kuiper Belt. ” <br> <br> “ Biological waste products ? You don ’ t mean ... ” <br> <br> “ Yes , Fred ! ” Dramatic pause . “ I am talking about poop . Lots of it . And apparently it ’ s spilling everywhere. ” <br> <br> “ Better call the plumbers , Frank. ” <br> <br> “ Not any time soon , Fred . A spokesperson of the ship owner stated and I quote – ‘ Space is kind of big and empty , we expect no one to care , so why should we ? ’ Apparently they will just build a new ship and be done with it. ” <br> <br> “ That ’ s one way not to deal with the problem . But why doesn ’ t the ship fly home ? Shouldn ’ t the AI on board be able to handle such a problem ? ” <br> <br> “ Well , the issue is that the part in charge to deal with asteroid impacts like that has been impacted by the asteroid. ” <br> <br> “ Ouch . Talk about a bad run. ” <br> <br> “ True , especially if you take the name of the ship in consideration. ” <br> <br> “ Oh ? Exxon Valdez 2.0 it was , isn ’ t that right , Frank ? ” <br> <br> “ You ’ re absolutely right , Fred . Did you know the ship was named after an infamous ship of the twentieth century back on old Earth ? 
Apparently the Exxon Valdez of old was used for transporting petroleum across the oceans of Earth . Petroleum , as some of our listeners might not know , was a brownish black , gooey liquid comprised of biological matter which was transformed under high pressure for millions of years . Quite ironically the Exxon Valdez was infamous for crashing and spilling its cargo. ” <br> <br> “ Well , talk about making a bad name for yourself . Now both ships will go down in history for spilling black gooey stuff where it doesn ’ t belong . Who had that bright idea for such a name anyway ? ” <br> <br> “ Well , Fred , the company made its first plunder by holding a naming contest on the internet. ” <br> <br> “ Oh , will they ever learn ? ” <br> <br> “ Apparently not , Fred . Predictably someone tried to make a joke out of it . A niche side of history role players got wind of the contest and made it its goal to get it named after the infamous Exxon Valdez . Apparently they thought it would be funny , and given the content both ships were ferrying around , they might have a point. ” <br> <br> “ Funny , indeed , Frank . What ’ s the name of the side ? ” <br> <br> “ Well , Fred , it ’ s called Reddit . The people there mostly talk in outdated lingo and memes and watch cat pictures back from a time when the internet only was local on Earth. ” <br> <br> “ Truly a herald of the dark ages. ” <br> <br> “ You might be right about that , Fred . I assume they just thought it was funny . I guess this happens , when you let the internet decide on things. ” <br> <br> “ Well , Frank , when you think about the content both ships were ferrying around , they might have been right . Embarrassing for the company , but funny for everyone else. ” <br> <br> “ It might get worse than that , Fred . Environmentalists are up in arms . 
They claim that the human waste products spilling out of the ship might collide with Jupiter ’ s moon Europa within the next few millennia and might contaminate the biospheres with Earth life . Apparently there are a lot of bacteria and the likes in poop and some might be able survive the harsh conditions of space and end up impacting on the restricted moon. ” <br> <br> “ Oh dear , Frank , does the Monolith know about it yet ? I am sure it won ’ t let us hear the end of it. ” <br><br></td>\n",
       "      <td style=\"text-align:left\">valid</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "                  </body>\n",
       "                </html\n",
       "                >"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_data(ds[ds[\"split\"] == \"valid\"].iloc[:2][[\"splitLineIndex\", \"prompt\", \"story\", \"split\"]]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<html><head><title>HTML Pandas Dataframe with CSS</title></head><link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/><body>\n",
       "                    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th style=\"text-align:left\"></th>\n",
       "      <th style=\"text-align:left\">splitLineIndex</th>\n",
       "      <th style=\"text-align:left\">prompt</th>\n",
       "      <th style=\"text-align:left\">story</th>\n",
       "      <th style=\"text-align:left\">split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">0</th>\n",
       "      <td style=\"text-align:left\">0</td>\n",
       "      <td style=\"text-align:left\">[ WP ] Leonardo DiCaprio in a fit of rage begins to torpedo his own career by deliberately acting poorly and taking on bad films . He finally wins an oscar for starring in Paul Blart : Mall Cop 3 .<br></td>\n",
       "      <td style=\"text-align:left\">The wet marble floor pressed on his cheek like a thousand hands slapping his face frozen in time . Smattering piss of rain ignored his indignant mumblings . His eyes fluttered . Pins and needs ran from finger to shoulder as he pushed back against the floor , contorting his aching body into a cross legged position . Last night was bad . He gathered that . His routine dullness of though crept inwards from the edges of his mind toward the black mist that veiled his most recent memories . He struggled to recall whatever he could n't recall but only for a moment before he decided it probably was n't worth the effort . <br> He glanced around the room for a few minutes before concluding that he probably did n't know where he was . His investigation was n't entirely fruitless , he discovered a mostly full bottle of vodka . It was cheap but would definitely get the job done . Taking a few swigs made it childishly easy to ignore that gigantic black cloud of fog blotting out whatever the hell he did before he woke up . <br> There was a mirror in the room and for want of anything more interesting to study he gazed at himself . It was a game he 'd play with himself , glancing at the mirror and seeing if he could recognize the person looking back . If he did n't know better he 'd have guessed he was a very successful mattress salesman , or perhaps a bum who had managed to score some luck gambling . <br> His face was portly and unshaven , in that limbo place where it had been too many days without being clean and too few days to become a beard . His stomach was round but firm , like a basketball stuffed under a shirt and then semi deflated . The hair was long and unruly , receding far into the past . But his eyes were the giveaway . Looking closely enough at them he could still see an intensity . It was n't the sharp kind he carried in his youth but rather like a rusted dagger . Still sharp enough to cut . <br> `` DiCaprio . 
'' The curse rasped out of him in a choke . After all these years spent working on the hallmark channel and tv series based on mediocre movies he was still there . Despite his best efforts to bury himself under all of the alchol and drugs he was still in there . He thought for sure after the bankruptcy he 'd be done , but no that god damned rerelease of Titanic the royalties started pouring in and he could n't get rid of the money . Not even the live action version of the nut job could destroy him . <br> Cursing he hurled the bottle at the mirror but his wet hands slipped and instead of a shattering crash there was only a thud as the bottle bounced off the dry wall and rolled on the floor . <br> His rage thwarted by his impotence he slumped against the floor and finally noticed why there was rain coming into this room . <br> <br> The window was smashed . He looked at the bottle , confused . No , he had n't done that . At least not with the vodka . He looked back at the glass etched around the window sill and his eyes hung on the red that stained the jagged teeth . <br> <br> The headache crept back towards the front of his mind while the bloody glass pinned his eyes in place . What the fuck happened last night ?<br></td>\n",
       "      <td style=\"text-align:left\">test</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">1</th>\n",
       "      <td style=\"text-align:left\">1</td>\n",
       "      <td style=\"text-align:left\">[ CW ] Kill the writer in first-person narrative .<br></td>\n",
       "      <td style=\"text-align:left\">It 's been three days since my boyfriend pissed off the neighbors . <br> <br> They had to be pissed , he called the police on them . The neighbors had been harboring a runaway criminal . We did n't live in a bad neighborhood , there were families and good people living here with solid steady jobs . They cared about their yards and such . But , there was a bad egg , our neighbors to the south of us were shady . We could hear them yelling at their dog many times a week . Strange smoke often came out of their house , and the lights in the garage were on at odd hours . We never had proof until now that our concerns are legitimate . <br> <br> The car the escaped criminal was driving had been parked at the neighbor 's house and my boyfriend decided he should turn them in . This lead to the police parking in front of *our* house , and watching them through our bedroom window for hours until they caught him . They had to know it was us . And it freaked me out . <br> <br> I had started tucking my pink taser in my jacket pocket when I took my miniature Yorkie out to go potty . My neighbor to the north , Jay , seemed to notice my tension , so when he saw me step outside , he 'd come out and chat with me . He 'd ask me about work , and talk to me about his latest construction jobs . Jay always pretend to be grabbing something out of his massive pick-up truck . It usually followed the same pattern - he grabs something out of his truck , sees me out with my dog , then starts in on how it baffles him how such a tiny dog was smarter than most of the people he worked with . We 'd both gripe about our jobs and laugh about stupid customers , chase the puppy down when she tried to go after squirrels , and then part ways until the next potty break . <br> <br> The sun was beginning to set when my dog started doing her potty dance by the door . I put on my jacket , slipped my taser in my pocket , and opened the door . 
She bolted out the door and went straight for the squirrel sniffing around the sidewalk . <br> <br> `` NO ! BAD GIRL , COME HERE ! '' The squirrel started running across the road and her tiny legs skittered out of it . I ran after her , swearing as I tripped over a crack in the road . I felt a snap in my ankle and I went down . <br> <br> The roar of a large pick-up engine was too close and I did n't know what to look at - my little dog bouncing across the neighbor 's lawn , or the tires that were n't slowing down fast enough . I chose neither and closed my eyes . <br> <br> The last thing I heard was the clatter of of work boots and Jay voice cracking , `` Oh god , oh god , oh god ... '' <br> <br><br></td>\n",
       "      <td style=\"text-align:left\">test</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "                  </body>\n",
       "                </html\n",
       "                >"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_data(ds[ds[\"split\"] == \"test\"].iloc[:2][[\"splitLineIndex\", \"prompt\", \"story\", \"split\"]]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Augmentation "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Triage Prompts\n",
    "\n",
    "1. Take the prompts list order by frequency\n",
    "2. Define regex patterns for prompt and constraint\n",
    "3. Generate prompts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_rep = ds.groupby([\"prompt\", \"split\"]).size().reset_index().rename(columns={0: \"records\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_rep = df_rep[df_rep[\"records\"] > 20].sort_values([\"records\"], ascending=False)\n",
    "# _str = df_rep[df_rep['records']>20].sort_values(['records'], ascending=False).iloc[1,0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "topPrompts20Reps = df_rep[df_rep[\"records\"] > 20].sort_values([\"records\"], ascending=False)[\"prompt\"].tolist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['[ WP ] Write the letter that you always wanted to , but never did .\\n',\n",
       " \"[ WP ] There is no prompt . Just write a story you 've always been thinking about or one you 've been thinking about sharing . Anything goes .\\n\",\n",
       " \"[ WP ] This is the prologue ( or the first chapter ) of the novel you 've always wanted to write .\\n\",\n",
       " '[ WP ] Write a short story where the first sentence has 20 words , 2nd sentence has 19 , 3rd has 18 etc . Story ends with a single word .\\n',\n",
       " \"[ WP ] Killing Hitler has become a sport amongst time travelers . Points are awarded for creativity and difficulty . You are last year 's champion , how did you win ?\\n\"]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "topPrompts20Reps[:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# df_rep[df_rep[\"split\"] == \"valid\"].iloc[1:3, 0]\n",
    "# topPrompts20Reps += df_rep[df_rep[\"split\"] == \"valid\"].iloc[1:3, 0].to_list()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "We found 1015 prompts having more than 20 stories\n"
     ]
    }
   ],
   "source": [
    "print(f\"We found {len(topPrompts20Reps)} prompts having more than 20 stories\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "PROMPT_PATTERNS = \"(Lucifer\\snever[\\s\\w,]+)|\\\n",
    "([\\. \\w,]+)\\.\\s+Tell me|\\\n",
    "(All injuries[\\. \\w,]+)\\.|\\\n",
    "(?<!\\])(At your[\\. \\w,]+)\\.|\\\n",
    "Daily Prompt \\: ([\\. \\w,]+)|\\\n",
    "In 100 words or less , ([\\. \\w,]+)\\.|\\\n",
    "(Last words/thoughts[\\. \\w,]+)\\.|\\\n",
    "(Magic is Hereditary.*) \\[|\\\n",
    "word limit (\\) [\\. \\w,\\/]+) \\.|\\\n",
    "(Make me love the person you love)|\\\n",
    "(Pack a punch) in 150 words|\\\n",
    "(The last man on earth[\\. \\w,\\/]+kill himself)|\\\n",
    "(The year is 2352 [\\. \\w,\\/'-]+)\\.|\\\n",
    "(A person dies[\\. \\w,\\/]+)\\.?|\\\n",
    "^[wW]rite a story([\\. \\w,\\/]+) |\\\n",
    "^[wW]rite about ([\\. \\w,\\/-]+)\\.?|\\\n",
    "^Writing Prompt (?:\\: [wW]rite|\\\n",
    "\\[ WP \\]) ([\\. \\w,\\/']+) ?|\\\n",
    "^(You 're a[\\. \\w,\\/']+)|\\\n",
    "(You 're moments[\\. \\w,\\/']+)\\.|\\\n",
    "(Describe the room you [\\. \\w\\/']+)|\\\n",
    " (Get me hooked \\. [ \\w,\\/']+)|\\\n",
    "[\\. \\w\\/',\\`]+ , (tell a horror story)|\\\n",
    "(Make me cry)|\\\n",
    "(Make me hate your character)|\\\n",
    "(Most responses on here have a twist[\\. \\w\\/',\\`;]+)|\\\n",
    "(Pick your favorite[\\(\\)\\. \\w\\/',\\`;]+beginning)|\\\n",
    "(Start your story[\\(\\)\\. \\w\\/',\\`;]+meanings \\.)|\\\n",
    "(The [\\. \\w\\/',\\`;]+ reader)|\\\n",
    "(Two people[\\. \\w,\\/']+bench)|\\\n",
    "Write (a gruesome story)|\\\n",
    "Write (a möb[\\. \\w,\\/']+story) that|\\\n",
    "(Write the letter [ ,\\w]+) |\\\n",
    "There is no prompt[ \\.\\w]+(you[ \\.\\w']+\\.)|\\\n",
    "(A peaceful alien race[ \\.\\w'-]+)\\.|\\\n",
    "(This is the prologue[\\(\\) \\.\\w'-]+)\\.|\\\n",
    "Write a short story where (the first[\\(\\) \\.\\w'-,]+)\\.|\\\n",
    "(Write the first and last paragraph[\\(\\) \\.\\w'-,]+)\\.|\\\n",
    "(Killing Hitler has[\\(\\) \\.\\w'-,\\?]+)|\\\n",
    "(You live in a city full[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "\\`\\` She said she loved him . [\\`'\\(\\) \\.\\w'-,\\?\\#]+\\.|\\\n",
    "(A soldier on the front dies[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You discover a grand hall[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(A boy asks a girl out . It 's high[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(When everyone turns 18 , they receive a pet[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(To get in Heaven , you have to [\\/\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You are born without emotions [;\\/\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You are a teenager with the ability[\\`;\\/\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You live in a world where every person [\\`;\\/\\(\\) \\.\\w'-,\\?\\#]+)\"\n",
    "\n",
    "\n",
    "CONST_PATTERNS = \"Daily Prompt \\: [\\. \\w,]+\\[ ([\\. \\w,\\:]+)|\\\n",
    "(In 100 words or less) , ([\\. \\w,\\:]+) \\.|\\\n",
    "Make a story \\( ([\\. \\w,\\:]+) |\\\n",
    "Pack a punch (in 150 words)|\\\n",
    "Describe the room you [\\. \\w\\/']+([\\. \\w,\\:\\/]+)\\.|\\\n",
    "Get me hooked \\. Reel me in \\. ([\\. \\w\\/',\\`]+)\\.|\\\n",
    " ([\\. \\w\\/',\\`]+) , tell a horror story|\\\n",
    "Make me cry ([ \\w\\/',\\`]+).?|\\\n",
    "(in 150 words or less)|\\\n",
    "Pick your favorite[\\(\\)\\. \\w\\/',\\`;]+beginning \\. ([ \\w\\/',\\`]+)|\\\n",
    "Start your story[\\(\\)\\. \\w\\/',\\`;]+meanings \\.([ \\w\\/',\\`]+\\.)|\\\n",
    "The [\\. \\w\\/',\\`;]+ reader ,([\\. \\w\\/',\\`;]+)|\\\n",
    "Two people[\\. \\w,\\/']+bench \\. ([\\. \\w,\\:]+)|\\\n",
    "Write a gruesome story ([\\. \\w,\\:]+)|\\\n",
    "Write a möb[\\. \\w,\\/']+story (that[\\. \\w,\\/']+)\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add summary columns to data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install spacy -qqq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We aim to augment data as following:\n",
    "* Prompt: \n",
    "  * whole\n",
    "  * + constraints\n",
    "* Story:\n",
    "  * whole\n",
    "  * beginning\n",
    "  * middle - sliding window summarized\n",
    "  * end"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Summarization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install transformers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "# @markdown utils\n",
    "from transformers.utils.logging import set_verbosity\n",
    "\n",
    "set_verbosity(40)\n",
    "\n",
    "import warnings\n",
    "\n",
    "# ignore hf pipeline complaints\n",
    "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"transformers\")\n",
    "warnings.filterwarnings(\"ignore\", category=FutureWarning, module=\"transformers\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from transformers import pipeline\n",
    "\n",
    "summarizer = pipeline(\n",
    "    \"summarization\",\n",
    "    \"pszemraj/long-t5-tglobal-base-16384-book-summary\",\n",
    "    device=0 if torch.cuda.is_available() else -1,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "params = {\n",
    "    \"max_length\": 1024,\n",
    "    \"min_length\": 8,\n",
    "    \"no_repeat_ngram_size\": 3,\n",
    "    \"early_stopping\": False,\n",
    "    \"repetition_penalty\": 3.5,\n",
    "    \"length_penalty\": 0.3,\n",
    "    \"encoder_no_repeat_ngram_size\": 3,\n",
    "    \"num_beams\": 4,\n",
    "}  # parameters for text generation out of model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Interpolation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "import spacy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper functions\n",
    "\n",
    "import re\n",
    "\n",
    "\n",
    "def extract_prompt_parts(prompt, pattern):\n",
    "    \"\"\"\n",
    "    takes a prompt and some parts that matches to patern\n",
    "    \"\"\"\n",
    "    pattern = pattern.replace(\"\\\\\\n\", \"\\\\\")\n",
    "    if m := re.search(pattern, prompt, re.IGNORECASE):\n",
    "        if len(m.groups()) > 0:\n",
    "            return m.group(0)\n",
    "    return None\n",
    "\n",
    "\n",
    "from spacy.lang.en import English\n",
    "\n",
    "\n",
    "def get_sentences(_str):\n",
    "    chunks = _str.split(\"\\n\")\n",
    "    sentences = []\n",
    "    nlp = English()\n",
    "    nlp.add_pipe(\"sentencizer\")\n",
    "    for chunk in chunks:\n",
    "        doc = nlp(chunk)\n",
    "        sentences += [sent.text.strip() for sent in doc.sents]\n",
    "    return sentences\n",
    "\n",
    "\n",
    "from itertools import islice\n",
    "\n",
    "\n",
    "def window(seq, n=2):\n",
    "    it = iter(seq)\n",
    "    result = tuple(islice(it, n))\n",
    "    if len(result) == n:\n",
    "        yield \" \".join(result)\n",
    "    for elem in it:\n",
    "        result = result[1:] + (elem,)\n",
    "        yield \" \".join(result)\n",
    "\n",
    "\n",
    "def extract_story_parts(story):\n",
    "    sentences = get_sentences(story)\n",
    "    beginning = sentences.pop(0)\n",
    "    middles = window(sentences, 4)\n",
    "    ending = sentences.pop(-1)\n",
    "    return beginning, middles, ending\n",
    "\n",
    "\n",
    "def clear_prompt(prompt):\n",
    "    return re.sub(r\"^[Ww]rite \", \"\", prompt)\n",
    "\n",
    "\n",
    "def get_sample_dict(split, id, text):\n",
    "    return {\"split\": split, \"splitLineIndex\": id, \"text\": text}\n",
    "\n",
    "\n",
    "def generate_instruction_diologs(df):\n",
    "    dialogs = []\n",
    "    \"\"\"User: What is this story about: {story} -> Rosey: I think it's about: {striped_prompt}\"\"\"\n",
    "    dialogBase = \"\"\"User: write me a story about: {stripped_prompt}\"\"\"\n",
    "    dialog1 = \"\"\" -> Rosey: Sure, here's a story about: {stripped_prompt}:\\n{story}\"\"\"\n",
    "    dialog2 = \"\"\", {stripped_constraint} -> Rosey: Sure, here's a story about: {stripped_prompt}, {stripped_constraint}:\\n{story}\"\"\"\n",
    "    dialog3 = \"\"\", starting with: {beginning} -> Rosey: Sure, here's a story about: {stripped_prompt}, starting with: {beginning}:\\n{story}\"\"\"\n",
    "    dialog4 = \"\"\", ending with: {ending} -> Rosey: Sure, here's a story about {stripped_prompt}: ending with: {ending}\\n{story}\"\"\"\n",
    "    dialog5 = \"\"\", where the middle of the story is about: {middle} -> Rosey: Sure, here's a story about: {stripped_prompt}, where the middle of the story is about: {middle}:\\n{story}\"\"\"\n",
    "\n",
    "    df_rep = df.groupby([\"prompt\"]).size().reset_index().rename(columns={0: \"records\"})\n",
    "    df_rep.sort_values([\"records\"], ascending=False, inplace=True)\n",
    "    pbar = tqdm()\n",
    "    pbar.reset(total=len(df_rep))\n",
    "    for prompt in df_rep.iloc[:, 0]:\n",
    "        strippedPrompt = extract_prompt_parts(prompt, PROMPT_PATTERNS)\n",
    "        if strippedPrompt is None:\n",
    "            continue\n",
    "        strippedPrompt = clear_prompt(strippedPrompt)\n",
    "        strippedConstraint = extract_prompt_parts(prompt, CONST_PATTERNS)\n",
    "\n",
    "        for row in df[df[\"prompt\"] == prompt].itertuples():\n",
    "            try:\n",
    "                story = (\n",
    "                    row.story.replace(\"<newline>\", \"\\n\")\n",
    "                    .replace(\"< newline >\", \"\\n\")\n",
    "                    .replace(\"<new line>\", \"\\n\")\n",
    "                    .strip()\n",
    "                )\n",
    "                beginning, middles, ending = extract_story_parts(story)\n",
    "                dialogBeg = dialogBase.format(stripped_prompt=strippedPrompt)\n",
    "                dialog = dialogBeg + dialog1.format(story=story, stripped_prompt=strippedPrompt)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                if strippedConstraint is not None:\n",
    "                    dialog = dialogBeg + dialog2.format(\n",
    "                        stripped_prompt=strippedPrompt, stripped_constraint=strippedConstraint, story=story\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                dialog = dialogBeg + dialog3.format(stripped_prompt=strippedPrompt, story=story, beginning=beginning)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                dialog = dialogBeg + dialog4.format(stripped_prompt=strippedPrompt, story=story, ending=ending)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                middlesSumarizered = summarizer(middles, **params)\n",
    "                for middle, sumarizedMiddle in zip(middles, middlesSumarizered):\n",
    "                    # dialogs.append(dialogBeg + dialog5.format(stripped_prompt=strippedPrompt, story=story, middle=middle))\n",
    "                    dialog = dialogBeg + dialog5.format(\n",
    "                        stripped_prompt=strippedPrompt, story=story, middle=sumarizedMiddle[0][\"summary_text\"]\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                pbar.update()\n",
    "            except Exception as e:\n",
    "                print(f\"{row.split}/{row.splitIndex}\")\n",
    "                raise e\n",
    "        pbar.refresh()\n",
    "    return dialogs\n",
    "\n",
    "\n",
    "def filter_data(\n",
    "    dataset,\n",
    "    negativeTagFilter=None,\n",
    "    positiveTagFilter=None,\n",
    "    patternFilter=None,\n",
    "):\n",
    "    \"\"\"\n",
    "    > filter_data(dataset['train'],negativeTagFilter=['ip'], positiveTagFilter=['pm'] )\n",
    "    \"\"\"\n",
    "    prompt = dataset[\"prompt\"]\n",
    "    if negativeTagFilter is not None:\n",
    "        prompt = prompt[(prompt[negativeTagFilter] < 1).any(axis=1)]\n",
    "    if positiveTagFilter is not None:\n",
    "        prompt = prompt[prompt[positiveTagFilter].gt(0).all(axis=1)]\n",
    "    if patternFilter is not None:\n",
    "        prompt = prompt[prompt[\"prompt\"].str.contains(patternFilter)]\n",
    "    story = dataset[\"story\"]\n",
    "    story = story.iloc[prompt.index]\n",
    "    return {\"prompt\": prompt, \"story\": story}\n",
    "\n",
    "\n",
    "def generate_instruction_diologs(prompt, df):\n",
    "    dialogs = []\n",
    "    \"\"\"User: What is this story about: {story} -> Rosey: I think it's about: {striped_prompt}\"\"\"\n",
    "    dialogBase = \"\"\"User: write me a story about: {stripped_prompt}\"\"\"\n",
    "    dialog1 = \"\"\" -> Rosey: Sure, here's a story about: {stripped_prompt}:\\n{story}\"\"\"\n",
    "    dialog2 = \"\"\", {stripped_constraint} -> Rosey: Sure, here's a story about: {stripped_prompt}, {stripped_constraint}:\\n{story}\"\"\"\n",
    "    dialog3 = \"\"\", starting with: {beginning} -> Rosey: Sure, here's a story about: {stripped_prompt}, starting with: {beginning}:\\n{story}\"\"\"\n",
    "    dialog4 = \"\"\", ending with: {ending} -> Rosey: Sure, here's a story about {stripped_prompt}: ending with: {ending}\\n{story}\"\"\"\n",
    "    dialog5 = \"\"\", where the middle of the story is about: {middle} -> Rosey: Sure, here's a story about: {stripped_prompt}, where the middle of the story is about: {middle}:\\n{story}\"\"\"\n",
    "\n",
    "    strippedPrompt = extract_prompt_parts(prompt, PROMPT_PATTERNS)\n",
    "    if strippedPrompt is not None:\n",
    "        strippedPrompt = clear_prompt(strippedPrompt)\n",
    "        strippedConstraint = extract_prompt_parts(prompt, CONST_PATTERNS)\n",
    "        pbar = tqdm(ascii=True, desc=\"stories\")\n",
    "        pbar.reset(total=len(df[df[\"prompt\"] == prompt]))\n",
    "        for row in df[df[\"prompt\"] == prompt].itertuples():\n",
    "            try:\n",
    "                story = (\n",
    "                    row.story.replace(\"<newline>\", \"\\n\")\n",
    "                    .replace(\"< newline >\", \"\\n\")\n",
    "                    .replace(\"<new line>\", \"\\n\")\n",
    "                    .strip()\n",
    "                )\n",
    "                dialogBeg = dialogBase.format(stripped_prompt=strippedPrompt)\n",
    "                dialog = dialogBeg + dialog1.format(story=story, stripped_prompt=strippedPrompt)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                if strippedConstraint is not None:\n",
    "                    dialog = dialogBeg + dialog2.format(\n",
    "                        stripped_prompt=strippedPrompt, stripped_constraint=strippedConstraint, story=story\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                beginning, middles, ending = extract_story_parts(story)\n",
    "                if beginning is not None:\n",
    "                    beginning, middles, ending = extract_story_parts(story)\n",
    "                    dialog = dialogBeg + dialog3.format(\n",
    "                        stripped_prompt=strippedPrompt, story=story, beginning=beginning\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                    dialog = dialogBeg + dialog4.format(stripped_prompt=strippedPrompt, story=story, ending=ending)\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                    middlesSumarizered = summarizer(middles, **params)\n",
    "                    for middle, sumarizedMiddle in zip(middles, middlesSumarizered):\n",
    "                        # dialogs.append(dialogBeg + dialog5.format(stripped_prompt=strippedPrompt, story=story, middle=middle))\n",
    "                        dialog = dialogBeg + dialog5.format(\n",
    "                            stripped_prompt=strippedPrompt, story=story, middle=sumarizedMiddle[0][\"summary_text\"]\n",
    "                        )\n",
    "                        dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                pbar.update()\n",
    "            except Exception as e:\n",
    "                print(f\"{row.split}/{row.splitLineIndex}\")\n",
    "                raise e\n",
    "            pbar.refresh()\n",
    "    return dialogs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generate "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It saves parquet every `step` samples to avoid losing work. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## filter dataset to take only prompts with frequency greater than 20 stories.\n",
    "dialogs = []\n",
    "i = 0\n",
    "start = 0\n",
    "step = 10\n",
    "for index in range(start, len(topPrompts20Reps), step):\n",
    "    pbar = tqdm(ascii=True, desc=\"prompt\")\n",
    "    pbar.reset(total=len(topPrompts20Reps[index : index + step]))\n",
    "    for prompt in topPrompts20Reps[index : index + step]:\n",
    "        tmpDialogs = generate_instruction_diologs(prompt, ds)\n",
    "        if tmpDialogs is not None:\n",
    "            dialogs += tmpDialogs\n",
    "        pbar.update()\n",
    "    if len(dialogs) > 0:\n",
    "        pd.DataFrame(dialogs).to_parquet(\"writing-prompts-aug.parquet\")\n",
    "    pbar.refresh()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_parquet(\"writing-prompts-aug.parquet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for split in list(set(df.split)):\n",
    "    df_aux = df[df[\"split\"] == split].iloc[:, 1:]\n",
    "    df_aux.reset_index(inplace=True)\n",
    "    df_aux.iloc[:, 1:].to_parquet(f\"{split}.parquet\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "fastai22",
   "language": "python",
   "name": "fastai22"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "01073391c27d455898ddec5e5b613840": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "02aff4fac4454967b80469f0774e1a6c": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "03209aedabd94b9f97c7ff186d61a1b5": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "03c75c2c3a674154aa1370081c8d2d0c": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "04eee7ef7947484c9a2fb9bb6ff14eec": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "04f0d4dafcee402780ab34cfba03179e": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_7390362a9704413984a47a1d5b262276",
       "IPY_MODEL_b263a25a96f547218983b9e62f2b841c",
       "IPY_MODEL_f8ac6fc3cf284b50bb54c6ade26db5a1"
      ],
      "layout": "IPY_MODEL_aca1b6be80124fd0999892577aee9f1e"
     }
    },
    "05cf82d369674d848d9d2dd50be546ad": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_949e1ca0688f4df39c6f0aee139a8a4b",
      "placeholder": "​",
      "style": "IPY_MODEL_666605f8ef614cc5806b7e2076095746",
      "value": " 27%"
     }
    },
    "09ad8cfb26814f979a82ac73f073d5c2": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_903c2a0ea90043d5ab9c6812ee118c1a",
       "IPY_MODEL_664e94791b1946e1a78bfa93e9ce0b6f",
       "IPY_MODEL_322330e98fc745df9b55a959392c015c"
      ],
      "layout": "IPY_MODEL_02aff4fac4454967b80469f0774e1a6c"
     }
    },
    "0ae446f572cd4bc5b6ac64e5f1aff216": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "0bba8f8e7f754d1eb204db2ceab4aaab": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "0d209a94698d43748bccb06629b1c97a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "14f3ee8a6fa943178e949c45baa7683f": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "161a4ed9fcd04fee984704a6666f5399": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "1924812f3b644648ae3671cb1f8f659f": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "1a774659596145c48dfd1703664ffbaa": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "1c9c8492343e4a86b3977b41abf2c91c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "20beb9b7ad504afba558ed28b6fb242b": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_9a369f1da9f94552960bb42bc895fb4a",
       "IPY_MODEL_4f372c13f77245c49925981c33d1d611",
       "IPY_MODEL_d0381de0ca3a4359a0d2c393e9f64f69"
      ],
      "layout": "IPY_MODEL_c50b53014ab44ef4b196b1a79c1ad61c"
     }
    },
    "2102cf1d8f6b4192b6d45dfbe4e5044d": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "227455685ef746a4845020529c86aca2": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "28085d8a3b4341e5bac2ce7efd9d89d5": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_92fb3795816548ffb336749cf590d335",
      "max": 2422362,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_3402b3a652254e90b3d6ef17dccfe90a",
      "value": 2422362
     }
    },
    "2e3b3d799b5b461d91fb4b2fa64ea7be": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "322330e98fc745df9b55a959392c015c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_7c9fbbe9addd4d4a82a0e7f2a9410af2",
      "placeholder": "​",
      "style": "IPY_MODEL_c4b3a987b2eb4d81a209fe62f1f00459",
      "value": " 792k/792k [00:00&lt;00:00, 10.6MB/s]"
     }
    },
    "33fc1d4498574c1e86e7c336ab3c4a9d": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "3402b3a652254e90b3d6ef17dccfe90a": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "3612d9a6e93348d6b7b98ca7d611eec4": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_1924812f3b644648ae3671cb1f8f659f",
      "placeholder": "​",
      "style": "IPY_MODEL_0bba8f8e7f754d1eb204db2ceab4aaab",
      "value": "Downloading (…)lve/main/config.json: 100%"
     }
    },
    "3dce9197ad544ff2be48248640298d38": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "43b109811c7d42089713ad5c327afc9d": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "43f709c83c424926b92e36acc3c95e1a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "473e0749bada493b90253b7c0a816e59": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "4b9d6ee49ebd4c018d01f8a64fb112e1": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_05cf82d369674d848d9d2dd50be546ad",
       "IPY_MODEL_ce7a65dfe8a04e29b8512044fe994b87",
       "IPY_MODEL_e639d6f2dafd4897a9a5df658cdf68b0"
      ],
      "layout": "IPY_MODEL_0d209a94698d43748bccb06629b1c97a"
     }
    },
    "4cfd63abaee74a1babed15ecc1ee834a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "4f372c13f77245c49925981c33d1d611": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_8b28758bfe16428ca933c9100b7a8b29",
      "max": 2361,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_983d2b1c0515441db135aae6dd217c41",
      "value": 2361
     }
    },
    "5d2a8c459bfc4e24be21ceef1ec86ae0": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "62113a2cac0d499b9acf2a89f1993f9a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "664e94791b1946e1a78bfa93e9ce0b6f": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_2102cf1d8f6b4192b6d45dfbe4e5044d",
      "max": 791656,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_d07b4780b79340c8950e3f12c4d70820",
      "value": 791656
     }
    },
    "666605f8ef614cc5806b7e2076095746": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "7390362a9704413984a47a1d5b262276": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_c3f48d9f38b8419aae37d33b4968c2f5",
      "placeholder": "​",
      "style": "IPY_MODEL_814a8be16bbd4c499b23e931155c6169",
      "value": "Downloading (…)cial_tokens_map.json: 100%"
     }
    },
    "78188eb50348434e92dc947f6baae899": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_e33fd695d0af44dcb068cd168190ec03",
      "max": 1125,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_cf90c2cb43ae481baa3ef13417b1fc4b",
      "value": 1125
     }
    },
    "7c9fbbe9addd4d4a82a0e7f2a9410af2": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "7d29075f6e25436cb7fa531b4f1b92f0": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_03209aedabd94b9f97c7ff186d61a1b5",
      "placeholder": "​",
      "style": "IPY_MODEL_43b109811c7d42089713ad5c327afc9d",
      "value": " 1.12k/1.12k [00:00&lt;00:00, 39.0kB/s]"
     }
    },
    "7fa486f7cf6e41668382b57979928ecd": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "814a8be16bbd4c499b23e931155c6169": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "818222eaa6d64018b9058bcf6531b658": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_2e3b3d799b5b461d91fb4b2fa64ea7be",
      "placeholder": "​",
      "style": "IPY_MODEL_c41b65885a7b46d8b205b7db8e123cf4",
      "value": "Downloading (…)/main/tokenizer.json: 100%"
     }
    },
    "83ad5f094e684a33b03a28fb7b54f1cc": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_818222eaa6d64018b9058bcf6531b658",
       "IPY_MODEL_28085d8a3b4341e5bac2ce7efd9d89d5",
       "IPY_MODEL_ca32d31fb99e4b5990ba6fd33d3e1915"
      ],
      "layout": "IPY_MODEL_62113a2cac0d499b9acf2a89f1993f9a"
     }
    },
    "865eeaa12f9d4ecbb5e38b2b3baaa4cd": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_feefc865187648db9956cafc5914b123",
       "IPY_MODEL_eb27214d49314527aa99ab65e62ac529",
       "IPY_MODEL_a87f3e961e0d486d81bebec195b396a5"
      ],
      "layout": "IPY_MODEL_01073391c27d455898ddec5e5b613840"
     }
    },
    "876761d92c4a47558604f24826fbf276": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "88b58ed1580c4cf195963010c20d5454": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "88c37802c3914ae6ab3e2cff32cfbe87": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_3612d9a6e93348d6b7b98ca7d611eec4",
       "IPY_MODEL_78188eb50348434e92dc947f6baae899",
       "IPY_MODEL_7d29075f6e25436cb7fa531b4f1b92f0"
      ],
      "layout": "IPY_MODEL_88b58ed1580c4cf195963010c20d5454"
     }
    },
    "8b28758bfe16428ca933c9100b7a8b29": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "8bc9ac5c49a445e5b341513efaf58a83": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "903c2a0ea90043d5ab9c6812ee118c1a": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_1a774659596145c48dfd1703664ffbaa",
      "placeholder": "​",
      "style": "IPY_MODEL_1c9c8492343e4a86b3977b41abf2c91c",
      "value": "Downloading (…)&quot;spiece.model&quot;;: 100%"
     }
    },
    "92fb3795816548ffb336749cf590d335": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "949e1ca0688f4df39c6f0aee139a8a4b": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "983d2b1c0515441db135aae6dd217c41": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "9a369f1da9f94552960bb42bc895fb4a": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_227455685ef746a4845020529c86aca2",
      "placeholder": "​",
      "style": "IPY_MODEL_d65137c7ad444b38a2b8fcd1d36c1528",
      "value": "Downloading (…)okenizer_config.json: 100%"
     }
    },
    "a1e32f35ab1c4014aa6903ef043b469c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "a87f3e961e0d486d81bebec195b396a5": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_876761d92c4a47558604f24826fbf276",
      "placeholder": "​",
      "style": "IPY_MODEL_33fc1d4498574c1e86e7c336ab3c4a9d",
      "value": " 990M/990M [00:28&lt;00:00, 32.0MB/s]"
     }
    },
    "aca1b6be80124fd0999892577aee9f1e": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "b263a25a96f547218983b9e62f2b841c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_43f709c83c424926b92e36acc3c95e1a",
      "max": 2201,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_3dce9197ad544ff2be48248640298d38",
      "value": 2201
     }
    },
    "c3f48d9f38b8419aae37d33b4968c2f5": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "c41b65885a7b46d8b205b7db8e123cf4": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "c4b3a987b2eb4d81a209fe62f1f00459": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "c50b53014ab44ef4b196b1a79c1ad61c": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "c6249ce38c8f437f9234faa7081743d4": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "c6da8ecbbf374f0d84e8704546a30c27": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "ca32d31fb99e4b5990ba6fd33d3e1915": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_4cfd63abaee74a1babed15ecc1ee834a",
      "placeholder": "​",
      "style": "IPY_MODEL_14f3ee8a6fa943178e949c45baa7683f",
      "value": " 2.42M/2.42M [00:01&lt;00:00, 1.53MB/s]"
     }
    },
    "ce7a65dfe8a04e29b8512044fe994b87": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_03c75c2c3a674154aa1370081c8d2d0c",
      "max": 1016,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_473e0749bada493b90253b7c0a816e59",
      "value": 274
     }
    },
    "cf90c2cb43ae481baa3ef13417b1fc4b": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "d0381de0ca3a4359a0d2c393e9f64f69": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_ddf56a6653304256bb61c8b69710fbec",
      "placeholder": "​",
      "style": "IPY_MODEL_0ae446f572cd4bc5b6ac64e5f1aff216",
      "value": " 2.36k/2.36k [00:00&lt;00:00, 111kB/s]"
     }
    },
    "d07b4780b79340c8950e3f12c4d70820": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "d65137c7ad444b38a2b8fcd1d36c1528": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "ddf56a6653304256bb61c8b69710fbec": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "e33fd695d0af44dcb068cd168190ec03": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "e639d6f2dafd4897a9a5df658cdf68b0": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_161a4ed9fcd04fee984704a6666f5399",
      "placeholder": "​",
      "style": "IPY_MODEL_8bc9ac5c49a445e5b341513efaf58a83",
      "value": " 273/1016 [2:01:10&lt;8:24:02, 40.70s/it]"
     }
    },
    "eb27214d49314527aa99ab65e62ac529": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_04eee7ef7947484c9a2fb9bb6ff14eec",
      "max": 990446387,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_a1e32f35ab1c4014aa6903ef043b469c",
      "value": 990446387
     }
    },
    "f8ac6fc3cf284b50bb54c6ade26db5a1": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_c6249ce38c8f437f9234faa7081743d4",
      "placeholder": "​",
      "style": "IPY_MODEL_5d2a8c459bfc4e24be21ceef1ec86ae0",
      "value": " 2.20k/2.20k [00:00&lt;00:00, 119kB/s]"
     }
    },
    "feefc865187648db9956cafc5914b123": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_c6da8ecbbf374f0d84e8704546a30c27",
      "placeholder": "​",
      "style": "IPY_MODEL_7fa486f7cf6e41668382b57979928ecd",
      "value": "Downloading (…)&quot;pytorch_model.bin&quot;;: 100%"
     }
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
