{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Text Generation\n",
    "\n",
    "In this notebooks, we train an adapter for **GPT-2** that performs **poem generation**. We use a dataset of poem verses extracted from Project Gutenberg that is [available via HuggingFace datasets](https://huggingface.co/datasets/poem_sentiment).\n",
    "\n",
    "First, let's install all required libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "91hZktTHnvz6",
    "outputId": "890bb8a8-1c99-4ad9-aa6c-38b81a7a3ee3"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Collecting git+https://github.com/hSterz/adapter-transformers.git@notebooks\n",
      "  Cloning https://github.com/hSterz/adapter-transformers.git (to revision notebooks) to /tmp/pip-req-build-yq66v6vw\n",
      "  Running command git clone -q https://github.com/hSterz/adapter-transformers.git /tmp/pip-req-build-yq66v6vw\n",
      "  Running command git checkout -b notebooks --track origin/notebooks\n",
      "  Switched to a new branch 'notebooks'\n",
      "  Branch 'notebooks' set up to track remote branch 'notebooks' from 'origin'.\n",
      "  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
      "  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
      "    Preparing wheel metadata ... \u001b[?25l\u001b[?25hdone\n",
      "Requirement already satisfied, skipping upgrade: tqdm>=4.27 in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (4.41.1)\n",
      "Requirement already satisfied, skipping upgrade: importlib-metadata; python_version < \"3.8\" in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (3.7.2)\n",
      "Requirement already satisfied, skipping upgrade: filelock in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (3.0.12)\n",
      "Requirement already satisfied, skipping upgrade: numpy>=1.17 in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (1.19.5)\n",
      "Requirement already satisfied, skipping upgrade: sacremoses in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (0.0.43)\n",
      "Requirement already satisfied, skipping upgrade: regex!=2019.12.17 in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (2019.12.20)\n",
      "Requirement already satisfied, skipping upgrade: tokenizers<0.11,>=0.10.1 in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (0.10.1)\n",
      "Requirement already satisfied, skipping upgrade: packaging in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (20.9)\n",
      "Requirement already satisfied, skipping upgrade: requests in /usr/local/lib/python3.7/dist-packages (from adapter-transformers==2.0.0a1) (2.23.0)\n",
      "Requirement already satisfied, skipping upgrade: typing-extensions>=3.6.4; python_version < \"3.8\" in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < \"3.8\"->adapter-transformers==2.0.0a1) (3.7.4.3)\n",
      "Requirement already satisfied, skipping upgrade: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < \"3.8\"->adapter-transformers==2.0.0a1) (3.4.1)\n",
      "Requirement already satisfied, skipping upgrade: joblib in /usr/local/lib/python3.7/dist-packages (from sacremoses->adapter-transformers==2.0.0a1) (1.0.1)\n",
      "Requirement already satisfied, skipping upgrade: six in /usr/local/lib/python3.7/dist-packages (from sacremoses->adapter-transformers==2.0.0a1) (1.15.0)\n",
      "Requirement already satisfied, skipping upgrade: click in /usr/local/lib/python3.7/dist-packages (from sacremoses->adapter-transformers==2.0.0a1) (7.1.2)\n",
      "Requirement already satisfied, skipping upgrade: pyparsing>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging->adapter-transformers==2.0.0a1) (2.4.7)\n",
      "Requirement already satisfied, skipping upgrade: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->adapter-transformers==2.0.0a1) (3.0.4)\n",
      "Requirement already satisfied, skipping upgrade: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->adapter-transformers==2.0.0a1) (2.10)\n",
      "Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->adapter-transformers==2.0.0a1) (2020.12.5)\n",
      "Requirement already satisfied, skipping upgrade: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->adapter-transformers==2.0.0a1) (1.24.3)\n",
      "Building wheels for collected packages: adapter-transformers\n",
      "  Building wheel for adapter-transformers (PEP 517) ... \u001b[?25l\u001b[?25hdone\n",
      "  Created wheel for adapter-transformers: filename=adapter_transformers-2.0.0a1-cp37-none-any.whl size=2009422 sha256=191637b46cc2556c1eda5f1920af739aa96a3633cd569252f9e13a163a67259b\n",
      "  Stored in directory: /tmp/pip-ephem-wheel-cache-hziwqfh6/wheels/e2/39/27/3dcf47f4b1bf14923684680dc1f44c2e935944359f16487c86\n",
      "Successfully built adapter-transformers\n",
      "Installing collected packages: adapter-transformers\n",
      "  Found existing installation: adapter-transformers 2.0.0a1\n",
      "    Uninstalling adapter-transformers-2.0.0a1:\n",
      "      Successfully uninstalled adapter-transformers-2.0.0a1\n",
      "Successfully installed adapter-transformers-2.0.0a1\n",
      "Requirement already satisfied: datasets in /usr/local/lib/python3.7/dist-packages (1.5.0)\n",
      "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.7/dist-packages (from datasets) (2.23.0)\n",
      "Requirement already satisfied: importlib-metadata; python_version < \"3.8\" in /usr/local/lib/python3.7/dist-packages (from datasets) (3.7.2)\n",
      "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.7/dist-packages (from datasets) (1.19.5)\n",
      "Requirement already satisfied: dill in /usr/local/lib/python3.7/dist-packages (from datasets) (0.3.3)\n",
      "Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from datasets) (1.1.5)\n",
      "Requirement already satisfied: pyarrow>=0.17.1 in /usr/local/lib/python3.7/dist-packages (from datasets) (3.0.0)\n",
      "Requirement already satisfied: tqdm<4.50.0,>=4.27 in /usr/local/lib/python3.7/dist-packages (from datasets) (4.41.1)\n",
      "Requirement already satisfied: xxhash in /usr/local/lib/python3.7/dist-packages (from datasets) (2.0.0)\n",
      "Requirement already satisfied: multiprocess in /usr/local/lib/python3.7/dist-packages (from datasets) (0.70.11.1)\n",
      "Requirement already satisfied: fsspec in /usr/local/lib/python3.7/dist-packages (from datasets) (0.8.7)\n",
      "Requirement already satisfied: huggingface-hub<0.1.0 in /usr/local/lib/python3.7/dist-packages (from datasets) (0.0.7)\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (2020.12.5)\n",
      "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (1.24.3)\n",
      "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (3.0.4)\n",
      "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->datasets) (2.10)\n",
      "Requirement already satisfied: typing-extensions>=3.6.4; python_version < \"3.8\" in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < \"3.8\"->datasets) (3.7.4.3)\n",
      "Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < \"3.8\"->datasets) (3.4.1)\n",
      "Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas->datasets) (2.8.1)\n",
      "Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas->datasets) (2018.9)\n",
      "Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from huggingface-hub<0.1.0->datasets) (3.0.12)\n",
      "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas->datasets) (1.15.0)\n"
     ]
    }
   ],
   "source": [
    "!pip install -U adapter-transformers\n",
    "!pip install datasets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Fe_W__U6uFBs"
   },
   "source": [
    "Next, we need to download the dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "zBuzA0h3n4-z",
    "outputId": "aded08bc-6647-43d9-c2e3-0bdd831e3a33"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using custom data configuration default\n",
      "Reusing dataset poem_sentiment (/root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DatasetDict({\n",
      "    train: Dataset({\n",
      "        features: ['id', 'verse_text', 'label'],\n",
      "        num_rows: 892\n",
      "    })\n",
      "    validation: Dataset({\n",
      "        features: ['id', 'verse_text', 'label'],\n",
      "        num_rows: 105\n",
      "    })\n",
      "    test: Dataset({\n",
      "        features: ['id', 'verse_text', 'label'],\n",
      "        num_rows: 104\n",
      "    })\n",
      "})\n"
     ]
    }
   ],
   "source": [
    "from datasets import load_dataset\n",
    "\n",
    "dataset = load_dataset(\"poem_sentiment\")\n",
    "print(dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "neZt29NKuT11"
   },
   "source": [
    "Before training, we need to preprocess the dataset. We tokenize the entries in the dataset and remove all columns we don't need to train the adapter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "d3Lz6h9Zo9t0",
    "outputId": "9fdb9f57-6aaf-4d09-aa78-07e4049029f6"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading cached processed dataset at /root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56/cache-d6ab54fa3fbc78bd.arrow\n",
      "Loading cached processed dataset at /root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56/cache-6b5d3f18db518d3d.arrow\n",
      "Loading cached processed dataset at /root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56/cache-5f751c4a95f6b033.arrow\n"
     ]
    }
   ],
   "source": [
    "from transformers import GPT2Tokenizer\n",
    "\n",
    "def encode_batch(batch):\n",
    "  \"\"\"Encodes a batch of input data using the model tokenizer.\"\"\"\n",
    "  encoding = tokenizer(batch[\"verse_text\"])\n",
    "  # For language modeling the labels need to be the input_ids\n",
    "  #encoding[\"labels\"] = encoding[\"input_ids\"]\n",
    "  return encoding\n",
    "\n",
    "tokenizer = GPT2Tokenizer.from_pretrained(\"gpt2\")\n",
    "# The GPT-2 tokenizer does not have a padding token. In order to process the data \n",
    "# in batches we set one here \n",
    "tokenizer.pad_token = tokenizer.eos_token\n",
    "column_names = dataset[\"train\"].column_names\n",
    "dataset = dataset.map(encode_batch, remove_columns=column_names, batched=True)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jqJWsH5_ENkb"
   },
   "source": [
    "Next, we concatenate the documents in the dataset and create chunks with a length of `block_size`. This is beneficial for language modeling."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "3ff5lRmnA6R7",
    "outputId": "afc8b3c0-d70d-4360-f24f-6bffba720957"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Loading cached processed dataset at /root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56/cache-3f8369251f49a308.arrow\n",
      "Loading cached processed dataset at /root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56/cache-6f7b971f033f4873.arrow\n",
      "Loading cached processed dataset at /root/.cache/huggingface/datasets/poem_sentiment/default/1.0.0/f4990808f049126bcea572bba70613313212cd45f3b12a3e5586135e2de42f56/cache-5b3fb38a938eecda.arrow\n"
     ]
    }
   ],
   "source": [
    "block_size = 50\n",
    "# Main data processing function that will concatenate all texts from our dataset and generate chunks of block_size.\n",
    "def group_texts(examples):\n",
    "  # Concatenate all texts.\n",
    "  concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}\n",
    "  total_length = len(concatenated_examples[list(examples.keys())[0]])\n",
    "  # We drop the small remainder, we could add padding if the model supported it instead of this drop, you can\n",
    "  # customize this part to your needs.\n",
    "  total_length = (total_length // block_size) * block_size\n",
    "  # Split by chunks of max_len.\n",
    "  result = {\n",
    "    k: [t[i : i + block_size] for i in range(0, total_length, block_size)]\n",
    "    for k, t in concatenated_examples.items()\n",
    "  }\n",
    "  result[\"labels\"] = result[\"input_ids\"].copy()\n",
    "  return result\n",
    "\n",
    "dataset = dataset.map(group_texts,batched=True,)\n",
    "\n",
    "dataset.set_format(type=\"torch\", columns=[\"input_ids\", \"attention_mask\", \"labels\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AT56RfOGFXvt"
   },
   "source": [
    "Next, we create the model and add our new adapter. Let's just call it `poem` since it is trained to create new poems. Then we activate it and prepare it for training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "id": "ioLpFbOfnPE6"
   },
   "outputs": [],
   "source": [
    "from transformers import AutoModelForCausalLM\n",
    "\n",
    "model = AutoModelForCausalLM.from_pretrained(\"gpt2\")\n",
    "# add new adapter\n",
    "model.add_adapter(\"poem\")\n",
    "# activate adapter for training\n",
    "model.train_adapter(\"poem\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vkCUz8B6Fw5E"
   },
   "source": [
    "The last thing we need to do before we can start training is creating the trainer. As trainings arguments, we choose a learning rate of 1e-4. Feel free to play around with the parameters and see how they affect the result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "stDwnIEApmNu"
   },
   "outputs": [],
   "source": [
    "from transformers import AdapterTrainer, TrainingArguments\n",
    "training_args = TrainingArguments(\n",
    "  output_dir=\"./examples\", \n",
    "  do_train=True,\n",
    "  remove_unused_columns=False,\n",
    "  learning_rate=5e-4,\n",
    "  num_train_epochs=3,\n",
    ")\n",
    "\n",
    "\n",
    "trainer = AdapterTrainer(\n",
    "        model=model,\n",
    "        args=training_args,\n",
    "        tokenizer=tokenizer,\n",
    "        train_dataset=dataset[\"train\"],\n",
    "        eval_dataset=dataset[\"validation\"], \n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 114
    },
    "id": "Mtjczmcxqv-l",
    "outputId": "1c386f87-f5cd-4791-d4de-4fd8efe74583"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "        <style>\n",
       "            /* Turns off some styling */\n",
       "            progress {\n",
       "                /* gets rid of default border in Firefox and Opera. */\n",
       "                border: none;\n",
       "                /* Needs to be in here for Safari polyfill so background images work as expected. */\n",
       "                background-size: auto;\n",
       "            }\n",
       "        </style>\n",
       "      \n",
       "      <progress value='66' max='66' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      [66/66 04:19, Epoch 3/3]\n",
       "    </div>\n",
       "    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: left;\">\n",
       "      <th>Step</th>\n",
       "      <th>Training Loss</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "  </tbody>\n",
       "</table><p>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {
      "tags": []
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "TrainOutput(global_step=66, training_loss=5.653754956794508, metrics={'train_runtime': 264.1691, 'train_samples_per_second': 0.25, 'total_flos': 19852958822400.0, 'epoch': 3.0, 'init_mem_cpu_alloc_delta': 292637, 'init_mem_cpu_peaked_delta': 14018, 'train_mem_cpu_alloc_delta': 609432, 'train_mem_cpu_peaked_delta': 259598})"
      ]
     },
     "execution_count": 7,
     "metadata": {
      "tags": []
     },
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trainer.train()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3pkNcsRzGCQo"
   },
   "source": [
    "Now that we have a trained adapter we save it for future usage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "id": "_Q_pKmUkqx7P"
   },
   "outputs": [],
   "source": [
    "model.save_adapter(\"adapter_poem\", \"poem\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hjcY1WnaGIxi"
   },
   "source": [
    "Next, let's generate some poetry with our trained adapter. In order to do this, we create a GPT2LMHeadModel that is best suited for language generation. Then we load our trained adapter. Finally, we have to choose the start of our poem. If you want your poem to start differently just change `PREFIX` accordingly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "id": "m4TML7t2mZRp"
   },
   "outputs": [],
   "source": [
    "from transformers import GPT2LMHeadModel, GPT2Tokenizer\n",
    "\n",
    "model = GPT2LMHeadModel.from_pretrained(\"gpt2\")\n",
    "# You can also load your locally trained adapter\n",
    "model.load_adapter(\"adapter_poem\")\n",
    "model.set_active_adapters(\"poem\")\n",
    "\n",
    "PREFIX = \"In the night\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "I0tFdm1RGtma"
   },
   "source": [
    "For the generation, we need to tokenize the prefix first and then pass it to the model. In this case, we create five possible continuations for the beginning we chose."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "h6iiEOwfsdVd",
    "outputId": "eac599d2-2ddf-4d49-e550-6ad7ad14c411"
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    }
   ],
   "source": [
    "encoding = tokenizer(PREFIX, return_tensors=\"pt\")\n",
    "output_sequence = model.generate(\n",
    "  input_ids=encoding[\"input_ids\"],\n",
    "  attention_mask=encoding[\"attention_mask\"],\n",
    "  do_sample=True,\n",
    "  num_return_sequences=5,\n",
    "  max_length = 50,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SZ4uCZaZG68k"
   },
   "source": [
    "Lastly, we want to see what the model actually created. To do this, we need to decode the tokens from ids back to words and remove the EOS tokens. You can easily use this code with another dataset. Don't forget to share your adapters at [AdapterHub](https://adapterhub.ml/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "TX3QpuWPtrol",
    "outputId": "e5260e2e-7681-49a7-d05e-6f4c4f7aeeef",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "=== GENERATED SEQUENCE 1 ===\n",
      "In the night, he would go;and she is the queen, and a mistress,and she keeps in the nightthe king who died\" (the \"giant,\" said the ancient, as a poet)and a child in his home\n",
      "=== GENERATED SEQUENCE 2 ===\n",
      "In the night,when one thinks of the war upon the world, and of men who live in it;that's all you have, though, that's all, that's what you want. and that makes me want, but here's th\n",
      "=== GENERATED SEQUENCE 3 ===\n",
      "In the night, she was the first, for once, the girl of good cheer!--of the people, the love of her life, she has not come to see her sister again;yet i think if i could not have loved her I wer\n",
      "=== GENERATED SEQUENCE 4 ===\n",
      "In the night, she sang the sweetest lullaby of morning-the very sound he heard:the silent and delicate voice of the holy sea,that his face would not come to grief.a quiet and silent night,the song as always i\n",
      "=== GENERATED SEQUENCE 5 ===\n",
      "In the nighttime, the king says:but there can be no peace or sorrow if that night's not a blessing,the only hope to her heart lies in the bright day.a good old fool, like a son of a friend,ho\n"
     ]
    }
   ],
   "source": [
    " for generated_sequence_idx, generated_sequence in enumerate(output_sequence):\n",
    "        print(\"=== GENERATED SEQUENCE {} ===\".format(generated_sequence_idx + 1))\n",
    "        generated_sequence = generated_sequence.tolist()\n",
    "\n",
    "        # Decode text\n",
    "        text = tokenizer.decode(generated_sequence, clean_up_tokenization_spaces=True)\n",
    "        # Remove EndOfSentence Tokens\n",
    "        text = text[: text.find(tokenizer.eos_token)]\n",
    "\n",
    "        print(text)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "05_Text_Generation.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "source": [],
    "metadata": {
     "collapsed": false
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}