{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# References\n",
    "- Optimization of dataflow pipeline: tf.data.(experimental.)AUTOTUNE, dataset.cache(), and dataset.prefetch(buffer_size=AUTOTUNE)\n",
    "    - https://www.tensorflow.org/tutorials/load_data/images#configure_the_dataset_for_performance\n",
    "    - https://www.tensorflow.org/tutorials/load_data/images#configure_dataset_for_performance\n",
    "\n",
    "- More on `dataset.cache()` and `dataset.prefetch()`, including how to cache data to disk, in the data performance guide:\n",
    "    - https://www.tensorflow.org/guide/data_performance#optimize_performance\n",
    "\n",
    "- tf.data.Dataset methods\n",
    "    - https://www.tensorflow.org/api_docs/python/tf/data/Dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# TL;DR\n",
    "   `ds = tfds.load(...)`\n",
    "\n",
    "-> `ds.map(time-consuming maps, num_parallel_calls=tf.data.experimental.AUTOTUNE)`\n",
    "\n",
    "-> `ds.cache()` \n",
    "\n",
    "-> `ds.map(memory-consuming maps, num_parallel_calls=tf.data.experimental.AUTOTUNE)`\n",
    "\n",
    "-> `ds.shuffle(...)` \n",
    "\n",
    "-> `ds.batch(batch_size, num_parallel_calls=tf.data.experimental.AUTOTUNE)` (`num_parallel_calls` in `.batch` only if TFver>=2.5.0)\n",
    "\n",
    "-> `ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)`"
   ]
  },
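  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the ordering above, not the exact pipeline used in the experiments below. `build_pipeline`, `shuffle_buffer`, and the random 8x8 images standing in for `tfds.load(...)` are assumptions made for illustration, and so are the two map functions: rescaling plays the role of the time-consuming map, upsampling the role of the memory-consuming map.\n",
    "\n",
    "```python\n",
    "import tensorflow as tf\n",
    "\n",
    "# tf.data.AUTOTUNE exists from TF 2.4 on; older TF 2.x only has the experimental alias.\n",
    "AUTOTUNE = getattr(tf.data, 'AUTOTUNE', None) or tf.data.experimental.AUTOTUNE\n",
    "\n",
    "def build_pipeline(ds, batch_size=10, shuffle_buffer=100):\n",
    "    # Time-consuming, deterministic map: compute once, then cache the result.\n",
    "    ds = ds.map(lambda x: tf.cast(x, tf.float32) / 255.0, num_parallel_calls=AUTOTUNE)\n",
    "    ds = ds.cache()\n",
    "    # Memory-consuming map: its output is larger than its input, so keep it out of the cache.\n",
    "    ds = ds.map(lambda x: tf.image.resize(x, (16, 16)), num_parallel_calls=AUTOTUNE)\n",
    "    # Shuffle after the cache so that each epoch draws a new order.\n",
    "    ds = ds.shuffle(shuffle_buffer, reshuffle_each_iteration=True)\n",
    "    ds = ds.batch(batch_size)\n",
    "    return ds.prefetch(AUTOTUNE)\n",
    "\n",
    "# Hypothetical stand-in for tfds.load(...): 50 random 8x8 RGB images stored as uint8.\n",
    "images = tf.cast(tf.random.uniform((50, 8, 8, 3), maxval=256, dtype=tf.int32), tf.uint8)\n",
    "demo_ds = build_pipeline(tf.data.Dataset.from_tensor_slices(images))\n",
    "\n",
    "# One full pass; every batch holds 10 upsampled float32 images.\n",
    "shapes = [tuple(batch.shape) for batch in demo_ds]\n",
    "print(len(shapes), shapes[0])\n",
    "```\n",
    "\n",
    "The second `.map` sits after `.cache()` so that the cache holds the small 8x8 images rather than the upsampled ones."
   ]
  },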
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "import tensorflow.keras.applications as tka\n",
    "import tensorflow_datasets as tfds\n",
    "\n",
    "def set_gpu_devices(gpu):\n",
    "    # Expose only the GPU with index `gpu` to TensorFlow and let its memory grow on demand.\n",
    "    physical_devices = tf.config.experimental.list_physical_devices('GPU')\n",
    "    assert len(physical_devices) > gpu, \"Not enough GPU hardware devices available\"\n",
    "    tf.config.experimental.set_visible_devices(physical_devices[gpu], 'GPU')\n",
    "    tf.config.experimental.set_memory_growth(physical_devices[gpu], True)\n",
    "    \n",
    "set_gpu_devices(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Experiment 1: shuffle, batch, and preprocess are all in the data pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\")\n",
    "dste = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"test\")\n",
    "    \n",
    "# Dataset decoration\n",
    "dstr = dstr.map(\n",
    "    lambda x: (preproc(tf.cast(x[\"image\"], dtype=tf.float32)), x[\"label\"]),\n",
    "    num_parallel_calls=tf.data.experimental.AUTOTUNE\n",
    "    )\n",
    "dstr = dstr.cache() \n",
    "dstr = dstr.shuffle(buffer_size=10000, reshuffle_each_iteration=True).\\\n",
    "    batch(batch_size=batch_size, drop_remainder=False)\n",
    "dstr = dstr.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "0 1000\n",
      "0 2000\n",
      "0 3000\n",
      "0 4000\n",
      "0 5000\n",
      "0 6000\n",
      "0 7000\n",
      "0 8000\n",
      "0 9000\n",
      "0 10000\n",
      "0 11000\n",
      "0 12000\n",
      "0 13000\n",
      "0 14000\n",
      "0 15000\n",
      "0 16000\n",
      "0 17000\n",
      "0 18000\n",
      "0 19000\n",
      "0 20000\n",
      "0 21000\n",
      "0 22000\n",
      "0 23000\n",
      "0 24000\n",
      "0 25000\n",
      "0 26000\n",
      "1 0\n",
      "1 1000\n",
      "1 2000\n",
      "1 3000\n",
      "1 4000\n",
      "1 5000\n",
      "1 6000\n",
      "1 7000\n",
      "1 8000\n",
      "1 9000\n",
      "1 10000\n",
      "1 11000\n",
      "1 12000\n",
      "1 13000\n",
      "1 14000\n",
      "1 15000\n",
      "1 16000\n",
      "1 17000\n",
      "1 18000\n",
      "1 19000\n",
      "1 20000\n",
      "1 21000\n",
      "1 22000\n",
      "1 23000\n",
      "1 24000\n",
      "1 25000\n",
      "1 26000\n",
      "17.177741289138794 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## with `dstr.cache()` (recommended)\n",
    "### First run\n",
    "- j = 0: %CPU=430, %MEM=0--5.8\n",
    "- j = 1: %CPU=150, %MEM=5.8\n",
    "- 73 seconds\n",
    "- With AUTOTUNE in .map: 27 seconds, %CPU=800\n",
    "\n",
    "### Second run\n",
    "- j = 0: %CPU=150, %MEM=5.8\n",
    "- j = 1: %CPU=150, %MEM=5.8\n",
    "- 15--24 seconds\n",
    "- With AUTOTUNE in .map: 17 seconds\n",
    "\n",
    "## without `dstr.cache()` (not recommended)\n",
    "\n",
    "### First run\n",
    "- j = 0: %CPU=430, %MEM=0--1.6\n",
    "- j = 1: %CPU=450, %MEM=1.6\n",
    "- 111 seconds\n",
    "\n",
    "### Second run\n",
    "- j = 0: %CPU=450, %MEM=2.8\n",
    "- j = 1: %CPU=450, %MEM=2.8\n",
    "- 103 seconds\n",
    "\n",
    "## Summary\n",
    "Using `.cache()` is recommended: once the cache is filled, the two epochs take 15--24 seconds instead of 103--111 seconds without it.\n"
   ]
  },
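  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first-run numbers above mix the cache-filling epoch (j = 0) with a cached epoch (j = 1). A hedged sketch of a helper that times each epoch separately; `time_epochs` is a hypothetical name, not part of TensorFlow, and it only assumes that its argument can be iterated more than once (as a `tf.data.Dataset` can).\n",
    "\n",
    "```python\n",
    "import time\n",
    "\n",
    "def time_epochs(dataset, num_epochs=2):\n",
    "    '''Iterate over `dataset` `num_epochs` times; return one (seconds, n_batches) pair per epoch.'''\n",
    "    results = []\n",
    "    for _ in range(num_epochs):\n",
    "        tic = time.perf_counter()\n",
    "        n_batches = sum(1 for _ in dataset)  # consume the whole epoch\n",
    "        results.append((time.perf_counter() - tic, n_batches))\n",
    "    return results\n",
    "\n",
    "# Any re-iterable object works; a small list stands in for `dstr` here.\n",
    "for epoch, (seconds, n_batches) in enumerate(time_epochs([0, 1, 2], num_epochs=2)):\n",
    "    print(epoch, n_batches, f'{seconds:.3f} sec')\n",
    "```\n",
    "\n",
    "With `time_epochs(dstr)`, the first pair would cover the epoch that fills the cache and the second pair an epoch served from it."
   ]
  },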
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Experiment 2: batch and shuffle are done in tfds.load and preprocess is in the train loop (not recommended)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\", batch_size=batch_size, shuffle_files=True)\n",
    "dstr = dstr.cache()\n",
    "dstr = dstr.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "0 1000\n",
      "0 2000\n",
      "0 3000\n",
      "0 4000\n",
      "0 5000\n",
      "0 6000\n",
      "0 7000\n",
      "0 8000\n",
      "0 9000\n",
      "0 10000\n",
      "0 11000\n",
      "0 12000\n",
      "0 13000\n",
      "0 14000\n",
      "0 15000\n",
      "0 16000\n",
      "0 17000\n",
      "0 18000\n",
      "0 19000\n",
      "0 20000\n",
      "0 21000\n",
      "0 22000\n",
      "0 23000\n",
      "0 24000\n",
      "0 25000\n",
      "0 26000\n",
      "1 0\n",
      "1 1000\n",
      "1 2000\n",
      "1 3000\n",
      "1 4000\n",
      "1 5000\n",
      "1 6000\n",
      "1 7000\n",
      "1 8000\n",
      "1 9000\n",
      "1 10000\n",
      "1 11000\n",
      "1 12000\n",
      "1 13000\n",
      "1 14000\n",
      "1 15000\n",
      "1 16000\n",
      "1 17000\n",
      "1 18000\n",
      "1 19000\n",
      "1 20000\n",
      "1 21000\n",
      "1 22000\n",
      "1 23000\n",
      "1 24000\n",
      "1 25000\n",
      "1 26000\n",
      "36.08529877662659 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        # Note: only the first image of each batch is preprocessed here.\n",
    "        preproc(tf.cast(feat[\"image\"][0], dtype=tf.float32))\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### with and without .cache & with and without .prefetch\n",
    "\n",
    "36 seconds in every combination.\n",
    "\n",
    "### Summary\n",
    "Putting `preproc` in the data pipeline is recommended."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Experiment 3: batch and shuffle are done in tfds.load and preprocess is in data pipeline (recommended)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\", batch_size=batch_size, shuffle_files=True)\n",
    "dstr = dstr.map(\n",
    "    lambda x: (preproc(tf.cast(x[\"image\"], dtype=tf.float32)), x[\"label\"])) # no AUTOTUNE\n",
    "#dstr = dstr.cache() # do not use: caching after the shuffle in tfds.load replays the same order every epoch\n",
    "dstr = dstr.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 1000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 2000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 3000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 4000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 5000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 6000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 7000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 8000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 9000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 10000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 11000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 12000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 13000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 14000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 15000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 16000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 17000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 18000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 19000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 20000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 21000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 22000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 23000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 24000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 25000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 26000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 0\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 1000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 2000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 3000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 4000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 5000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 6000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 7000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 8000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 9000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 10000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 11000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 12000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 13000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 14000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 15000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 16000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 17000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 18000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 19000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 20000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 21000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 22000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 23000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 24000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 25000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 26000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "19.577118396759033 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "            print(feat[1][0])\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### without .cache (recommended)\n",
    "- %CPU 2000, %MEM 0.6\n",
    "- 21 seconds\n",
    "\n",
    "Without .prefetch (not recommended)\n",
    "- %CPU 1300, %MEM 0.5\n",
    "- 36 seconds\n",
    "- The duration is the same as in Experiment 2: without .prefetch, preprocessing does not overlap with consuming the batches (no prefetch = no parallelism), so preproc is the bottleneck.\n",
    "\n",
    "### with .cache (Don't use: shuffle is effectively disabled!!!)\n",
    "First run\n",
    "- %CPU 1500 to 100, %MEM 7.3\n",
    "- 18 seconds\n",
    "\n",
    "Second run\n",
    "- %CPU 100, %MEM 7.3\n",
    "- 6 seconds\n",
    "\n",
    "### Summary\n",
    "- \"batch and shuffle done in tfds.load\" (Experiment 3) gives 1. higher CPU costs (2000) and 2. lower MEMORY (0.6).\n",
    "- CPU AUTOTUNE: Not sure\n",
    "\n",
    "while\n",
    "- \"batch and shuffle done in data pipeline\" (Experiment 1) gives 1. lower CPU costs (150--430) and 2. higher MEMORY (5.8).\n",
    "- CPU AUTOTUNE: YES\n",
    "\n",
    "TIP: merging several .map calls into one may be faster?"
   ]
  },
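  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A small sketch of why `.cache()` after a shuffle stops the reshuffling. It uses `Dataset.shuffle` on a toy range as a stand-in for the shuffle done inside `tfds.load` (an assumption; the mechanism on the real dataset was not inspected here). `epoch_orders` is a hypothetical helper.\n",
    "\n",
    "```python\n",
    "import tensorflow as tf\n",
    "\n",
    "def epoch_orders(ds, num_epochs=3):\n",
    "    # Element order seen in each of `num_epochs` full passes over `ds`.\n",
    "    return [[int(x) for x in ds] for _ in range(num_epochs)]\n",
    "\n",
    "base = tf.data.Dataset.range(20)\n",
    "\n",
    "# shuffle -> cache: the order of the first epoch is stored and replayed afterwards.\n",
    "frozen = epoch_orders(base.shuffle(20, reshuffle_each_iteration=True).cache())\n",
    "\n",
    "# cache -> shuffle: the elements are cached, but the order is redrawn every epoch.\n",
    "fresh = epoch_orders(base.cache().shuffle(20, reshuffle_each_iteration=True))\n",
    "\n",
    "print('shuffle->cache, same order in all epochs:', frozen[0] == frozen[1] == frozen[2])\n",
    "print('cache->shuffle, same order in all epochs:', fresh[0] == fresh[1] == fresh[2])\n",
    "```\n",
    "\n",
    "The first check should report identical orders; for the second, identical orders would be an extremely unlikely coincidence."
   ]
  },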
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "======================================================================================="
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Experiment 4: Exp. 1 and 3 with AUTOTUNE in .map"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exp. 1 mod."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\")\n",
    "dste = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"test\")\n",
    "    \n",
    "# Dataset decoration\n",
    "dstr = dstr.map(\n",
    "    lambda x: (preproc(tf.cast(x[\"image\"], dtype=tf.float32)), x[\"label\"]),\n",
    "    num_parallel_calls=tf.data.experimental.AUTOTUNE # <=============================\n",
    "    ) \n",
    "#dstr = dstr.cache() \n",
    "dstr = dstr.shuffle(buffer_size=10000, reshuffle_each_iteration=True).\\\n",
    "    batch(batch_size=batch_size, drop_remainder=False) \n",
    "dstr = dstr.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "0 1000\n",
      "0 2000\n",
      "0 3000\n",
      "0 4000\n",
      "0 5000\n",
      "0 6000\n",
      "0 7000\n",
      "0 8000\n",
      "0 9000\n",
      "0 10000\n",
      "0 11000\n",
      "0 12000\n",
      "0 13000\n",
      "0 14000\n",
      "0 15000\n",
      "0 16000\n",
      "0 17000\n",
      "0 18000\n",
      "0 19000\n",
      "0 20000\n",
      "0 21000\n",
      "0 22000\n",
      "0 23000\n",
      "0 24000\n",
      "0 25000\n",
      "0 26000\n",
      "1 0\n",
      "1 1000\n",
      "1 2000\n",
      "1 3000\n",
      "1 4000\n",
      "1 5000\n",
      "1 6000\n",
      "1 7000\n",
      "1 8000\n",
      "1 9000\n",
      "1 10000\n",
      "1 11000\n",
      "1 12000\n",
      "1 13000\n",
      "1 14000\n",
      "1 15000\n",
      "1 16000\n",
      "1 17000\n",
      "1 18000\n",
      "1 19000\n",
      "1 20000\n",
      "1 21000\n",
      "1 22000\n",
      "1 23000\n",
      "1 24000\n",
      "1 25000\n",
      "1 26000\n",
      "16.89484214782715 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## without AUTOTUNE in .map (Experiment 1)\n",
    "### First run\n",
    "- j = 0: %CPU=430, %MEM=0--5.8\n",
    "- j = 1: %CPU=150, %MEM=5.8\n",
    "- 73 seconds\n",
    "\n",
    "### Second run\n",
    "- j = 0: %CPU=150, %MEM=5.8\n",
    "- j = 1: %CPU=150, %MEM=5.8\n",
    "- 15--24 seconds\n",
    "\n",
    "## with AUTOTUNE in .map with .cache (not recommended)\n",
    "### First run\n",
    "- %CPU=2500--150, %MEM=6.5\n",
    "- 22--25 seconds\n",
    "\n",
    "### Second run\n",
    "- %CPU=150, %MEM=6.5\n",
    "- 29--33 seconds\n",
    "\n",
    "- => The first epoch of the first run is very fast (because of AUTOTUNE in .map), but a sudden deceleration follows. (Maybe .cache is not utilized because AUTOTUNE may shuffle the input stream before .cache?)\n",
    "\n",
    "## with AUTOTUNE in .map without .cache (recommended)\n",
    "- %CPU=3000, %MEM=8.0\n",
    "- 17 seconds\n"
   ]
  },
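  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The guess above (AUTOTUNE reordering the stream before `.cache`) can be probed on a toy dataset. A small sketch, assuming default tf.data options: as far as I know, a parallel `.map` still yields elements in input order unless nondeterministic order is requested explicitly, so the check below is expected to succeed. This does not explain the slowdown; it only tests the reordering hypothesis.\n",
    "\n",
    "```python\n",
    "import tensorflow as tf\n",
    "\n",
    "AUTOTUNE = tf.data.experimental.AUTOTUNE\n",
    "\n",
    "# Parallel map over 0..999; doubling is a trivial stand-in for preproc.\n",
    "ds = tf.data.Dataset.range(1000).map(lambda x: x * 2, num_parallel_calls=AUTOTUNE)\n",
    "out = [int(x) for x in ds]\n",
    "\n",
    "order_preserved = out == [2 * i for i in range(1000)]\n",
    "print('order preserved by parallel map:', order_preserved)\n",
    "```"
   ]
  },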
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exp. 3 mod."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\", batch_size=batch_size, shuffle_files=True)\n",
    "dstr = dstr.map(\n",
    "    lambda x: (preproc(tf.cast(x[\"image\"], dtype=tf.float32)), x[\"label\"]),\n",
    "    num_parallel_calls=tf.data.experimental.AUTOTUNE) # <=============================\n",
    "#dstr = dstr.cache() # do not use: caching after the shuffle in tfds.load replays the same order every epoch\n",
    "dstr = dstr.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 1000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 2000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 3000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 4000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 5000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 6000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 7000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 8000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 9000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 10000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 11000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 12000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 13000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 14000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 15000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 16000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 17000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 18000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 19000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 20000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 21000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 22000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 23000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 24000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 25000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 26000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 0\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 1000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 2000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 3000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 4000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 5000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 6000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 7000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 8000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 9000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 10000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 11000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 12000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 13000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 14000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 15000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 16000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 17000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 18000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 19000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 20000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 21000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 22000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 23000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 24000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 25000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 26000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "13.443394422531128 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "            print(feat[1][0])\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## without AUTOTUNE in .map (Exp. 3)\n",
    "### without .cache (recommended)\n",
    "- %CPU 2000, %MEM 0.6\n",
    "- 21 seconds\n",
    "\n",
    "## with AUTOTUNE in .map (recommended)\n",
    "- %CPU 4000, %MEM 0.6\n",
    "- 13 seconds\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Experiment 5: CPU and MEMORY efficient version...?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exp. 1 mod."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\")\n",
    "    \n",
    "# Dataset decoration\n",
    "dstr = dstr.map(\n",
    "    lambda x: (preproc(tf.cast(x[\"image\"], dtype=tf.float32)), x[\"label\"])) # no AUTOTUNE\n",
    "#dstr = dstr.cache() \n",
    "dstr = dstr.shuffle(buffer_size=10000, reshuffle_each_iteration=True).\\\n",
    "    batch(batch_size=batch_size, drop_remainder=False) # no AUTOTUNE in .batch\n",
    "dstr = dstr.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "0 1000\n",
      "0 2000\n",
      "0 3000\n",
      "0 4000\n",
      "0 5000\n",
      "0 6000\n",
      "0 7000\n",
      "0 8000\n",
      "0 9000\n",
      "0 10000\n",
      "0 11000\n",
      "0 12000\n",
      "0 13000\n",
      "0 14000\n",
      "0 15000\n",
      "0 16000\n",
      "0 17000\n",
      "0 18000\n",
      "0 19000\n",
      "0 20000\n",
      "0 21000\n",
      "0 22000\n",
      "0 23000\n",
      "0 24000\n",
      "0 25000\n",
      "0 26000\n",
      "1 0\n",
      "1 1000\n",
      "1 2000\n",
      "1 3000\n",
      "1 4000\n",
      "1 5000\n",
      "1 6000\n",
      "1 7000\n",
      "1 8000\n",
      "1 9000\n",
      "1 10000\n",
      "1 11000\n",
      "1 12000\n",
      "1 13000\n",
      "1 14000\n",
      "1 15000\n",
      "1 16000\n",
      "1 17000\n",
      "1 18000\n",
      "1 19000\n",
      "1 20000\n",
      "1 21000\n",
      "1 22000\n",
      "1 23000\n",
      "1 24000\n",
      "1 25000\n",
      "1 26000\n",
      "104.29191732406616 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## without AUTOTUNE in .map (Exp. 1)\n",
    "### with .cache (recommended)\n",
    "- %CPU 430--150, %MEM 5.8\n",
    "- 73--15 seconds\n",
    "\n",
    "## with AUTOTUNE in .map (Exp. 4-1)\n",
    "- %CPU 3000, %MEM 8.0\n",
    "- 17 seconds\n",
    "\n",
    "## without AUTOTUNE in .map and without .cache and without .prefetch (not recommended)\n",
    "- %CPU 430, %MEM 1.9 (memory leak? or depends on buffer_size of .shuffle?)\n",
    "- 113 seconds\n",
    "\n",
    "## without AUTOTUNE in .map and without .cache and with .prefetch (not recommended)\n",
    "- %CPU 460, %MEM 1.9 (memory leak? or depends on buffer_size of .shuffle?)\n",
    "- 104 seconds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exp. 3 mod."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Pre-defined\n",
    "batch_size = 10\n",
    "preproc = tka.resnet_v2.preprocess_input\n",
    "\n",
    "# Load dataset\n",
    "dstr = tfds.load(\n",
    "    \"patch_camelyon\", \n",
    "    data_dir=\"/data/t-miyagawa/tensorflow_datasets\", \n",
    "    split=\"train\", batch_size=batch_size, shuffle_files=True)\n",
    "dstr = dstr.map(\n",
    "    lambda x: (preproc(tf.cast(x[\"image\"], dtype=tf.float32)), x[\"label\"]),\n",
    "    #num_parallel_calls=tf.data.experimental.AUTOTUNE\n",
    "    ) \n",
    "#dstr = dstr.cache() # do not use: caching after the shuffle in tfds.load replays the same order every epoch\n",
    "dstr = dstr.prefetch(\n",
    "    buffer_size=tf.data.experimental.AUTOTUNE\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 0\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 1000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 2000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 3000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 4000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 5000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 6000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 7000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 8000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 9000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 10000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 11000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 12000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 13000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 14000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 15000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 16000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 17000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 18000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 19000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 20000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 21000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "0 22000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 23000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 24000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 25000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "0 26000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 0\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 1000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 2000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 3000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 4000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 5000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 6000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 7000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 8000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 9000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 10000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 11000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 12000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 13000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 14000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 15000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 16000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 17000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 18000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 19000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 20000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 21000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 22000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 23000\n",
      "tf.Tensor(1, shape=(), dtype=int64)\n",
      "1 24000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 25000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "1 26000\n",
      "tf.Tensor(0, shape=(), dtype=int64)\n",
      "19.613698482513428 sec\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "tic = time.time()\n",
    "for j in range(2):\n",
    "    for i, feat in enumerate(dstr):\n",
    "        if i % 1000 == 0:\n",
    "            print(j, i)\n",
    "            print(feat[1][0])\n",
    "print(time.time() - tic, \"sec\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## without AUTOTUNE in .map (Exp. 3)\n",
    "- %CPU 2000, %MEM 0.6\n",
    "- 21 seconds\n",
    "\n",
    "## with AUTOTUNE in .map (Exp. 4-2)\n",
    "- %CPU 4000, %MEM 0.6\n",
    "- 13 seconds\n",
    "\n",
    "## without AUTOTUNE in .map and without .prefetch (use the first one instead?)\n",
    "- %CPU 1300, %MEM 0.6\n",
    "- 34 seconds\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary of Experiments 4 and 5\n",
    "- Ver A: 15 sec (73 sec before caching): \"batch and shuffle done in data pipeline\" (Experiment 1) gives 1. lower CPU costs (150--430, or 800 with AUTOTUNE in .map) and 2. higher MEMORY (5.8).\n",
    "- Ver C: 17 sec: \"based on Exp. 1, with AUTOTUNE in .map and without .cache\" (Experiment 4-1) gives 1. much higher CPU costs (3000) and 2. even higher MEMORY (8.0).\n",
    "\n",
    "while\n",
    "- Ver B: 21 sec: \"batch and shuffle done in tfds.load\" (Experiment 3) gives 1. higher CPU costs (2000) and 2. lower MEMORY (0.6).\n",
    "- Ver D: 13 sec: \"batch and shuffle done in tfds.load, with AUTOTUNE in .map\" (Experiment 4-2) gives 1. much higher CPU costs (4000) and 2. lower MEMORY (0.6).\n",
    "\n",
    "Therefore, tentatively:\n",
    "- Ver A (Exp. 1 w/o AUTOTUNE in .map): 🌟Low CPU & High MEM. Default. If memory is not enough, use Ver D instead.\n",
    "- Ver B (Exp. 3 w/o AUTOTUNE in .map): Not used (use Ver D instead, although Ver B consumes less CPU than Ver D).\n",
    "- Ver C (Exp. 1 w/ AUTOTUNE in .map, w/o .cache): Not used (use Ver A instead).\n",
    "- Ver D (Exp. 3 w/ AUTOTUNE in .map): High CPU & Low MEM."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
