{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# Tutorial 01 - Data Preprocessing\n",
    "\n",
    "We provide a converter to convert raw datasets into [CommonRoad scenarios](https://commonroad.in.tum.de/scenarios) in `.xml` format. The public converter is available [here](https://commonroad.in.tum.de/dataset-converters). In addition, CommonRoad-RL provides tools (`./commonroad_rl/tools/pickle_scenario`) to convert `.xml` scenarios to `.pickle` format to save loading time for the training. \n",
    "\n",
    "This tutorial shows how to utilize the tools to prepare training and testing data for the highD dataset. A similar procedure follows for the inD dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 0. Preparation\n",
    "Please follow the README.md to install the CommonRoad-RL package and make sure the followings:\n",
    "* current path is at `commonroad-rl/commonroad_rl`, i.e. one upper layer to the `tutorials` folder\n",
    "* interactive python kernel is triggered from the correct environment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "# Check current path\n",
    "%cd ..\n",
    "%pwd\n",
    "\n",
    "# Check interactive python kernel\n",
    "import sys\n",
    "sys.executable"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 1. Acquire the dataset\n",
    "To download the whole raw highD dataset, please go to [the highD home page](https://www.highd-dataset.com).\n",
    "\n",
    "To facilitate the following exercises, we have prepared sample data under `tutorials/data/highd/raw`, where you should see three csv files recording the track information and one jpg file showing the track background."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 2. Convert raw .csv data to .xml files\n",
    "\n",
    "Clone and install [dataset-converters](https://gitlab.lrz.de/tum-cps/dataset-converters/-/tree/master) in `commonroad-rl/install` folder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "%cd ../install/\n",
    "!git clone https://gitlab.lrz.de/tum-cps/dataset-converters.git\n",
    "%cd dataset-converters\n",
    "!pip install -r requirements.txt "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!python -m src.main highD ../../commonroad_rl/tutorials/data/highD/raw/ ../../commonroad_rl/tutorials/data/highD/xmls/ --num_time_steps_scenario 1000 \n",
    "%cd ../../commonroad_rl"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Now there should be 50 `.xml` files in the output folder `tutorials/data/highd/xmls`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 3. Validate .xml files against CommonRoad .xsd specification\n",
    "\n",
    "To check if the converted `.xml` files comply with the CommonRoad scenario format, use the validation tool in `commonroad_rl/tools`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!python -m commonroad_rl.tools.validate_cr -s tools/XML_commonRoad_XSD_2020a.xsd tutorials/data/highD/xmls/*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 4. Visualize CommonRoad scenarios\n",
    "There is a visualization tool in `commonroad_rl/tools`, which can be executed by a simple command at the terminal; for example,  \n",
    "`python -m commonroad_rl.tools.visualize_cr tutorials/data/highD/xmls/DEU_LocationB-3_1_T-1.xml`. \n",
    "\n",
    "However, this script does not work for Jupyter notebook because of a backend error. Therefore, we utilize here the `commonroad-io` package. Let's try it with a sample scenario."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import os\n",
    "import glob\n",
    "import matplotlib.pyplot as plt\n",
    "from IPython import display\n",
    "\n",
    "from commonroad.common.file_reader import CommonRoadFileReader\n",
    "from commonroad.visualization.draw_dispatch_cr import draw_object\n",
    "\n",
    "files = \"tutorials/data/highD/xmls/*.xml\"\n",
    "file_path = sorted(glob.glob(files))[0]\n",
    "\n",
    "# Read in the scenario and planning problem set\n",
    "scenario, planning_problem_set = CommonRoadFileReader(file_path).open()\n",
    "\n",
    "# Plot the scenario for 40 time step, here each time step corresponds to 0.1 second\n",
    "for i in range(0, 40):\n",
    "    # Comment line below to keep sequence of graphs\n",
    "    display.clear_output(wait=True)\n",
    "    \n",
    "    plt.figure(figsize=(20, 10))\n",
    "    \n",
    "    # Plot the scenario at different time step\n",
    "    draw_object(scenario, draw_params={'time_begin': i})\n",
    "    \n",
    "    # Plot the planning problem set\n",
    "    draw_object(planning_problem_set)\n",
    "    plt.gca().set_aspect('equal')\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 5. Convert .xml files to .pickle data\n",
    "Since an RL training/testing session involves tens of thousands of iterations and accesses to the scenarios, it is a good idea to convert the `.xml` files to `.pickle` format so that they will be loaded more efficiently during training and testing. For example, loading 3000 `.xml` files takes about 2h while loading the same amount of `.pickle` files takes only 10min.\n",
    "\n",
    "Furthermore, this script separates road networks and obstacles since lots of scenario could share the road network data. Road networks are stored in `meta_scenario` folder, whereas obstacles are stored in the `problem` folder. This is done with a conversion tool in `commonroad_rl/tools/pickle_scenario`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!python -m commonroad_rl.tools.pickle_scenario.xml_to_pickle -i tutorials/data/highD/xmls -o tutorials/data/highD/pickles"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Now in the output folder `tutorials/data/highD/pickles`, there should be a `meta_scenario` folder containing meta information and a `problem` folder containing 50 `.pickle` files."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 6. Split .pickle data for training and testing\n",
    "As a final step, let's split the 50 problems into training and testing sets with a ratio of 7:3 randomly, again using a provided script in `commonroad_rl/utils_run`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!python -m commonroad_rl.utils_run.split_dataset -i tutorials/data/highD/pickles/problem -otrain tutorials/data/highD/pickles/problem_train -otest tutorials/data/highD/pickles/problem_test -tr_r 0.7"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Now in `tutorials/data/highD/pickles`, there should be a `problem_train` folder containing 35 pickles and a `problem_test` folder containing 15 pickles."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "**Note**: For each data conversion step, we provide bash script to enable converting the data on multiple threads. Please use those scripts instead if you want to convert the whole dataset to save runtime."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## 7. Separate training data for multi envs (skip this step if not using multi env)\n",
    "To train the model on mulitple envs, the scenarios need to be separated into different files. we can use a provided script in `commonroad_rl/tools/pickles_scenario` to do it  \n",
    "Here is an example to separate all .pickles files (both train and test) into 5 folders"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!python -m commonroad_rl.tools.pickle_scenario.copy_files -i tutorials/data/highD/pickles/problem_train -o tutorials/data/highD/pickles/problem_train -f *.pickle -n 5 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!python -m commonroad_rl.tools.pickle_scenario.copy_files -i tutorials/data/highD/pickles/problem_test -o tutorials/data/highD/pickles/problem_test -f *.pickle -n 5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Now in the output folder `tutorials/data/highD/pickles/problem_train` and `tutorials/data/highD/pickles/problem_test`, you should have 5 folders name `0`,`1`,`2`,`3`,`4`, each contains different part of the scenarios."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.11"
  },
  "metadata": {
   "interpreter": {
    "hash": "93ddbae6933170d26757955f07afdbe3a82b6d7d85b7f8bb469b70bc7f83bf90"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}