{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Mw8dzPy3-UnJ"
   },
   "source": [
    "# 1. Tabular dataset -> Graph dataset\n",
    "\n",
    "I hope this notebook helps you to convert your CSV file into a graph dataset 🚀\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NRKTBYp7-2VT"
   },
   "source": [
    "`Step 0`\n",
    "\n",
    "Bring some creativity and lose hope 😀 - It is very natural that it takes some time to rearrange the data in a graph format. Also, this notebook is only to help you to get started (you will have to transfer it to your specific use-case).\n",
    "\n",
    "`Step 1`\n",
    "\n",
    "To get started, identify the following things in your dataset (I have some real-world examples below in this notebook):\n",
    "\n",
    "- Nodes (Items, People, Locations, Cars, ...)\n",
    "- Edges (Connections, Interactions, Similarity, ...)\n",
    "- Node Features (Attributes)\n",
    "- Labels (Node-level, edge-level, graph-level)\n",
    "\n",
    "and optionally:\n",
    "- Edge weights (Strength of the connection, number of interactions, ...)\n",
    "- Edge features (Additional (multi-dim) properties describing the edge)\n",
    "\n"
   ]
  },
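  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of where Step 1 is heading: the ingredients listed above usually end up as a handful of arrays. The toy values and variable names below are made up for illustration; `edge_index` follows the `(2, num_edges)` layout that PyTorch Geometric uses.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Toy graph with 3 nodes and 2 features per node (values are made up)\n",
    "x = np.array([[0.1, 1.0],\n",
    "              [0.5, 0.0],\n",
    "              [0.9, 1.0]])  # node features: (num_nodes, num_node_features)\n",
    "\n",
    "# Two undirected connections (0-1 and 1-2), stored once per direction\n",
    "edge_index = np.array([[0, 1, 1, 2],   # source nodes\n",
    "                       [1, 0, 2, 1]])  # target nodes\n",
    "\n",
    "# Optional: one weight per edge, in the same order as the columns of edge_index\n",
    "edge_weight = np.array([2.0, 2.0, 5.0, 5.0])\n",
    "\n",
    "# Node-level labels: one entry per node\n",
    "y = np.array([0, 1, 0])\n",
    "\n",
    "print(x.shape, edge_index.shape, edge_weight.shape, y.shape)\n",
    "```\n",
    "\n",
    "The row order of `x` defines the node indices that `edge_index` and `y` refer to, so all three have to agree on it."
   ]
  },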
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "q0FoZDXjC_BX"
   },
   "source": [
    "`Step 2`\n",
    "\n",
    "Do you have different node and edge types? (This means the nodes/edges have different attributes such as Cars vs. People)\n",
    "\n",
    "- No, all my edges/nodes have the same type  --> **Proceed with 1.1**\n",
    "- Yes, there are different relations and node types --> **Proceed with 1.2**\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "C4CFR0Ye_xNJ"
   },
   "source": [
    "## 1.1 Homogeneous"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QD36bQfiDhRo"
   },
   "source": [
    "`Example 1 / Step 3`\n",
    "\n",
    "To make it as realistic as possible, I selected a random dataset I found on the internet that contains homogeneous nodes. This dataset is the [FIFA 21 Rating dataset](https://raw.githubusercontent.com/batuhan-demirci/fifa21_dataset), a dataset with soccer players.\n",
    "Here we extract a small subset of the scraped data (there is much more available!) to build a graph dataset out of it. Have a look at the pandas Dataframe below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 223
    },
    "id": "yn1TcPU9GU0T",
    "outputId": "4add1b84-2ce2-4549-a3ee-377cb0509ec6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Players:  18767\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-0317a4da-d97b-43af-bf61-52f31deb57cc\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>int_player_id</th>\n",
       "      <th>str_player_name</th>\n",
       "      <th>str_positions</th>\n",
       "      <th>int_overall_rating</th>\n",
       "      <th>int_team_id</th>\n",
       "      <th>int_long_passing</th>\n",
       "      <th>int_ball_control</th>\n",
       "      <th>int_dribbling</th>\n",
       "      <th>str_team_name</th>\n",
       "      <th>int_overall</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Lionel Andrés Messi Cuccittini</td>\n",
       "      <td>RW, ST, CF</td>\n",
       "      <td>93</td>\n",
       "      <td>5.0</td>\n",
       "      <td>91</td>\n",
       "      <td>96</td>\n",
       "      <td>96</td>\n",
       "      <td>FC Barcelona</td>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>2</td>\n",
       "      <td>Cristiano Ronaldo dos Santos Aveiro</td>\n",
       "      <td>ST, LW</td>\n",
       "      <td>92</td>\n",
       "      <td>6.0</td>\n",
       "      <td>77</td>\n",
       "      <td>92</td>\n",
       "      <td>88</td>\n",
       "      <td>Juventus</td>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>3</td>\n",
       "      <td>Jan Oblak</td>\n",
       "      <td>GK</td>\n",
       "      <td>91</td>\n",
       "      <td>8.0</td>\n",
       "      <td>40</td>\n",
       "      <td>30</td>\n",
       "      <td>12</td>\n",
       "      <td>Atlético Madrid</td>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121</th>\n",
       "      <td>5</td>\n",
       "      <td>Neymar da Silva Santos Júnior</td>\n",
       "      <td>LW, CAM</td>\n",
       "      <td>91</td>\n",
       "      <td>7.0</td>\n",
       "      <td>81</td>\n",
       "      <td>95</td>\n",
       "      <td>95</td>\n",
       "      <td>Paris Saint-Germain</td>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>4</td>\n",
       "      <td>Kevin De Bruyne</td>\n",
       "      <td>CAM, CM</td>\n",
       "      <td>91</td>\n",
       "      <td>2.0</td>\n",
       "      <td>93</td>\n",
       "      <td>92</td>\n",
       "      <td>88</td>\n",
       "      <td>Manchester City</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-0317a4da-d97b-43af-bf61-52f31deb57cc')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-0317a4da-d97b-43af-bf61-52f31deb57cc button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-0317a4da-d97b-43af-bf61-52f31deb57cc');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "     int_player_id                      str_player_name str_positions  \\\n",
       "0                1       Lionel Andrés Messi Cuccittini    RW, ST, CF   \n",
       "33               2  Cristiano Ronaldo dos Santos Aveiro        ST, LW   \n",
       "57               3                            Jan Oblak            GK   \n",
       "121              5        Neymar da Silva Santos Júnior       LW, CAM   \n",
       "89               4                      Kevin De Bruyne       CAM, CM   \n",
       "\n",
       "     int_overall_rating  int_team_id  int_long_passing  int_ball_control  \\\n",
       "0                    93          5.0                91                96   \n",
       "33                   92          6.0                77                92   \n",
       "57                   91          8.0                40                30   \n",
       "121                  91          7.0                81                95   \n",
       "89                   91          2.0                93                92   \n",
       "\n",
       "     int_dribbling        str_team_name  int_overall  \n",
       "0               96         FC Barcelona           84  \n",
       "33              88             Juventus           83  \n",
       "57              12      Atlético Madrid           83  \n",
       "121             95  Paris Saint-Germain           83  \n",
       "89              88      Manchester City           85  "
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Download data (quietly)\n",
    "!wget -q https://raw.githubusercontent.com/batuhan-demirci/fifa21_dataset/master/data/tbl_player.csv\n",
    "!wget -q https://raw.githubusercontent.com/batuhan-demirci/fifa21_dataset/master/data/tbl_player_skill.csv\n",
    "!wget -q https://raw.githubusercontent.com/batuhan-demirci/fifa21_dataset/master/data/tbl_team.csv\n",
    "\n",
    "# Load data\n",
    "player_df = pd.read_csv(\"tbl_player.csv\")\n",
    "skill_df = pd.read_csv(\"tbl_player_skill.csv\")\n",
    "team_df = pd.read_csv(\"tbl_team.csv\")\n",
    "\n",
    "# Extract subsets\n",
    "player_df = player_df[[\"int_player_id\", \"str_player_name\", \"str_positions\", \"int_overall_rating\", \"int_team_id\"]]\n",
    "skill_df = skill_df[[\"int_player_id\", \"int_long_passing\", \"int_ball_control\", \"int_dribbling\"]]\n",
    "team_df = team_df[[\"int_team_id\", \"str_team_name\", \"int_overall\"]]\n",
    "\n",
    "# Merge data\n",
    "player_df = player_df.merge(skill_df, on='int_player_id')\n",
    "fifa_df = player_df.merge(team_df, on='int_team_id')\n",
    "\n",
    "# Sort dataframe\n",
    "fifa_df = fifa_df.sort_values(by=\"int_overall_rating\", ascending=False)\n",
    "print(\"Players: \", fifa_df.shape[0])\n",
    "fifa_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nyhpJPakTylS"
   },
   "source": [
    "Let's first identify the graph-specific things we need:\n",
    "\n",
    "- `Nodes` - Football players (by ID)\n",
    "- `Edges` - If they play for the same team (see explanation below)\n",
    "- `Node Features` - The football player's position, specialities, ball control, ...\n",
    "- `Labels` - The football player's overall rating (node-level regression task)\n",
    "\n",
    "\n",
    "Nodes are usually very straight-forward to identify - here we even have IDs.\n",
    "If you don't have a unique identifier, you need one, because you need to know between which nodes a connection exists!\n",
    "\n",
    " The most challenging task is typically to link these nodes somehow through edges. Here we define the edges based on the team assignment. With this dataset, we could predict the expected rating when a player switches to a new team or a new player is observed. Therefore we expect relational effects through the team assignment. Of course there are many other possibilities to define the edges such as:\n",
    "- How many times two players played together (edge weight) --> Synergies\n",
    "- How many times a player has won/los 1:1 duels (edge weight)\n",
    "- Started their career in the same football club \n",
    "- Temporal edges: \"Played together in the last 2 weeks\"\n",
    "- ...\n",
    "\n",
    "As you can see, there are many choices how to combine instances in the dataframe. We will continue with the easiest approach, which is connecting them accoring to their team assignments.\n",
    "\n"
   ]
  },
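  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the team-based edge definition more concrete, here is a minimal sketch on a made-up mini table. The column names `player_id` and `team_id` and all values are hypothetical, and this is only one possible way to build the edges: every pair of teammates gets connected in both directions.\n",
    "\n",
    "```python\n",
    "from itertools import permutations\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical mini version of the player table (IDs and teams are made up)\n",
    "toy_df = pd.DataFrame({\n",
    "    'player_id': [10, 11, 12, 13, 14],\n",
    "    'team_id': [1, 1, 1, 2, 2],\n",
    "})\n",
    "\n",
    "# After sorting and resetting the index, the row position is the node index\n",
    "toy_df = toy_df.sort_values('player_id').reset_index(drop=True)\n",
    "\n",
    "edges = []\n",
    "for _, team in toy_df.groupby('team_id'):\n",
    "    # Every ordered pair of teammates gives one directed edge,\n",
    "    # so each undirected connection appears in both directions\n",
    "    edges.extend(permutations(team.index.tolist(), 2))\n",
    "\n",
    "edge_index = np.array(edges, dtype=np.int64).T  # shape: (2, num_edges)\n",
    "print(edge_index.shape)\n",
    "```\n",
    "\n",
    "Keep in mind that a team with n players produces n * (n - 1) directed edges, so fully connecting large groups can make the graph very dense."
   ]
  },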
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "guvNHrTvTxfC",
    "outputId": "b3dcad5b-bed8-4330-ff88-8ee48c14997e"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 90,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Make sure that we have no duplicate nodes\n",
    "max(fifa_df[\"int_player_id\"].value_counts())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "cgTP2w1wZZav"
   },
   "source": [
    "Each football player ID occurs only once in our dataset. \n",
    "\n",
    "\n",
    "\n",
 Note that we plan">
    "> Note that we plan to build one single graph here! If individual node IDs occur more than once in your dataset, there are different options:\n",
    "\n",
    "- You have multiple graphs that can contain the same node. In this case, iterate over each subset of your dataframe that belongs to one individual graph and do the calculations on that subset.\n",
    "- You have to aggregate multiple rows into one. For example, if you have transactional data (like a payment history), you would need to summarize it into one feature vector per node, such as: number of payments, payment amount, ...\n",
    "- You have a temporal dataset and need to check section 2.\n"
   ]
  },
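  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The aggregation option could look roughly like the sketch below. The payment table, its column names and the chosen summary statistics are all made up for illustration; which aggregates make sense depends on your data.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical payment history: several rows per customer (node)\n",
    "payments = pd.DataFrame({\n",
    "    'customer_id': [1, 1, 2, 3, 3, 3],\n",
    "    'amount': [20.0, 35.0, 10.0, 5.0, 5.0, 50.0],\n",
    "})\n",
    "\n",
    "# Collapse to one row per node: number of payments, total and mean amount\n",
    "node_features = (\n",
    "    payments.groupby('customer_id')['amount']\n",
    "    .agg(num_payments='count', total_amount='sum', mean_amount='mean')\n",
    "    .sort_index()\n",
    ")\n",
    "print(node_features)\n",
    "```\n",
    "\n",
    "After this step every node ID occurs exactly once, so the result can be used like the player table in this example."
   ]
  },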
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "RJwrgNLUe6hE"
   },
   "source": [
    "`Preprocessing one single graph...`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fHJG9FnHYJNB"
   },
   "source": [
    "`Step 4`: Extract the node features\n",
    "\n",
    "The node features are typically represented in a matrix of the shape *(num_nodes, node_feature_dim)*.\n",
    "\n",
    "For each of the football players, we simply extract their attributes. Because each player id is unique, we can easily do this based on the original dataframe. Have a look at the other examples in this notebook to see how an aggregation can look like if you have multiple rows for individual nodes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "V1bmPva7a0JD",
    "outputId": "ea6dee39-a163-4362-e768-447095c144a8"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-73290a0e-3ea1-49af-9b91-e9d519d1160d\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>int_long_passing</th>\n",
       "      <th>int_ball_control</th>\n",
       "      <th>int_dribbling</th>\n",
       "      <th>CAM</th>\n",
       "      <th>CB</th>\n",
       "      <th>CDM</th>\n",
       "      <th>CF</th>\n",
       "      <th>CM</th>\n",
       "      <th>GK</th>\n",
       "      <th>LB</th>\n",
       "      <th>LM</th>\n",
       "      <th>LW</th>\n",
       "      <th>LWB</th>\n",
       "      <th>RB</th>\n",
       "      <th>RM</th>\n",
       "      <th>RW</th>\n",
       "      <th>RWB</th>\n",
       "      <th>ST</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>91</td>\n",
       "      <td>96</td>\n",
       "      <td>96</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>77</td>\n",
       "      <td>92</td>\n",
       "      <td>88</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>40</td>\n",
       "      <td>30</td>\n",
       "      <td>12</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121</th>\n",
       "      <td>81</td>\n",
       "      <td>95</td>\n",
       "      <td>95</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>93</td>\n",
       "      <td>92</td>\n",
       "      <td>88</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-73290a0e-3ea1-49af-9b91-e9d519d1160d')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-73290a0e-3ea1-49af-9b91-e9d519d1160d button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-73290a0e-3ea1-49af-9b91-e9d519d1160d');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "     int_long_passing  int_ball_control  int_dribbling  CAM  CB  CDM  CF  CM  \\\n",
       "0                  91                96             96    0   0    0   0   0   \n",
       "33                 77                92             88    0   0    0   0   0   \n",
       "57                 40                30             12    0   0    0   0   0   \n",
       "121                81                95             95    0   0    0   0   0   \n",
       "89                 93                92             88    1   0    0   0   0   \n",
       "\n",
       "     GK  LB  LM  LW  LWB  RB  RM  RW  RWB  ST  \n",
       "0     0   0   0   0    0   0   0   1    0   0  \n",
       "33    0   0   0   0    0   0   0   0    0   1  \n",
       "57    1   0   0   0    0   0   0   0    0   0  \n",
       "121   0   0   0   1    0   0   0   0    0   0  \n",
       "89    0   0   0   0    0   0   0   0    0   0  "
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Sort to define the order of nodes\n",
    "sorted_df = fifa_df.sort_values(by=\"int_player_id\")\n",
    "# Select node features\n",
    "node_features = sorted_df[[\"str_positions\", \"int_long_passing\", \"int_ball_control\", \"int_dribbling\"]]\n",
    "# Convert non-numeric columns\n",
    "pd.set_option('mode.chained_assignment', None)\n",
    "positions = node_features[\"str_positions\"].str.split(\",\", expand=True)\n",
    "node_features[\"first_position\"] = positions[0]\n",
    "# One-hot encoding\n",
    "node_features = pd.concat([node_features, pd.get_dummies(node_features[\"first_position\"])], axis=1, join='inner')\n",
    "node_features.drop([\"str_positions\", \"first_position\"], axis=1, inplace=True)\n",
    "node_features.head() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AR3zsmvjet1a"
   },
   "source": [
    "That's already our node feature matrix. The number of nodes and the ordering is implicitly defined by it's shape. Each row corresponds to one node in our final graph. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ApwpEKX9egmE",
    "outputId": "57658fe1-e514-4ce8-80b9-bf3b079b7569"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(18767, 18)"
      ]
     },
     "execution_count": 106,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert to numpy\n",
    "x = node_features.to_numpy()\n",
    "x.shape # [num_nodes x num_features]"
   ]
  },
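  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Step 4 keeps only each player's first listed position. If you would rather keep all listed positions, a multi-hot encoding is one alternative. The sketch below runs on a few made-up rows in the same format as `str_positions` and is not part of the pipeline in this notebook.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Position strings in the same format as the str_positions column\n",
    "positions = pd.Series(['RW, ST, CF', 'ST, LW', 'GK'])\n",
    "\n",
    "# One column per position, 1 for every position a player is listed with\n",
    "multi_hot = positions.str.get_dummies(sep=', ')\n",
    "print(multi_hot)\n",
    "```\n",
    "\n",
    "Whether the extra positions help is something you would have to check on your own task."
   ]
  },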
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MfLt4Xn0bk_-"
   },
   "source": [
    "`Step 5`: Extract the labels\n",
    "\n",
    "Those are simply the ratings of each of the players. This corresponds to a node-level prediction problem. Therefore we have as many labels as we have nodes. Of course it can happen that we don't have labels for all nodes and in this case it makes sense to define masks using Pytorch Geometric's helper functions: [here](https://pytorch-geometric.readthedocs.io/en/latest/_modules/torch_geometric/utils/mask.html). I also quickly talk about this [in this video](https://www.youtube.com/watch?v=ex2qllcVneY&ab_channel=DeepFindr) at around 08:30."
   ]
  },
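  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate the idea of a mask for partially labeled nodes, here is a minimal NumPy sketch. The label values are made up, and marking missing labels with NaN is an assumption for this example. PyTorch Geometric datasets commonly keep such masks as boolean tensors (for example `train_mask`), but the principle is the same.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Hypothetical label vector where NaN marks nodes without a known rating\n",
    "y = np.array([93.0, np.nan, 91.0, np.nan, 88.0])\n",
    "\n",
    "# Boolean mask with one entry per node: True where a label exists\n",
    "labeled_mask = ~np.isnan(y)\n",
    "\n",
    "# A loss would then be computed only on the labeled nodes\n",
    "print(labeled_mask.sum(), 'of', len(y), 'nodes are labeled')\n",
    "print(y[labeled_mask])\n",
    "```\n",
    "\n",
    "All nodes still take part in message passing; the mask only decides which nodes contribute to the loss."
   ]
  },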
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "xXnZRCSTbhD2",
    "outputId": "894f241f-5dd1-49ae-a12b-61bb25bbdd73"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-135ca79d-217d-4645-829f-8681d7fc3e9f\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>int_overall</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>84</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>57</th>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>121</th>\n",
       "      <td>83</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-135ca79d-217d-4645-829f-8681d7fc3e9f')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-135ca79d-217d-4645-829f-8681d7fc3e9f button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-135ca79d-217d-4645-829f-8681d7fc3e9f');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "     int_overall\n",
       "0             84\n",
       "33            83\n",
       "57            83\n",
       "121           83\n",
       "89            85"
      ]
     },
     "execution_count": 107,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Sort to define the order of nodes\n",
    "sorted_df = fifa_df.sort_values(by=\"int_player_id\")\n",
    "# Select node features\n",
    "labels = sorted_df[[\"int_overall\"]]\n",
    "labels.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "WuggdIItffpv",
    "outputId": "8b63222b-64fb-47b3-d05a-499e23bc1513"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(18767, 1)"
      ]
     },
     "execution_count": 108,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert to numpy\n",
    "y = labels.to_numpy()\n",
    "y.shape # [num_nodes, 1] --> node regression"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "dKP82s3iYP2j"
   },
   "source": [
    "`Step 6`: Extract the edges\n",
    "\n",
    "That's probably the trickiest part with a tabular dataset. You need to think of a reasonable way to connect your nodes. As mentioned previously, we will use the team assignment here.\n",
    "\n",
    "\n",
 **AGAIN: There are many">
    "> **AGAIN: There are many ways to connect the entities in a dataset, and this approach is very simplistic (it leads to one disconnected, fully-connected subgraph per team). If I wanted to build a real model from this dataset, I would probably look for a more sophisticated way to connect the players. Using a GNN is a bit overkill for the way I model the edges.**\n",
    "\n",
    "\n",
    "We now need to find the pairs of players that are assigned to the same team.\n",
    "Let's first check how many players per team we have."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "e4drJB5c3XXw"
   },
   "outputs": [],
   "source": [
    "# Remap player IDs\n",
    "fifa_df[\"int_player_id\"] = fifa_df.reset_index().index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "NWyO0PRWYSUb",
    "outputId": "3a8cc144-1917-4065-97a8-8e7ceb03bf7f"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Everton                   36\n",
       "Valencia CF               34\n",
       "FC Nantes                 34\n",
       "Villarreal CF             34\n",
       "Real Valladolid CF        34\n",
       "                          ..\n",
       "Wellington Phoenix        19\n",
       "Central Coast Mariners    19\n",
       "Melbourne Victory         19\n",
       "Brisbane Roar             19\n",
       "Adelaide United           19\n",
       "Name: str_team_name, Length: 681, dtype: int64"
      ]
     },
     "execution_count": 110,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This tells us how many players per team we have to connect\n",
    "fifa_df[\"str_team_name\"].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-O5ueZ8pozZO"
   },
   "source": [
    "We now need to build all permutations of these players within one team, which corresponds to a fully-connected graph within each team-subgroup. We use the column int_player_id as indices for the edges. If there is for example a [0, 1] in the edge index, it means that the first and second node (regarding the previously defined node feature matrix) are connected.\n"
   ]
  },
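  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the pair-building step concrete, here is a minimal sketch on a made-up team of three players (the indices `[4, 7, 9]` are an assumption for illustration, not taken from the dataset). `itertools.combinations` yields each pair once, while `itertools.permutations` yields both directions.\n",
    "\n",
    "```python\n",
    "import itertools\n",
    "import math\n",
    "\n",
    "# Hypothetical player indices of one small team\n",
    "players = [4, 7, 9]\n",
    "\n",
    "# Each unordered pair once: n * (n - 1) / 2 edges\n",
    "one_direction = list(itertools.combinations(players, 2))\n",
    "# Every ordered pair: n * (n - 1) edges (both directions)\n",
    "both_directions = list(itertools.permutations(players, 2))\n",
    "\n",
    "n = len(players)\n",
    "assert len(one_direction) == math.comb(n, 2)\n",
    "assert len(both_directions) == n * (n - 1)\n",
    "print(one_direction)\n",
    "print(both_directions)\n",
    "```\n",
    "\n",
    "For a team of 36 players this gives 630 pairs in one direction and 1260 in both."
   ]
  },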
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "uk2_4e67omHF",
    "outputId": "270f8de7-a8df-46bd-9961-cb3bb3e82d19"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[    0,     0,     0, ..., 18704, 18704, 18719],\n",
       "       [    7,    32,    45, ..., 18719, 18751, 18751]])"
      ]
     },
     "execution_count": 111,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import itertools\n",
    "import numpy as np\n",
    "\n",
    "teams = fifa_df[\"str_team_name\"].unique()\n",
    "all_edges = np.array([], dtype=np.int32).reshape((0, 2))\n",
    "for team in teams:\n",
    "    team_df = fifa_df[fifa_df[\"str_team_name\"] == team]\n",
    "    players = team_df[\"int_player_id\"].values\n",
    "    # Build all combinations, as all players are connected\n",
    "    permutations = list(itertools.combinations(players, 2))\n",
    "    edges_source = [e[0] for e in permutations]\n",
    "    edges_target = [e[1] for e in permutations]\n",
    "    team_edges = np.column_stack([edges_source, edges_target])\n",
    "    all_edges = np.vstack([all_edges, team_edges])\n",
    "# Convert to Pytorch Geometric format\n",
    "edge_index = all_edges.transpose()\n",
    "edge_index # [2, num_edges]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hwfhBfUQaanJ"
   },
   "source": [
    "The result are these source/target edge pairs. Here you can also model dircted or undirected edges by inluding both or just one direction (I included both). This COO format is usually chosen as it is more efficient than a *NxN* adjacency matrix."
   ]
  },
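  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The edge index built above contains each pair once, because `itertools.combinations` is used. Below is a minimal NumPy sketch of how the reverse direction could be added; the toy `edge_index` and `num_nodes` are made up for illustration. The dense adjacency matrix is only built to compare it with the COO format. (PyTorch Geometric also offers `torch_geometric.utils.to_undirected` for the same purpose, which is not used here.)\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Toy edge index in COO format, shape [2, num_edges], one direction per pair\n",
    "edge_index = np.array([[0, 0, 1],\n",
    "                       [1, 2, 2]])\n",
    "\n",
    "# Flip source and target rows and append them, so every edge exists in both directions\n",
    "undirected = np.concatenate([edge_index, edge_index[::-1]], axis=1)\n",
    "\n",
    "# Dense N x N adjacency matrix, only for comparison with COO\n",
    "num_nodes = 3\n",
    "adjacency = np.zeros((num_nodes, num_nodes), dtype=np.int64)\n",
    "adjacency[undirected[0], undirected[1]] = 1\n",
    "\n",
    "print(undirected.shape)  # (2, 6)\n",
    "print(adjacency.sum())   # 6\n",
    "```\n",
    "\n",
    "COO stores two integers per edge, whereas the dense matrix always needs N x N entries, no matter how few edges exist."
   ]
  },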
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2ZNHC00TYScj"
   },
   "source": [
    "`Step 7`: Build the dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2VSU3rpk3ZNn"
   },
   "source": [
    "Now we have all the components we need to build a graph for libraries like Pytorch Geometric or DGL. I won't install these libraries here, as this will make the notebook too bulky, but here are some code snippets for the final steps.\n",
    "\n",
    "\n",
    "We need to pass the numpy arrays to the Data object, like this. If you have further attributes like edge_features, you can also pass them here.\n",
    "```\n",
    "from torch_geometric.data import Data\n",
    "data = Data(x=x, edge_index=edge_index, y=y)\n",
    "```\n",
    "\n",
    "This data object represents one single graph.\n",
    "\n",
    "\n",
    "Typically several graphs are combined in a dataset object. For this please refer to [the documentation](https://pytorch-geometric.readthedocs.io/en/latest/notes/create_dataset.html) or [this video](https://www.youtube.com/watch?v=QLIkOtKS4os&ab_channel=DeepFindr).\n",
    "Other than that, you can also quickly build a dataloader as follows. Just create a list of all your graphs and pass them to the Pytorch Geometric dataloader.\n",
    "\n",
    "```\n",
    "from torch_geometric.loader import DataLoader\n",
    "data_list = [Data(...), ..., Data(...)]\n",
    "loader = DataLoader(data_list, batch_size=32)\n",
    "```\n",
    "\n",
    "\n",
    "\n"
   ]
  },
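  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before handing the arrays to a graph library, it can help to check that they fit together. The helper below is a hypothetical sketch (`check_graph_arrays` is not part of any library) that assumes a non-empty `edge_index` and uses small made-up arrays in place of the real `x`, `edge_index` and `y`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def check_graph_arrays(x, edge_index, y):\n",
    "    # Fails with an AssertionError if the arrays do not describe one consistent graph\n",
    "    num_nodes = x.shape[0]\n",
    "    assert edge_index.ndim == 2 and edge_index.shape[0] == 2, 'edge_index must be [2, num_edges]'\n",
    "    assert y.shape[0] == num_nodes, 'expected one label per node'\n",
    "    assert edge_index.min() >= 0 and edge_index.max() < num_nodes, 'edge refers to a missing node'\n",
    "    return num_nodes, edge_index.shape[1]\n",
    "\n",
    "# Toy arrays standing in for the real node features, edges and labels\n",
    "x = np.random.rand(4, 3)                        # [num_nodes, num_features]\n",
    "edge_index = np.array([[0, 1, 2], [1, 2, 3]])   # [2, num_edges]\n",
    "y = np.array([[84], [83], [83], [85]])          # [num_nodes, 1]\n",
    "\n",
    "print(check_graph_arrays(x, edge_index, y))  # (4, 3)\n",
    "```\n",
    "\n",
    "The third check catches the most common mistake: edge indices that do not refer to row positions in the node feature matrix."
   ]
  },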
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "R7OBoSXFQQsJ"
   },
   "source": [
    "## 1.2 Heterogeneous"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qKn7sdYfQQsL"
   },
   "source": [
    "`Example 2 / Step 3`\n",
    "\n",
    "To make it as realistic as possible, I selected a random dataset I found on the internet that contains heterogeneous nodes. Recommender systems are a classical example for this and therefore I chose the [Anime Recommender Database](https://github.com/Mayank-Bhatia/Anime-Recommender) (a movie recommendation dataset).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "SVVSJq3LQQsM"
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Download data (quietly)\n",
    "!wget -q https://raw.githubusercontent.com/Mayank-Bhatia/Anime-Recommender/master/data/anime.csv\n",
    "!wget -q https://raw.githubusercontent.com/Mayank-Bhatia/Anime-Recommender/master/data/rating.csv\n",
    "\n",
    "anime = pd.read_csv(\"anime.csv\")\n",
    "rating = pd.read_csv(\"rating.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ljgXqQRsfqNs"
   },
   "source": [
    "We have two tables - one that contains information about the movies (anime.csv) and another one (rating.csv) that describes how the users rated the movies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "AbMrkuUYQ0GO",
    "outputId": "39f7f36a-b389-4d9c-e17a-f1ae7f4354f4"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-e11943fd-26b9-4876-b389-e74d0bcdf96f\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>anime_id</th>\n",
       "      <th>name</th>\n",
       "      <th>genre</th>\n",
       "      <th>type</th>\n",
       "      <th>episodes</th>\n",
       "      <th>rating</th>\n",
       "      <th>members</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>32281</td>\n",
       "      <td>Kimi no Na wa.</td>\n",
       "      <td>Drama, Romance, School, Supernatural</td>\n",
       "      <td>Movie</td>\n",
       "      <td>1</td>\n",
       "      <td>9.37</td>\n",
       "      <td>200630</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5114</td>\n",
       "      <td>Fullmetal Alchemist: Brotherhood</td>\n",
       "      <td>Action, Adventure, Drama, Fantasy, Magic, Mili...</td>\n",
       "      <td>TV</td>\n",
       "      <td>64</td>\n",
       "      <td>9.26</td>\n",
       "      <td>793665</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>28977</td>\n",
       "      <td>Gintama°</td>\n",
       "      <td>Action, Comedy, Historical, Parody, Samurai, S...</td>\n",
       "      <td>TV</td>\n",
       "      <td>51</td>\n",
       "      <td>9.25</td>\n",
       "      <td>114262</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>9253</td>\n",
       "      <td>Steins;Gate</td>\n",
       "      <td>Sci-Fi, Thriller</td>\n",
       "      <td>TV</td>\n",
       "      <td>24</td>\n",
       "      <td>9.17</td>\n",
       "      <td>673572</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>9969</td>\n",
       "      <td>Gintama&amp;#039;</td>\n",
       "      <td>Action, Comedy, Historical, Parody, Samurai, S...</td>\n",
       "      <td>TV</td>\n",
       "      <td>51</td>\n",
       "      <td>9.16</td>\n",
       "      <td>151266</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e11943fd-26b9-4876-b389-e74d0bcdf96f')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-e11943fd-26b9-4876-b389-e74d0bcdf96f button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-e11943fd-26b9-4876-b389-e74d0bcdf96f');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   anime_id                              name  \\\n",
       "0     32281                    Kimi no Na wa.   \n",
       "1      5114  Fullmetal Alchemist: Brotherhood   \n",
       "2     28977                          Gintama°   \n",
       "3      9253                       Steins;Gate   \n",
       "4      9969                     Gintama&#039;   \n",
       "\n",
       "                                               genre   type episodes  rating  \\\n",
       "0               Drama, Romance, School, Supernatural  Movie        1    9.37   \n",
       "1  Action, Adventure, Drama, Fantasy, Magic, Mili...     TV       64    9.26   \n",
       "2  Action, Comedy, Historical, Parody, Samurai, S...     TV       51    9.25   \n",
       "3                                   Sci-Fi, Thriller     TV       24    9.17   \n",
       "4  Action, Comedy, Historical, Parody, Samurai, S...     TV       51    9.16   \n",
       "\n",
       "   members  \n",
       "0   200630  \n",
       "1   793665  \n",
       "2   114262  \n",
       "3   673572  \n",
       "4   151266  "
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "anime.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "iXofUdWHSOts",
    "outputId": "435fd289-4ef0-4e42-bab6-2ab60a8c580d"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-42bb4c92-dcbe-4884-95db-6bbb3d19e687\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>anime_id</th>\n",
       "      <th>rating</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>20</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>24</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>79</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>226</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>241</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-42bb4c92-dcbe-4884-95db-6bbb3d19e687')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-42bb4c92-dcbe-4884-95db-6bbb3d19e687 button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-42bb4c92-dcbe-4884-95db-6bbb3d19e687');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   user_id  anime_id  rating\n",
       "0        1        20      -1\n",
       "1        1        24      -1\n",
       "2        1        79      -1\n",
       "3        1       226      -1\n",
       "4        1       241      -1"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rating.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FTgGxZv6QQsO"
   },
   "source": [
    "Let's just like before first identify the graph entities we need.\n",
    "\n",
    "- `Nodes` - Users and Animes (two node types with different features = heterogeneous)\n",
    "- `Edges` - If a user has rated a movie / the rating (edge weight)\n",
    "- `Node Features` - The movie attributes and for the users we have no explicit features so we have to figure something out later \n",
    "- `Labels` - The rating for a movie (link prediction regression task)\n",
    "\n",
    "This dataset will, just like `Example 1` lead to one single graph that contains all nodes and edges. Given a pair of node and anime movie, we will be able to predict if / how the user likes this movie. To model this as a graph, we will have to support two node types: movie and user. That's because they have different node features (and shapes) that would not fit into one joint node feature matrix.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "DkdUba-gQQsS"
   },
   "source": [
    "`Preprocessing one single graph...`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gOirHOe3QQsS"
   },
   "source": [
    "`Step 4`: Extract the node features\n",
    "\n",
    "Each of the movies occurs only once in the anime dataframe and hence we can directly extract the features from there. If you have multiple entries for each node (movie ID) in your dataframe please have a look at the remarks at 1.1. (step 4).\n",
    "\n",
    "We will just extract the columns with specific attributes and convert them to numeric features..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7b2SyuToYEd4"
   },
   "source": [
    "For the anime movies ...\n",
    "- First we need to do a re-mapping of the IDs. That's because they don't start with 0 and also not all IDs are present. That's however important because the edge_index is always referring to the index in the node feature matrix\n",
    "- We will store this mapping because we will need it later"
   ]
  },
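  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of such a re-mapping, the raw IDs can be mapped to their row position in the node feature matrix with a plain dictionary. The four anime IDs are the ones shown in `anime.head()` above; `rated_ids` is a made-up stand-in for the `anime_id` column of the rating table, and the variable names are assumptions for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Raw anime IDs: they do not start at 0 and have gaps\n",
    "anime_ids = np.array([32281, 5114, 28977, 9253])\n",
    "# Made-up rated anime IDs (one entry per rating)\n",
    "rated_ids = np.array([5114, 9253, 5114, 32281])\n",
    "\n",
    "# Map each raw ID to its row position in the node feature matrix\n",
    "id_to_index = {int(raw_id): idx for idx, raw_id in enumerate(anime_ids)}\n",
    "\n",
    "# Translate the rated IDs into node indices that can be used in an edge_index\n",
    "rated_index = np.array([id_to_index[int(raw_id)] for raw_id in rated_ids])\n",
    "print(rated_index)  # [1 3 1 0]\n",
    "```\n",
    "\n",
    "Keeping `id_to_index` around makes it possible to translate every table that refers to the raw IDs in the same way."
   ]
  },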
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 392
    },
    "id": "ZZLAP581QQsT",
    "outputId": "c82edd3f-8eff-4d0a-9926-d27ff2f257b9"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-6ca01da3-a01d-4ce7-bdfb-dc5512db4d17\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>type</th>\n",
       "      <th>episodes</th>\n",
       "      <th>Action</th>\n",
       "      <th>Adventure</th>\n",
       "      <th>Cars</th>\n",
       "      <th>Comedy</th>\n",
       "      <th>Dementia</th>\n",
       "      <th>Demons</th>\n",
       "      <th>Drama</th>\n",
       "      <th>Ecchi</th>\n",
       "      <th>...</th>\n",
       "      <th>Supernatural</th>\n",
       "      <th>Thriller</th>\n",
       "      <th>Vampire</th>\n",
       "      <th>Yaoi</th>\n",
       "      <th>Movie</th>\n",
       "      <th>Music</th>\n",
       "      <th>ONA</th>\n",
       "      <th>OVA</th>\n",
       "      <th>Special</th>\n",
       "      <th>TV</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>TV</td>\n",
       "      <td>26</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Movie</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>TV</td>\n",
       "      <td>26</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>TV</td>\n",
       "      <td>26</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>TV</td>\n",
       "      <td>52</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>TV</td>\n",
       "      <td>145</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>TV</td>\n",
       "      <td>24</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>TV</td>\n",
       "      <td>52</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>TV</td>\n",
       "      <td>24</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>TV</td>\n",
       "      <td>74</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10 rows × 48 columns</p>\n",
       "</div>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "    type episodes  Action  Adventure  Cars  Comedy  Dementia  Demons  Drama  \\\n",
       "0     TV       26       1          0     0       0         0       0      0   \n",
       "1  Movie        1       1          0     0       0         0       0      0   \n",
       "2     TV       26       1          0     0       0         0       0      0   \n",
       "3     TV       26       1          0     0       0         0       0      0   \n",
       "4     TV       52       0          1     0       0         0       0      0   \n",
       "5     TV      145       1          0     0       0         0       0      0   \n",
       "6     TV       24       0          0     0       1         0       0      0   \n",
       "7     TV       52       0          0     0       1         0       0      0   \n",
       "8     TV       24       1          0     0       0         0       0      0   \n",
       "9     TV       74       0          0     0       0         0       0      1   \n",
       "\n",
       "   Ecchi  ...  Supernatural  Thriller  Vampire  Yaoi  Movie  Music  ONA  OVA  \\\n",
       "0      0  ...             0         0        0     0      0      0    0    0   \n",
       "1      0  ...             0         0        0     0      1      0    0    0   \n",
       "2      0  ...             0         0        0     0      0      0    0    0   \n",
       "3      0  ...             0         0        0     0      0      0    0    0   \n",
       "4      0  ...             0         0        0     0      0      0    0    0   \n",
       "5      0  ...             0         0        0     0      0      0    0    0   \n",
       "6      0  ...             0         0        0     0      0      0    0    0   \n",
       "7      0  ...             0         0        0     0      0      0    0    0   \n",
       "8      0  ...             0         0        0     0      0      0    0    0   \n",
       "9      0  ...             0         0        0     0      0      0    0    0   \n",
       "\n",
       "   Special  TV  \n",
       "0        0   1  \n",
       "1        0   0  \n",
       "2        0   1  \n",
       "3        0   1  \n",
       "4        0   1  \n",
       "5        0   1  \n",
       "6        0   1  \n",
       "7        0   1  \n",
       "8        0   1  \n",
       "9        0   1  \n",
       "\n",
       "[10 rows x 48 columns]"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Sort to define the order of nodes\n",
    "sorted_df = anime.sort_values(by=\"anime_id\").set_index(\"anime_id\")\n",
    "\n",
    "# Map IDs to start from 0\n",
    "sorted_df = sorted_df.reset_index(drop=False)\n",
    "movie_id_mapping = sorted_df[\"anime_id\"]\n",
    "\n",
    "# Select node features (copy, so that adding a column below does not trigger a chained-assignment warning)\n",
    "node_features = sorted_df[[\"type\", \"genre\", \"episodes\"]].copy()\n",
    "# Convert non-numeric columns\n",
    "\n",
    "# For simplicity I'll just select the first genre here and ignore the others\n",
    "genres = node_features[\"genre\"].str.split(\",\", expand=True)\n",
    "node_features[\"main_genre\"] = genres[0]\n",
    "\n",
    "# One-hot encoding\n",
    "anime_node_features = pd.concat([node_features, pd.get_dummies(node_features[\"main_genre\"])], axis=1, join='inner')\n",
    "anime_node_features = pd.concat([anime_node_features, pd.get_dummies(anime_node_features[\"type\"])], axis=1, join='inner')\n",
    "anime_node_features.drop([\"genre\", \"main_genre\"], axis=1, inplace=True)\n",
    "anime_node_features.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "jFTNBTkVQQsV",
    "outputId": "51b28eae-9bb6-47a6-b337-19ae74ae14e8"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(12294, 48)"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert to numpy\n",
    "x = anime_node_features.to_numpy()\n",
    "x.shape # [num_movie_nodes x movie_node_feature_dim]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "oa9rhAzhYHLi"
   },
   "source": [
    "For the users ...\n",
    "\n",
    "Here we are missing a dataframe that describes the attributes of each user. As a workaround, we have several options:\n",
    "- Either we insert dummy features (for example random values between 0 and 1 such as [0, 0.5, 0.1, 1]), which are then updated through message passing\n",
    "- Or we calculate some statistics about the users, such as the average rating or the number of ratings (based on the rating dataframe)\n",
    "- Or we use structural properties of each node as features (degree, neighborhood, or even Node2Vec embeddings)\n",
    "\n",
    "\n",
    "We will go with the second option here.\n",
    "\n",
    "```\n",
    "Important: If you calculate statistics based on the dataframe, always split into train and test sets first,\n",
    "so that no information from the test set leaks into the training features! (I didn't do this here)\n",
    "```\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "oDSi7tdVYJUi",
    "outputId": "739a6f00-2b2c-4d90-c157-04ff0558fad8"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-d7ed8c82-8783-447b-8ae5-d0010f9f3338\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-0.712418</td>\n",
       "      <td>153</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2.666667</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>7.382979</td>\n",
       "      <td>94</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-1.000000</td>\n",
       "      <td>52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4.263383</td>\n",
       "      <td>467</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "       mean  count\n",
       "0 -0.712418    153\n",
       "1  2.666667      3\n",
       "2  7.382979     94\n",
       "3 -1.000000     52\n",
       "4  4.263383    467"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Find out mean rating and number of ratings per user\n",
    "mean_rating = rating.groupby(\"user_id\")[\"rating\"].mean().rename(\"mean\")\n",
    "num_rating = rating.groupby(\"user_id\")[\"rating\"].count().rename(\"count\")\n",
    "user_node_features = pd.concat([mean_rating, num_rating], axis=1)\n",
    "\n",
    "# Remap user ID (to start at 0)\n",
    "user_node_features = user_node_features.reset_index(drop=False)\n",
    "user_id_mapping = user_node_features[\"user_id\"]\n",
    "\n",
    "# Only keep features \n",
    "user_node_features = user_node_features[[\"mean\", \"count\"]]\n",
    "user_node_features.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "3JSpuMIYbKWQ",
    "outputId": "14e8e0c6-4ecb-4b86-8620-ccf2e60767d7"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(73515, 2)"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert to numpy\n",
    "x = user_node_features.to_numpy()\n",
    "x.shape # [num_user_nodes x user_node_feature_dim]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4XchA6Z8QQsU"
   },
   "source": [
    "These are already our node feature matrices. We could, of course, also normalize the values to the range [0, 1].\n",
    "\n",
    "For the movies we have clear attributes that describe each node. For the users we have calculated some basic statistics that describe their rating behavior.\n",
    "\n",
    "The number of nodes and their ordering are implicitly defined by these matrices: each row corresponds to one node in our final graph."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UefEW40zQQsV"
   },
   "source": [
    "`Step 5`: Extract the labels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kmHVNWozbYFp"
   },
   "source": [
    "In this example, we have a link prediction / regression problem, so the labels belong to the edges. The plot below shows the distribution of the ratings. Later, the task will be to predict the rating a user gives to a movie.\n",
    "\n",
    "Unlike in `Example 1`, the number of labels is now equal to the number of edges.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "mDnbUgMDfyZq",
    "outputId": "01143be6-2ce5-4f68-f844-6bd997cf8a38"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-31780fc3-27ed-449d-805d-f93febc3ef51\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>anime_id</th>\n",
       "      <th>rating</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>20</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>24</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>79</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>226</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>241</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   user_id  anime_id  rating\n",
       "0        1        20      -1\n",
       "1        1        24      -1\n",
       "2        1        79      -1\n",
       "3        1       226      -1\n",
       "4        1       241      -1"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rating.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 293
    },
    "id": "fFtCkSyHQQsW",
    "outputId": "c79a9402-231a-4ab5-965f-d2b3cf5e7b2e"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f86b7276090>"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEDCAYAAAAlRP8qAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAPaUlEQVR4nO3dfWxdd33H8fdnSWEl3sKmMK9rujnaAgg1YlALOpCQA0MKDyL7o2NFrBBUlmmiPClsBP4AiX/WSQMNxJMi6EIZq2EFbVlbwVDBKkyjalI60qSDZaVAQmn6ACkuFRDx3R++3azg+F7b5/rGP94vyfI95/zuOd9v7vUnx8fnnJuqQpK09v3SqAuQJHXDQJekRhjoktQIA12SGmGgS1IjDHRJasRIAz3JNUlOJrlzwPEvT3I0yZEk/zjs+iRpLckoz0NP8jxgFri2qi7uM3Yr8Cng+VX1/SS/UVUnV6NOSVoLRrqHXlW3AA/Nn5fkd5N8NsmhJF9K8tTeoj8DPlBV3+891zCXpHnOxWPo+4DXV9UlwFuAD/bmPxl4cpJ/T/KVJDtGVqEknYPWj7qA+ZKMAc8B/inJY7Mf3/u+HtgKTAGbgVuSbKuqH6x2nZJ0LjqnAp253xh+UFW/v8Cy48CtVfVT4JtJvsFcwN+2mgVK0rnqnDrkUlUPMxfWfwyQOU/vLf5n5vbOSbKJuUMwd4+iTkk6F436tMXrgP8AnpLkeJIrgVcCVyb5T+AIsLM3/HPAg0mOAl8E/rKqHhxF3ZJ0LhrpaYuSpO6cU4dcJEnLN7I/im7atKkmJiZGtfkleeSRR9iwYcOoyxiKlnuDtvuzt7VrJf0dOnTogap60kLLRhboExMTHDx4cFSbX5KZmRmmpqZGXcZQtNwbtN2fva1dK+kvybfOtsxDLpLUCANdkhphoEtSIwx0SWqEgS5JjTDQJakRBrokNcJAl6RGGOiS1Ihz7X7okrQqJvbeOLJt798xnNsauIcuSY0w0CWpEQa6JDXCQJekRhjoktQIA12SGmGgS1IjDHRJaoSBLkmNMNAlqREGuiQ1wkCXpEYY6JLUCANdkhphoEtSI/oGepKLknwxydEkR5K8cYExSfK+JMeSfC3JM4dTriTpbAb5gIvTwJ6quj3JrwCHkny+qo7OG/MiYGvv69nAh3rfJUmrpO8eelXdW1W39x7/ELgLuPCMYTuBa2vOV4AnJrmg82olSWeVqhp8cDIB3AJcXFUPz5t/A3B1VX25N30z8NaqOnjG83cDuwHGx8cvmZ6eXmn9q2J2dpaxsbFRlzEULfcGbfdnbytz+MSpoa5/MVs2rlt2f9u3bz9UVZMLLRv4M0WTjAGfBt40P8yXoqr2AfsAJicna2pqajmrWXUzMzOslVqXquXeoO3+7G1ldo34M0WH0d9AZ7kkOY+5MP9EVX1mgSEngIvmTW/uzZMkrZJBznIJ8FHgrqp6z1mGHQBe1Tvb5VLgVFXd22GdkqQ+Bjnk8lzgCuBwkjt6894O/DZAVX0YuAl4MXAM+BHwmu5LlSQtpm+g9/7QmT5jCnhdV0VJkpbOK0UlqREGuiQ1wkCXpEYY6JLUCANdkhphoEtSIwx0SWqEgS5JjTDQJakRBrokNcJAl6RGGOiS1AgDXZIaYaBLUiMMdElqhIEuSY0w0CWpEQa6JDXCQJekRhjoktQIA12SGmGgS1IjDHRJaoSBLkmNMNAlqREGuiQ1wkCXpEYY6JLUiPWjLkDSL7aJvTf+3Lw9206za4H5Wpx76JLUCANdkhphoEtSIwx0SWqEgS5JjTDQJakRBrokNcJAl6RGGOiS1Ii+gZ7kmiQnk9x5luVTSU4luaP39Y7uy5Qk9TPIpf/7gfcD1y4y5ktV9dJOKpIkLUvfPfSqugV4aBVqkSStQKqq/6BkArihqi5eYNkU8GngOPBd4C1VdeQs69kN7AYYHx+/ZHp6erl1
r6rZ2VnGxsZGXcZQtNwbtN1fK70dPnHq5+aNnw/3PTqCYlbJlo3rlv3abd++/VBVTS60rItA/1XgZ1U1m+TFwHuramu/dU5OTtbBgwf7bvtcMDMzw9TU1KjLGIqWe4O2+2ult7PdbfHdh9u9Gez+HRuW/dolOWugr/hfrKoenvf4piQfTLKpqh5Y6brPZqE3wDDNv5XnPVe/ZFW3LUmDWvFpi0l+M0l6j5/VW+eDK12vJGlp+u6hJ7kOmAI2JTkOvBM4D6CqPgxcBvxFktPAo8DlNchxHElSp/oGelW9os/y9zN3WqMkaYS8UlSSGmGgS1IjDHRJaoSBLkmNMNAlqREGuiQ1ot1rayUNbLWvvtZwuIcuSY0w0CWpEQa6JDXCQJekRhjoktQIA12SGmGgS1IjDHRJaoSBLkmNMNAlqREGuiQ1wkCXpEYY6JLUCANdkhphoEtSIwx0SWqEgS5JjTDQJakRBrokNcJAl6RGGOiS1AgDXZIaYaBLUiMMdElqhIEuSY0w0CWpEQa6JDXCQJekRhjoktQIA12SGmGgS1Ij+gZ6kmuSnExy51mWJ8n7khxL8rUkz+y+TElSP4Psoe8Hdiyy/EXA1t7XbuBDKy9LkrRUfQO9qm4BHlpkyE7g2przFeCJSS7oqkBJ0mBSVf0HJRPADVV18QLLbgCurqov96ZvBt5aVQcXGLubub14xsfHL5menl5W0YdPnFrW85Zr/Hy479G5x9su3Liq2x622dlZxsbGRl3G0LTcX5e9rfbPVD/zf+ZatGXjumW/dtu3bz9UVZMLLVu/oqqWqKr2AfsAJicna2pqalnr2bX3xg6r6m/PttO8+/DcP9U9r5xa1W0P28zMDMt9HdaClvvrsrfV/pnqZ/7PXIv279gwlPdlF2e5nAAumje9uTdPkrSKugj0A8Creme7XAqcqqp7O1ivJGkJ+v5Ok+Q6YArYlOQ48E7gPICq+jBwE/Bi4BjwI+A1wypWknR2fQO9ql7RZ3kBr+usIknSsnilqCQ1wkCXpEYY6JLUCANdkhphoEtSIwx0SWqEgS5JjTDQJakRBrokNcJAl6RGGOiS1AgDXZIaYaBLUiMMdElqhIEuSY0w0CWpEQa6JDXCQJekRhjoktQIA12SGmGgS1IjDHRJaoSBLkmNMNAlqREGuiQ1wkCXpEYY6JLUCANdkhqxftQFSPp/E3tvHHjsnm2n2bWE8Wqfe+iS1AgDXZIaYaBLUiMMdElqhIEuSY0w0CWpEQa6JDXCQJekRhjoktSIgQI9yY4kX09yLMneBZbvSnJ/kjt6X6/tvlRJ0mL6XvqfZB3wAeCFwHHgtiQHquroGUM/WVVXDaFGSdIABtlDfxZwrKrurqqfANPAzuGWJUlaqlTV4gOSy4AdVfXa3vQVwLPn740n2QX8NXA/8A3gzVX1nQXWtRvYDTA+Pn7J9PT0soo+fOLUsp63XOPnw32Pzj3eduHGVd32sM3OzjI2NjbqMoZmrfW3lPf2/Pdla1ruDWDLxnXLfl9u3779UFVNLrSsq7st/itwXVX9OMmfAx8Dnn/moKraB+wDmJycrKmpqWVtbLXvMLdn22nefXjun+qeV06t6raHbWZmhuW+DmvBWutvKe/t+e/L1rTcG8D+HRuG8r4c5JDLCeCiedObe/P+T1U9WFU/7k1+BLikm/IkSYMaJNBvA7Ym2ZLkccDlwIH5A5JcMG/yZcBd3ZUoSRpE399pqup0kquAzwHrgGuq6kiSdwEHq+oA8IYkLwNOAw8Bu4ZYsyRpAQMdpKqqm4Cbzpj3jnmP3wa8rdvSJElL4ZWiktQIA12SGmGgS1IjDHRJaoSBLkmNMNAlqREGuiQ1wkCXpEYY6JLUCANdkhphoEtSIwx0SWqEgS5JjTDQJakR7X7Gk7QCE6v8MYdSF9xDl6RGGOiS1AgDXZIaYaBLUiMMdElqhIEuSY0w0CWpEQa6JDXCQJekRhjoktQIA12SGmGgS1IjDHRJaoSB
LkmNMNAlqREGuiQ1wg+40Dmriw+Z2LPtNLv8sAr9gnAPXZIaYaBLUiMMdElqhIEuSY0w0CWpEZ7lor66ONtE0vANtIeeZEeSryc5lmTvAssfn+STveW3JpnoulBJ0uL67qEnWQd8AHghcBy4LcmBqjo6b9iVwPer6veSXA78DfAnwyh41FrbW/U8bakdg+yhPws4VlV3V9VPgGlg5xljdgIf6z2+HnhBknRXpiSpn1TV4gOSy4AdVfXa3vQVwLOr6qp5Y+7sjTnem/6f3pgHzljXbmB3b/IpwNe7amTINgEP9B21NrXcG7Tdn72tXSvp73eq6kkLLVjVP4pW1T5g32puswtJDlbV5KjrGIaWe4O2+7O3tWtY/Q1yyOUEcNG86c29eQuOSbIe2Ag82EWBkqTBDBLotwFbk2xJ8jjgcuDAGWMOAK/uPb4M+EL1O5YjSepU30MuVXU6yVXA54B1wDVVdSTJu4CDVXUA+Cjw8STHgIeYC/2WrLnDREvQcm/Qdn/2tnYNpb++fxSVJK0NXvovSY0w0CWpEQZ6H/1ue7BWJbkoyReTHE1yJMkbR11T15KsS/LVJDeMupauJXlikuuT/FeSu5L8wahr6kqSN/fek3cmuS7JL4+6ppVIck2Sk73rdR6b9+tJPp/kv3vff62LbRnoi5h324MXAU8DXpHkaaOtqjOngT1V9TTgUuB1DfX2mDcCd426iCF5L/DZqnoq8HQa6TPJhcAbgMmqupi5EzHW+kkW+4EdZ8zbC9xcVVuBm3vTK2agL26Q2x6sSVV1b1Xd3nv8Q+YC4cLRVtWdJJuBlwAfGXUtXUuyEXgec2eXUVU/qaofjLaqTq0Hzu9d0/IE4LsjrmdFquoW5s7+m2/+7VI+BvxRF9sy0Bd3IfCdedPHaSj0HtO7O+YzgFtHW0mn/g74K+Bnoy5kCLYA9wN/3zuk9JEkG0ZdVBeq6gTwt8C3gXuBU1X1b6OtaijGq+re3uPvAeNdrNRA/wWXZAz4NPCmqnp41PV0IclLgZNVdWjUtQzJeuCZwIeq6hnAI3T0K/uo9Y4l72TuP63fAjYk+dPRVjVcvYswOzl/3EBf3CC3PVizkpzHXJh/oqo+M+p6OvRc4GVJ7mHuMNnzk/zDaEvq1HHgeFU99hvV9cwFfAv+EPhmVd1fVT8FPgM8Z8Q1DcN9SS4A6H0/2cVKDfTFDXLbgzWpd3vjjwJ3VdV7Rl1Pl6rqbVW1uaommHvNvlBVzezlVdX3gO8keUpv1guAo4s8ZS35NnBpkif03qMvoJE/+J5h/u1SXg38Sxcr9SPoFnG22x6MuKyuPBe4Ajic5I7evLdX1U0jrEmDez3wid6Oxt3Aa0ZcTyeq6tYk1wO3M3cm1ldZ47cBSHIdMAVsSnIceCdwNfCpJFcC3wJe3sm2vPRfktrgIRdJaoSBLkmNMNAlqREGuiQ1wkCXpEYY6JLUCANdkhrxv6WaKquJv3ibAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# -1 means the user watched but didn't assign a weight\n",
    "rating[\"rating\"].hist()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "7Ry02FIsdC4W"
   },
   "source": [
    "As you can see (below), we don't have all of the movies in the rating table (which is natural, because we usually don't have ratings for all items). This means, we don't have labels for all user-item pairs, but only a subset. \n",
    "\n",
    "To consider this in the loss calculation, we can simply store a mask of the indices that are available. Previously I also quickly talked about masks, in the node-level prediction case. It is exactly the same here - we just want to perform predictions for the entities for which we have a label. \n",
    "\n",
    "As we have an edge-prediction problem here, we implicitly stored this mask already as edge_index. For each edge we know the label and therefore we only have to calculate the loss based on the edges we know. Later at inference time, we can also predict the edge attributes (labels) for other node pairs."
   ]
  },
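  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The masking idea above can be sketched with a few lines of NumPy. This is only a minimal illustration under assumptions that are not part of this notebook: the toy sizes, the random `pred` matrix standing in for a model output, and the helper name `masked_mse` are all hypothetical.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Toy setup (hypothetical sizes): 3 users, 4 movies, 5 known ratings\n",
    "edge_index = np.array([[0, 0, 1, 2, 2],   # user indices\n",
    "                       [1, 3, 0, 2, 3]])  # movie indices\n",
    "y = np.array([7.0, 9.0, 10.0, 8.0, 6.0])  # one label per edge\n",
    "\n",
    "# Stand-in for a model output: one score for every user-movie pair\n",
    "rng = np.random.default_rng(0)\n",
    "pred = rng.uniform(1, 10, size=(3, 4))\n",
    "\n",
    "def masked_mse(pred, edge_index, y):\n",
    "    # Pick only the scores of the known (user, movie) pairs\n",
    "    known = pred[edge_index[0], edge_index[1]]\n",
    "    # Mean squared error over those pairs only\n",
    "    return float(np.mean((known - y) ** 2))\n",
    "\n",
    "loss = masked_mse(pred, edge_index, y)\n",
    "```\n",
    "\n",
    "Pairs that never appear in `edge_index` contribute nothing to the loss, which is the effect a mask is meant to achieve."
   ]
  },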
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "hVuxPsUQcyAY",
    "outputId": "1a8268e2-11c2-4fd7-a560-1e8327745ad2"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([   20,    24,    79, ..., 29481, 34412, 30738])"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Movies that are part of our rating matrix\n",
    "rating[\"anime_id\"].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "zHg1gdAHdG_E",
    "outputId": "cf25a9ae-7309-4ddf-b497-08824ba13564"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([    1,     5,     6, ..., 34522, 34525, 34527])"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# All movie IDs (e.g. no rating above for 1, 5, 6...)\n",
    "anime[\"anime_id\"].sort_values().unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ki7Pk8CYurac",
    "outputId": "39f5c520-d29b-4c39-f9b2-77e40ed36d7b"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{30913, 30924, 20261}\n"
     ]
    }
   ],
   "source": [
    "# We can also see that there are some movies in the rating matrix, for which we have no features (we will drop them here)\n",
    "print(set(rating[\"anime_id\"].unique()) - set(anime[\"anime_id\"].unique()))\n",
    "rating = rating[~rating[\"anime_id\"].isin([30913, 30924, 20261])]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "yAm5Ml5Aftsz",
    "outputId": "e6003070-9fc9-45f0-bd25-46b1ff009809"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7813732     7\n",
       "7813733     9\n",
       "7813734    10\n",
       "7813735     9\n",
       "7813736     9\n",
       "Name: rating, dtype: int64"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Extract labels\n",
    "labels = rating[\"rating\"]\n",
    "labels.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "IlLh1xo2QQsX",
    "outputId": "54136fb9-076a-4835-dc01-77fb2e896c31"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(7813727,)"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert to numpy\n",
    "y = labels.to_numpy()\n",
    "y.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "E1irlsOvQQsY"
   },
   "source": [
    "`Step 6`: Extract the edges\n",
    "\n",
    "In this example, the edges are already implicitly provided by the rating matrix. The important part however is that we need to use the remappings from before to align the IDs of the dataframes.\n",
    "\n",
    "For each entry in the matrix, we have exactly one edge, between the user_id and the anime_id. Therefore, the edge index is exactly the same as what we have calculated in the cell above. \n",
    "\n",
    "The edge index can later be used in the model to mask out all edges for which we have no information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 223
    },
    "id": "T_XpRMnGvrZH",
    "outputId": "a7e74c28-f12d-42ee-92f1-692dee404016"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Before remapping...\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-d1ffb9d4-c44e-4b7b-93ee-1942719fa965\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>anime_id</th>\n",
       "      <th>rating</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>20</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>24</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>79</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>226</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>241</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-d1ffb9d4-c44e-4b7b-93ee-1942719fa965')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-d1ffb9d4-c44e-4b7b-93ee-1942719fa965 button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-d1ffb9d4-c44e-4b7b-93ee-1942719fa965');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   user_id  anime_id  rating\n",
       "0        1        20      -1\n",
       "1        1        24      -1\n",
       "2        1        79      -1\n",
       "3        1       226      -1\n",
       "4        1       241      -1"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print(\"Before remapping...\")\n",
    "rating.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 223
    },
    "id": "dR89IN3RvbcT",
    "outputId": "20dd23c9-3b84-484b-8ae9-40d6ed89e140"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "After remapping...\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-fdc2f720-a364-4255-ba7d-7db8c49e1a86\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>anime_id</th>\n",
       "      <th>rating</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>10</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>14</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>58</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>202</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>217</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-fdc2f720-a364-4255-ba7d-7db8c49e1a86')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-fdc2f720-a364-4255-ba7d-7db8c49e1a86 button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-fdc2f720-a364-4255-ba7d-7db8c49e1a86');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   user_id  anime_id  rating\n",
       "0        0        10      -1\n",
       "1        0        14      -1\n",
       "2        0        58      -1\n",
       "3        0       202      -1\n",
       "4        0       217      -1"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Map anime IDs \n",
    "movie_map = movie_id_mapping.reset_index().set_index(\"anime_id\").to_dict()\n",
    "rating[\"anime_id\"] = rating[\"anime_id\"].map(movie_map[\"index\"]).astype(int)\n",
    "# Map user IDs\n",
    "user_map = user_id_mapping.reset_index().set_index(\"user_id\").to_dict()\n",
    "rating[\"user_id\"] = rating[\"user_id\"].map(user_map[\"index\"]).astype(int)\n",
    "\n",
    "print(\"After remapping...\")\n",
    "rating.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "XbAsReH_QQsY",
    "outputId": "6d97d199-8f76-4a18-8e55-d5ce61498fc7"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[    0,     0,     0, ..., 73513, 73514, 73514],\n",
       "       [   10,    14,    58, ...,  8624,   718,  5226]])"
      ]
     },
     "execution_count": 117,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "edge_index = rating[[\"user_id\", \"anime_id\"]].values.transpose()\n",
    "edge_index # [2 x num_edges] "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZeZgk4thQQsb"
   },
   "source": [
    "`Step 7`: Build the dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JT4bIKi1QQsc"
   },
   "source": [
    "Now we have all the components we need to build a graph for libraries like Pytorch Geometric or DGL. I won't install these libraries here, as this will make the notebook too bulky, but here are some code snippets for the final steps.\n",
    "\n",
    "For Heterogenous Graphs we need to store the individual matrices in `HeteroData` objects, which can hold multiple node/edge matrices. There is also a great tutorial for [heterogenous graphs in PyG](https://pytorch-geometric.readthedocs.io/en/latest/notes/heterogeneous.html#creating-heterogeneous-graphs)\n",
    "\n",
    "```\n",
    "from torch_geometric.data import HeteroData\n",
    "data = HeteroData()\n",
    "data['user'].x = user_node_features\n",
    "data['movie'].x = anime_node_features\n",
    "```\n",
    "\n",
    "If you have different edge types between the nodes, you can also consider this here. In the example above we only have one type, therefore the edge_index looks like this (a triplet):\n",
    "\n",
    "```\n",
    "data['user', 'rating', movie'].edge_index = edge_index\n",
    "```\n",
    "Finally, we can add the labels of the link-prediction setup like this. In Heterogenous graphs you can also have different labels between different entities, but here we just have one type. If you build recommender systems specifically, then you might also find this tutorial on [bipartite grphs](https://pytorch-geometric.readthedocs.io/en/latest/notes/batching.html#bipartite-graphs) helpful.\n",
    "\n",
    "\n",
    "```\n",
    "data['user', 'movie'].y = y\n",
    "```\n",
    "\n",
    "For more information on how to handle heterogenous graphs, please refer to the documentation linked above.\n"
   ]
  },
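  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before handing the arrays to a library such as PyTorch Geometric, it can help to cast and validate them. The sketch below uses only NumPy. The helper name `prepare_graph_arrays` and the toy inputs are hypothetical, and treating the ratings as a float regression target is an assumption. PyG expects `edge_index` as 64-bit integers of shape [2, num_edges].\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def prepare_graph_arrays(user_x, movie_x, edge_index, y):\n",
    "    # Hypothetical helper: cast and validate arrays before tensor conversion\n",
    "    user_x = np.asarray(user_x, dtype=np.float32)\n",
    "    movie_x = np.asarray(movie_x, dtype=np.float32)\n",
    "    edge_index = np.asarray(edge_index, dtype=np.int64)\n",
    "    y = np.asarray(y, dtype=np.float32)\n",
    "\n",
    "    # edge_index must be [2 x num_edges], with one label per edge\n",
    "    assert edge_index.ndim == 2 and edge_index.shape[0] == 2\n",
    "    assert edge_index.shape[1] == y.shape[0]\n",
    "    # Every index must point at an existing node feature row\n",
    "    assert edge_index.min() >= 0\n",
    "    assert edge_index[0].max() < user_x.shape[0]\n",
    "    assert edge_index[1].max() < movie_x.shape[0]\n",
    "    return user_x, movie_x, edge_index, y\n",
    "\n",
    "# Toy inputs: 2 users with 3 features, 3 movies with 2 features\n",
    "user_x, movie_x, edge_index, y = prepare_graph_arrays(\n",
    "    user_x=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],\n",
    "    movie_x=[[1, 0], [0, 1], [1, 1]],\n",
    "    edge_index=[[0, 0, 1], [0, 2, 1]],\n",
    "    y=[7, 9, 10],\n",
    ")\n",
    "```\n",
    "\n",
    "The returned arrays can then be converted with `torch.from_numpy` and assigned to the `HeteroData` object as in the snippets above."
   ]
  },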
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nLCEck1Q_WRM"
   },
   "source": [
    "# 2. Tabular dataset -> Temporal Graph dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "j2OfHi-IAJ0p"
   },
   "source": [
    "In your tabular dataset identify the following:\n",
    "\n",
    "- Nodes (Items, People, Locations, ...)\n",
    "- Edges (Connections, Interactions, Correlations, ...)\n",
    "- Node Features (Attributes)\n",
    "- Labels (Node-level, edge-level, graph-level)\n",
    "- Timesteps (Are they already defined? What is one timestep?) --> e.g. 60 min Interval\n",
    "- Temporal graph shape: static or dynamic? / What is changing over time?\n",
    "\n",
    "and optionally:\n",
    "- Edge weights (Strength of the connection, number of interactions, ...)\n",
    "- Edge features (Additional properties describing the edge)\n",
    "\n"
   ]
  },
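  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the timestep and edge-weight items from the list above more concrete, here is a minimal pandas sketch. The toy trips and the column names `timestamp`, `src` and `dst` are hypothetical placeholders, not the columns of a specific dataset. It buckets interactions into 60-minute intervals and counts them per node pair, which gives one weighted edge list per timestep.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy interaction log; the column names are hypothetical placeholders\n",
    "trips = pd.DataFrame({\n",
    "    'timestamp': pd.to_datetime([\n",
    "        '2013-06-01 00:00:01', '2013-06-01 00:00:08',\n",
    "        '2013-06-01 00:40:00', '2013-06-01 01:05:00',\n",
    "    ]),\n",
    "    'src': [444, 444, 406, 444],\n",
    "    'dst': [434, 434, 406, 434],\n",
    "})\n",
    "\n",
    "# One timestep = one 60-minute interval\n",
    "trips['timestep'] = trips['timestamp'].dt.floor('60min')\n",
    "\n",
    "# Edge weight = number of interactions per (timestep, src, dst)\n",
    "edges = (\n",
    "    trips.groupby(['timestep', 'src', 'dst'])\n",
    "    .size()\n",
    "    .reset_index(name='weight')\n",
    ")\n",
    "\n",
    "# One weighted edge list per timestep (a snapshot of a dynamic graph)\n",
    "snapshots = {\n",
    "    t: g[['src', 'dst', 'weight']].to_numpy()\n",
    "    for t, g in edges.groupby('timestep')\n",
    "}\n",
    "```\n",
    "\n",
    "Here the edges change from one timestep to the next, so this corresponds to the dynamic case. With a fixed edge set and only changing features or labels, the graph shape would be static."
   ]
  },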
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "uObAqRq4j-DV"
   },
   "source": [
    "`Example 3 / Step 3`\n",
    "\n",
    "I selected a random time-series dataset from the internet, which we try to convert to a temporal graph dataset. The dataset contains the [trip history of bikers in New York City](https://ride.citibikenyc.com/system-data). Luckily the start end end locations already have an ID, otherwise we would have to map the addresses to locations. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 276
    },
    "id": "vwzwDuXl_Zl5",
    "outputId": "59f34e46-c85b-40ae-cd39-b0389eba5f30"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Archive:  201306-citibike-tripdata.zip\n",
      "  inflating: 201306-citibike-tripdata.csv  \n",
      "   creating: __MACOSX/\n",
      "  inflating: __MACOSX/._201306-citibike-tripdata.csv  \n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-3438a2f8-b3de-49a6-8a0b-82bd166365fe\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>tripduration</th>\n",
       "      <th>starttime</th>\n",
       "      <th>stoptime</th>\n",
       "      <th>start station id</th>\n",
       "      <th>start station name</th>\n",
       "      <th>start station latitude</th>\n",
       "      <th>start station longitude</th>\n",
       "      <th>end station id</th>\n",
       "      <th>end station name</th>\n",
       "      <th>end station latitude</th>\n",
       "      <th>end station longitude</th>\n",
       "      <th>bikeid</th>\n",
       "      <th>usertype</th>\n",
       "      <th>birth year</th>\n",
       "      <th>gender</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>695</td>\n",
       "      <td>2013-06-01 00:00:01</td>\n",
       "      <td>2013-06-01 00:11:36</td>\n",
       "      <td>444</td>\n",
       "      <td>Broadway &amp; W 24 St</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>434.0</td>\n",
       "      <td>9 Ave &amp; W 18 St</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>19678</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>693</td>\n",
       "      <td>2013-06-01 00:00:08</td>\n",
       "      <td>2013-06-01 00:11:41</td>\n",
       "      <td>444</td>\n",
       "      <td>Broadway &amp; W 24 St</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>434.0</td>\n",
       "      <td>9 Ave &amp; W 18 St</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>16649</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1984.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2059</td>\n",
       "      <td>2013-06-01 00:00:44</td>\n",
       "      <td>2013-06-01 00:35:03</td>\n",
       "      <td>406</td>\n",
       "      <td>Hicks St &amp; Montague St</td>\n",
       "      <td>40.695128</td>\n",
       "      <td>-73.995951</td>\n",
       "      <td>406.0</td>\n",
       "      <td>Hicks St &amp; Montague St</td>\n",
       "      <td>40.695128</td>\n",
       "      <td>-73.995951</td>\n",
       "      <td>19599</td>\n",
       "      <td>Customer</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>123</td>\n",
       "      <td>2013-06-01 00:01:04</td>\n",
       "      <td>2013-06-01 00:03:07</td>\n",
       "      <td>475</td>\n",
       "      <td>E 15 St &amp; Irving Pl</td>\n",
       "      <td>40.735243</td>\n",
       "      <td>-73.987586</td>\n",
       "      <td>262.0</td>\n",
       "      <td>Washington Park</td>\n",
       "      <td>40.691782</td>\n",
       "      <td>-73.973730</td>\n",
       "      <td>16352</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1960.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1521</td>\n",
       "      <td>2013-06-01 00:01:22</td>\n",
       "      <td>2013-06-01 00:26:43</td>\n",
       "      <td>2008</td>\n",
       "      <td>Little West St &amp; 1 Pl</td>\n",
       "      <td>40.705693</td>\n",
       "      <td>-74.016777</td>\n",
       "      <td>310.0</td>\n",
       "      <td>State St &amp; Smith St</td>\n",
       "      <td>40.689269</td>\n",
       "      <td>-73.989129</td>\n",
       "      <td>15567</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-3438a2f8-b3de-49a6-8a0b-82bd166365fe')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-3438a2f8-b3de-49a6-8a0b-82bd166365fe button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-3438a2f8-b3de-49a6-8a0b-82bd166365fe');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   tripduration            starttime             stoptime  start station id  \\\n",
       "0           695  2013-06-01 00:00:01  2013-06-01 00:11:36               444   \n",
       "1           693  2013-06-01 00:00:08  2013-06-01 00:11:41               444   \n",
       "2          2059  2013-06-01 00:00:44  2013-06-01 00:35:03               406   \n",
       "3           123  2013-06-01 00:01:04  2013-06-01 00:03:07               475   \n",
       "4          1521  2013-06-01 00:01:22  2013-06-01 00:26:43              2008   \n",
       "\n",
       "       start station name  start station latitude  start station longitude  \\\n",
       "0      Broadway & W 24 St               40.742354               -73.989151   \n",
       "1      Broadway & W 24 St               40.742354               -73.989151   \n",
       "2  Hicks St & Montague St               40.695128               -73.995951   \n",
       "3     E 15 St & Irving Pl               40.735243               -73.987586   \n",
       "4   Little West St & 1 Pl               40.705693               -74.016777   \n",
       "\n",
       "   end station id        end station name  end station latitude  \\\n",
       "0           434.0         9 Ave & W 18 St             40.743174   \n",
       "1           434.0         9 Ave & W 18 St             40.743174   \n",
       "2           406.0  Hicks St & Montague St             40.695128   \n",
       "3           262.0         Washington Park             40.691782   \n",
       "4           310.0     State St & Smith St             40.689269   \n",
       "\n",
       "   end station longitude  bikeid    usertype  birth year  gender  \n",
       "0             -74.003664   19678  Subscriber      1983.0       1  \n",
       "1             -74.003664   16649  Subscriber      1984.0       1  \n",
       "2             -73.995951   19599    Customer         NaN       0  \n",
       "3             -73.973730   16352  Subscriber      1960.0       1  \n",
       "4             -73.989129   15567  Subscriber      1983.0       1  "
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "!wget -q http://s3.amazonaws.com/tripdata/201306-citibike-tripdata.zip\n",
    "!unzip -o 201306-citibike-tripdata.zip\n",
    "\n",
    "trips = pd.read_csv(\"201306-citibike-tripdata.csv\")  \n",
    "trips.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "I-w__1dW_Bib",
    "outputId": "f1e8c5c2-dd08-41d0-ac02-dceb887db8b3"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-c7bb43a9-1f1f-41fe-9896-4cb1673077b9\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>tripduration</th>\n",
       "      <th>starttime</th>\n",
       "      <th>stoptime</th>\n",
       "      <th>start station id</th>\n",
       "      <th>start station latitude</th>\n",
       "      <th>start station longitude</th>\n",
       "      <th>end station id</th>\n",
       "      <th>end station latitude</th>\n",
       "      <th>end station longitude</th>\n",
       "      <th>bikeid</th>\n",
       "      <th>usertype</th>\n",
       "      <th>birth year</th>\n",
       "      <th>gender</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>695</td>\n",
       "      <td>2013-06-01 00:00:01</td>\n",
       "      <td>2013-06-01 00:11:36</td>\n",
       "      <td>444</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>434.0</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>19678</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>693</td>\n",
       "      <td>2013-06-01 00:00:08</td>\n",
       "      <td>2013-06-01 00:11:41</td>\n",
       "      <td>444</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>434.0</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>16649</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1984.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>123</td>\n",
       "      <td>2013-06-01 00:01:04</td>\n",
       "      <td>2013-06-01 00:03:07</td>\n",
       "      <td>475</td>\n",
       "      <td>40.735243</td>\n",
       "      <td>-73.987586</td>\n",
       "      <td>262.0</td>\n",
       "      <td>40.691782</td>\n",
       "      <td>-73.973730</td>\n",
       "      <td>16352</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1960.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1521</td>\n",
       "      <td>2013-06-01 00:01:22</td>\n",
       "      <td>2013-06-01 00:26:43</td>\n",
       "      <td>2008</td>\n",
       "      <td>40.705693</td>\n",
       "      <td>-74.016777</td>\n",
       "      <td>310.0</td>\n",
       "      <td>40.689269</td>\n",
       "      <td>-73.989129</td>\n",
       "      <td>15567</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2057</td>\n",
       "      <td>2013-06-01 00:02:33</td>\n",
       "      <td>2013-06-01 00:36:50</td>\n",
       "      <td>285</td>\n",
       "      <td>40.734546</td>\n",
       "      <td>-73.990741</td>\n",
       "      <td>532.0</td>\n",
       "      <td>40.710451</td>\n",
       "      <td>-73.960876</td>\n",
       "      <td>15693</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1991.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c7bb43a9-1f1f-41fe-9896-4cb1673077b9')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-c7bb43a9-1f1f-41fe-9896-4cb1673077b9 button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-c7bb43a9-1f1f-41fe-9896-4cb1673077b9');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   tripduration            starttime             stoptime  start station id  \\\n",
       "0           695  2013-06-01 00:00:01  2013-06-01 00:11:36               444   \n",
       "1           693  2013-06-01 00:00:08  2013-06-01 00:11:41               444   \n",
       "3           123  2013-06-01 00:01:04  2013-06-01 00:03:07               475   \n",
       "4          1521  2013-06-01 00:01:22  2013-06-01 00:26:43              2008   \n",
       "6          2057  2013-06-01 00:02:33  2013-06-01 00:36:50               285   \n",
       "\n",
       "   start station latitude  start station longitude  end station id  \\\n",
       "0               40.742354               -73.989151           434.0   \n",
       "1               40.742354               -73.989151           434.0   \n",
       "3               40.735243               -73.987586           262.0   \n",
       "4               40.705693               -74.016777           310.0   \n",
       "6               40.734546               -73.990741           532.0   \n",
       "\n",
       "   end station latitude  end station longitude  bikeid    usertype  \\\n",
       "0             40.743174             -74.003664   19678  Subscriber   \n",
       "1             40.743174             -74.003664   16649  Subscriber   \n",
       "3             40.691782             -73.973730   16352  Subscriber   \n",
       "4             40.689269             -73.989129   15567  Subscriber   \n",
       "6             40.710451             -73.960876   15693  Subscriber   \n",
       "\n",
       "   birth year  gender  \n",
       "0      1983.0       1  \n",
       "1      1984.0       1  \n",
       "3      1960.0       1  \n",
       "4      1983.0       1  \n",
       "6      1991.0       1  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Remove columns we don't need / rows with Nan\n",
    "cols_to_drop = [\"start station name\", \"end station name\", ]\n",
    "trips.dropna(inplace=True)\n",
    "trips.drop(cols_to_drop, axis=1, inplace=True)\n",
    "trips.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "8mosGnR3i2LE",
    "outputId": "de2e967b-0ec7-44b5-9c50-e0f7b6d38198"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-2256d2b2-63d6-46de-b525-c27383454dc8\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>tripduration</th>\n",
       "      <th>starttime</th>\n",
       "      <th>stoptime</th>\n",
       "      <th>start station id</th>\n",
       "      <th>start station latitude</th>\n",
       "      <th>start station longitude</th>\n",
       "      <th>end station id</th>\n",
       "      <th>end station latitude</th>\n",
       "      <th>end station longitude</th>\n",
       "      <th>bikeid</th>\n",
       "      <th>usertype</th>\n",
       "      <th>birth year</th>\n",
       "      <th>gender</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>695</td>\n",
       "      <td>2013-06-01 00:00:01</td>\n",
       "      <td>2013-06-01 00:11:36</td>\n",
       "      <td>0</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>299</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>19678</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>693</td>\n",
       "      <td>2013-06-01 00:00:08</td>\n",
       "      <td>2013-06-01 00:11:41</td>\n",
       "      <td>0</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>299</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>16649</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1984.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>123</td>\n",
       "      <td>2013-06-01 00:01:04</td>\n",
       "      <td>2013-06-01 00:03:07</td>\n",
       "      <td>1</td>\n",
       "      <td>40.735243</td>\n",
       "      <td>-73.987586</td>\n",
       "      <td>187</td>\n",
       "      <td>40.691782</td>\n",
       "      <td>-73.973730</td>\n",
       "      <td>16352</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1960.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1521</td>\n",
       "      <td>2013-06-01 00:01:22</td>\n",
       "      <td>2013-06-01 00:26:43</td>\n",
       "      <td>2</td>\n",
       "      <td>40.705693</td>\n",
       "      <td>-74.016777</td>\n",
       "      <td>257</td>\n",
       "      <td>40.689269</td>\n",
       "      <td>-73.989129</td>\n",
       "      <td>15567</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2057</td>\n",
       "      <td>2013-06-01 00:02:33</td>\n",
       "      <td>2013-06-01 00:36:50</td>\n",
       "      <td>3</td>\n",
       "      <td>40.734546</td>\n",
       "      <td>-73.990741</td>\n",
       "      <td>280</td>\n",
       "      <td>40.710451</td>\n",
       "      <td>-73.960876</td>\n",
       "      <td>15693</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1991.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2256d2b2-63d6-46de-b525-c27383454dc8')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-2256d2b2-63d6-46de-b525-c27383454dc8 button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-2256d2b2-63d6-46de-b525-c27383454dc8');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   tripduration            starttime             stoptime  start station id  \\\n",
       "0           695  2013-06-01 00:00:01  2013-06-01 00:11:36                 0   \n",
       "1           693  2013-06-01 00:00:08  2013-06-01 00:11:41                 0   \n",
       "3           123  2013-06-01 00:01:04  2013-06-01 00:03:07                 1   \n",
       "4          1521  2013-06-01 00:01:22  2013-06-01 00:26:43                 2   \n",
       "6          2057  2013-06-01 00:02:33  2013-06-01 00:36:50                 3   \n",
       "\n",
       "   start station latitude  start station longitude  end station id  \\\n",
       "0               40.742354               -73.989151             299   \n",
       "1               40.742354               -73.989151             299   \n",
       "3               40.735243               -73.987586             187   \n",
       "4               40.705693               -74.016777             257   \n",
       "6               40.734546               -73.990741             280   \n",
       "\n",
       "   end station latitude  end station longitude  bikeid    usertype  \\\n",
       "0             40.743174             -74.003664   19678  Subscriber   \n",
       "1             40.743174             -74.003664   16649  Subscriber   \n",
       "3             40.691782             -73.973730   16352  Subscriber   \n",
       "4             40.689269             -73.989129   15567  Subscriber   \n",
       "6             40.710451             -73.960876   15693  Subscriber   \n",
       "\n",
       "   birth year  gender  \n",
       "0      1983.0       1  \n",
       "1      1984.0       1  \n",
       "3      1960.0       1  \n",
       "4      1983.0       1  \n",
       "6      1991.0       1  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Reassign the location IDs (makes it easier later, because here the IDs didn't start at 0)\n",
    "locations = trips['start station id'].unique()\n",
    "new_ids = list(range(len(trips['start station id'].unique())))\n",
    "mapping = dict(zip(locations, new_ids))\n",
    "\n",
    "trips['start station id'] = trips['start station id'].map(mapping)\n",
    "trips['end station id'] = trips['end station id'].map(mapping)\n",
    "trips.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MwVpsPRU-0ht"
   },
   "source": [
    "**What could we use this dataset for?**\n",
    "\n",
    "We could use it for example to predict the trip duration between two locations, based on the surrounding (structural) and temporal trip durations. Another possibility is to predict the bike traffic in terms of how many bikers we expect on a route. \n",
    "\n",
    "- `Nodes` - The locations between which the bikers can travel\n",
    "- `Node Features` - Attributes about a location, for example average number of bikers that start / end here, average traffic, ... (whatever is available)\n",
    "\n",
    "For this dataset, we don't have edge (adjacency) information yet. This is often the difficult part with graph datasets. For traffic networks, you typically connect the nodes according to their closeness or even the underlying street network, i.e. if there is a direct road between two locations. Here we have the latitude/longitude for each of the addresses and with that we are able to calculate the distances. \n",
    "\n",
    "- `Edges / Edge weights` - Proximity between two addresses (available) / Connection according to road network (not available here)\n",
    "- `Edge features` - To incorporate the individual bike information, we can simply model this as edge features i.e. number of bikes, average trip duration, ...\n",
    "- `Labels` - The trip duration, which makes it a link-prediction task \n",
    "\n",
    "The more or less difficult part with temporal graph datasets, is to define a stepsize for the temporal snapshots. In the dataset above we have random trips from one location to another, without a pre-defined discrete interval. \n",
    "An easy approach is for example to define X minute intervals and build a graph out of all trips that happened (ended) in this timeframe. The \"ended\" part is important, because we don't want to leak information into the future timesteps.\n",
    "\n",
    "- `Timesteps` - 60 min steps (we have 1 month of data = ~ 700 graphs)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "S-5rJdQy-0hu"
   },
   "source": [
    "`Building the temporal dataset ...`\n",
    "\n",
    "First make sure the dataset is sorted in time, so that we can iterate over the 60-min intervals. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "ovjWWRTmF1lm",
    "outputId": "1f1b5e2f-6111-477a-8106-48e7a0cd17b7"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-277625e6-0676-470d-a36b-81311f56d500\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>tripduration</th>\n",
       "      <th>starttime</th>\n",
       "      <th>stoptime</th>\n",
       "      <th>start station id</th>\n",
       "      <th>start station latitude</th>\n",
       "      <th>start station longitude</th>\n",
       "      <th>end station id</th>\n",
       "      <th>end station latitude</th>\n",
       "      <th>end station longitude</th>\n",
       "      <th>bikeid</th>\n",
       "      <th>usertype</th>\n",
       "      <th>birth year</th>\n",
       "      <th>gender</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>695</td>\n",
       "      <td>2013-06-01 00:00:01</td>\n",
       "      <td>2013-06-01 00:11:36</td>\n",
       "      <td>0</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>299</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>19678</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>693</td>\n",
       "      <td>2013-06-01 00:00:08</td>\n",
       "      <td>2013-06-01 00:11:41</td>\n",
       "      <td>0</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>299</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>-74.003664</td>\n",
       "      <td>16649</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1984.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>123</td>\n",
       "      <td>2013-06-01 00:01:04</td>\n",
       "      <td>2013-06-01 00:03:07</td>\n",
       "      <td>1</td>\n",
       "      <td>40.735243</td>\n",
       "      <td>-73.987586</td>\n",
       "      <td>187</td>\n",
       "      <td>40.691782</td>\n",
       "      <td>-73.973730</td>\n",
       "      <td>16352</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1960.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1521</td>\n",
       "      <td>2013-06-01 00:01:22</td>\n",
       "      <td>2013-06-01 00:26:43</td>\n",
       "      <td>2</td>\n",
       "      <td>40.705693</td>\n",
       "      <td>-74.016777</td>\n",
       "      <td>257</td>\n",
       "      <td>40.689269</td>\n",
       "      <td>-73.989129</td>\n",
       "      <td>15567</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1983.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2057</td>\n",
       "      <td>2013-06-01 00:02:33</td>\n",
       "      <td>2013-06-01 00:36:50</td>\n",
       "      <td>3</td>\n",
       "      <td>40.734546</td>\n",
       "      <td>-73.990741</td>\n",
       "      <td>280</td>\n",
       "      <td>40.710451</td>\n",
       "      <td>-73.960876</td>\n",
       "      <td>15693</td>\n",
       "      <td>Subscriber</td>\n",
       "      <td>1991.0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   tripduration            starttime             stoptime  start station id  \\\n",
       "0           695  2013-06-01 00:00:01  2013-06-01 00:11:36                 0   \n",
       "1           693  2013-06-01 00:00:08  2013-06-01 00:11:41                 0   \n",
       "3           123  2013-06-01 00:01:04  2013-06-01 00:03:07                 1   \n",
       "4          1521  2013-06-01 00:01:22  2013-06-01 00:26:43                 2   \n",
       "6          2057  2013-06-01 00:02:33  2013-06-01 00:36:50                 3   \n",
       "\n",
       "   start station latitude  start station longitude  end station id  \\\n",
       "0               40.742354               -73.989151             299   \n",
       "1               40.742354               -73.989151             299   \n",
       "3               40.735243               -73.987586             187   \n",
       "4               40.705693               -74.016777             257   \n",
       "6               40.734546               -73.990741             280   \n",
       "\n",
       "   end station latitude  end station longitude  bikeid    usertype  \\\n",
       "0             40.743174             -74.003664   19678  Subscriber   \n",
       "1             40.743174             -74.003664   16649  Subscriber   \n",
       "3             40.691782             -73.973730   16352  Subscriber   \n",
       "4             40.689269             -73.989129   15567  Subscriber   \n",
       "6             40.710451             -73.960876   15693  Subscriber   \n",
       "\n",
       "   birth year  gender  \n",
       "0      1983.0       1  \n",
       "1      1984.0       1  \n",
       "3      1960.0       1  \n",
       "4      1983.0       1  \n",
       "6      1991.0       1  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trips = trips.sort_values(by=\"starttime\")\n",
    "trips.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kBG1M1tPGe2G"
   },
   "source": [
    "Next, we iterate over the dataframe and select all trips that fall into each 60-minute interval.\n",
    "\n",
    "First, a quick check: how many trips will each bucket contain? This number determines how many edges we have information about in each individual graph.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 411
    },
    "id": "0E-JPOHwGeP5",
    "outputId": "a3c83c71-39d6-42c4-9ac9-4c4772be0e38"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f15e93ed3d0>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABKAAAAF5CAYAAACoUFL8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzdfXScdZ3//9dMMpkkk0mnuWmb3qQt0WBWFroU7FJt5VvodxHdX4/ITUVWXa0HrYAHF6yLCv0CiqUcXVla6+/3ZXEXXCuIWBAEq2zFpXJTW0C0LlKBFNrSNO10mvtJZn5/JDOdJDOTubnuZq7n4xzPkUyT+cw1n+tzfa739f68P554PB4XAAAAAAAAYBKv3Q0AAAAAAABAeSMABQAAAAAAAFMRgAIAAAAAAICpCEABAAAAAADAVASgAAAAAAAAYCoCUAAAAAAAADAVASgAAAAAAACYqtLuBtjp2LFexWJxu5tRlMbGOnV399jdDJQQ+gzyRZ9BvugzyBd9BvmizyBf9Bnkiz5TGK/Xo+nTA2lfc3UAKhaLl3wASlJZfAZYiz6DfNFnkC/6DPJFn0G+6DPIF30G+aLPGIsleAAAAAAAADAVASgAAAAAAACYigAUAAAAAAAATEUACgAAAAAAAKYiAAUAAAAAAABTEYACAAAAAACAqQhAAQAAAAAAwFSVVrzJsWPH9KUvfUmdnZ2qqqrS/PnzdfPNN6uhoUGnnnqq2tvb5fWOxsJuv/12nXrqqZKkJ598UrfffrtGRkb07ne/W7fddptqamqmfA0AAAAAAADOYUkGlMfj0Zo1a/TEE0/okUce0bx583THHXckX9+6dau2bdumbdu2JYNPvb29+trXvqYtW7Zo+/btCgQCuvvuu6d8DQAAAAAAAM5iSQAqFAppyZIlyf9etGiRDhw4kPV3nnrqKZ122mlasGCBJGn16tX6+c9/PuVrAAAAAABYxiNF+qPq7OpVZGBY8tjdIMCZLFmClyoWi+mHP/yhVqxYkfzZP/zDP2hkZETLly/X1VdfraqqKh08eFCzZ89O/pvZs2fr4MGDkpT1NQAAAAAALOGR9nYe1533v6DB6Ij8vgpdc+kidbROk+J2Nw5wFssDULfccotqa2t1xRVXSJJ27NihlpYW9fT06Prrr9emTZt07bXXWtKWxsY6S97HbM3NQbubgBJDn0G+6DPIF30G+aLPIF/0GeTLjD7z1uGeZPBJkgajI7rz/hf0nS+eqzkzyuN+080YZ4xlaQBqw4YNeuONN7Rly5Zk0fGWlhZJUl1dnS655BLdc889yZ8/++yzyd89cOBA8t9mey0f3d09isVKOyzd3BxUV9cJu5uBEkKfQb7oM8gXfQb5os8gX/QZ5MusPnPoSG8y+JQwGB3Roe4eVXlK+17T7RhnCuP1ejIm+1hSA0qSvvWtb+nll1/Wpk2bVFVVJUk6fvy4BgYGJEnDw8N64okn1NHRIUlatmyZfv/73+v111+XNFqo/AMf+MCUrwEAAAAAYIVQ0C+/r2Lcz/y+CoUCVTa1CHAuSzKg/vznP+t73/ueFixYoNWrV0uS5s6dqzVr1ujGG2+Ux+PR8PCw/uZv/kZf+MIXJI1mRN1888268sorFYvF1NHRoa985StTvgYAAAAAgBXqayp1zaWLJtWAqq/1UQMKmMATj8dde1qwBA9uRJ9BvugzyBd9BvmizyBf9Bnky9Q+45EifVGFe4cUClQRfCoTjDOFybYEz/Ii5AAAAAAAlI24VF/jU32NL/nfACazrAYUAAAAAAAA3IkAFAAAAAAAAExFAAoAAAAAAACmIgAFAAAAAAAAUxGAAgAAAAAAgKkIQAEAAAAAAMBUBKAAAAAAAABgKgJQAAAAAAAAMBUBKAAAAAAAAJiKABQAAAAAAABMRQAKAAAAAAAApiIABQAAAAAoPR4p0h9VZ1evIgPDksfuBgHIptLu
BgAAAAAAkBePtLfzuO68/wUNRkfk91XomksXqaN1mhS3u3EA0iEDCgAAAABQUiJ90WTwSZIGoyO68/4XFOmL2twyAJkQgAIAAAAAlJRwz1Ay+JQwGB1RuHfIphYBmAoBKAAAAABASQkF/fL7Ksb9zO+rUChQZVOLAEyFABQAAAAAoKTU11TqmksXJYNQiRpQ9bU+m1sGIBOKkAMAAAAASktc6midpg1rlyrcO6RQoGo0+EQBcsCxCEABAAAAAEpPXKqv8am+xpf8bwDOxRI8AAAAAAAAmIoAFAAAAAAAAExFAAoAAAAAAACmIgAFAAAAAAAAUxGAAgAAAAAAgKkIQAEAAAAAAMBUBKAAAAAAAABgKgJQAAAAAAAAMBUBKAAAAAAAAJiKABQAAAAAAABMRQAKAAAAAAAApiIABQAAAAAAAFMRgAIAAAAAAICpCEABAAAAAADAVASgAAAAAAAAYCoCUAAAAAAAADAVASgAAAAAAGAMjxTpj6qzq1eRgWHJY3eD4BSVdjcAAAAAAACUAY+0t/O47rz/BQ1GR+T3VeiaSxepo3WaFLe7cbAbGVAAAAAAAKBokb5oMvgkSYPREd15/wuK9EVtbhmcgAAUAAAAACA/LLNCGuGeoWTwKWEwOqJw75BNLYKTsAQPAAAAwGSe0WyGcM+QQkG/6msqWUKDUSyzQgahoF9+X8W4IJTfV6FQoMrGVsEpyIACAAAAMN5YgGHd5p1af/ezWrfpae3tPE6WCySxzAqZ1ddU6ppLF8nvq5CkZHCyvtZnc8vgBGRAAQAAABgnU4Bhw9qlqq/hRtLtsi2zon+42FjW5LSAT7deeY56B6IKBapGg09kxkEEoAAAAABMQIAB2bDMCpNkWJbZ2hwg+IQkluABAAAAGCcRYEhFgAEJLLPCRCzLRC7IgAIAAAAwTiLAMDGbgaU0kCTFpY7WadqwdqnCvUMsswJZk8gJASgAAAAA4xFgwFTiUn2N72Rwgb7haizLRC5YggcAAABgsrEAQ2tTYDTIQIABQAYsy0QuyIACAAAAAACFI2sSOSAABQAAAAAAisOyTEyBJXgAAAAAABTCI0X6o+rs6lVkYFjy2N0gwLksyYA6duyYvvSlL6mzs1NVVVWaP3++br75ZjU0NOiFF17QjTfeqMHBQc2ZM0cbN25UY2OjJBX8GgAAAAAApvJIezuPT9otsqN1Gtk/QBqWZEB5PB6tWbNGTzzxhB555BHNmzdPd9xxh2KxmK6//nrdeOONeuKJJ3TWWWfpjjvukKSCXwMAAAAAwGyRvmgy+CRJg9ER3Xn/C4r0RW1uGeBMlgSgQqGQlixZkvzvRYsW6cCBA3r55Zfl9/t11llnSZJWr16txx9/XJIKfg0AAAAAALOFe4aSwaeEweiIwr1DNrUIcDbLa0DFYjH98Ic/1IoVK3Tw4EHNnj07+VpDQ4NisZjC4XDBrwEAAAAAYLZQ0C+/r2Lcz/y+CoUCVTa1CHA2y3fBu+WWW1RbW6srrrhC27dvt/rtx2lsrLP1/Y3S3By0uwkoMfQZ5Is+g3zRZ5Av+gzyRZ9BvozuM42xuK796Jn69g93J2tAXfvRM7Vw7nR5vVQjLweMM8ayNAC1YcMGvfHGG9qyZYu8Xq9aWlp04MCB5OtHjx6V1+tVKBQq+LV8dHf3KBYr7epwzc1BdXWdsLsZKCH0GeSLPoN80WeQL/oM8kWfQb7M6jPtc4LasHapwr1DCgWqVF/rU3d3j+HvA+sxzhTG6/VkTPaxbAnet771Lb388svatGmTqqpGUxJPO+00DQwMaNeuXZKkrVu36oILLijqNQAAAAAALBGX6mt8am0KqL7Gx+53QBaWZED9+c9/1ve+9z0tWLBAq1evliTNnTtXmzZt0u23366bbrpJg4ODmjNnjjZu3ChJ8nq9Bb0GAAAAAAAAZ/HE43HXxmhZggc3os8gX/QZ5Is+g3zRZ5Av+gzyRZ9BvugzhXHEEjwAAAAAAAC4EwEo
AAAAAAAAmIoAFAAAAAAAAExFAAoAAAAAULo8UqQ/qs6uXkUGhiWP3Q0CkI4lu+ABAAAAAGA4j7S387juvP8FDUZH5PdV6JpLF6mjdZpU2vtNAWWHDCgAAAAAQEmK9EWTwSdJGoyO6M77X1CkL5r/HyOTCjAVGVAAAAAAgJIU7hlKBp8SBqMjCvcOqb7Gl/sfIpMKMB0ZUAAAAACAkhQK+uX3VYz7md9XoVCgKq+/Y2gmFYC0CEABAAAAAEpSfU2lrrl0UTIIlchcqq/NI/tJ2TOpSgZLCOFwLMEDAAAAAJSmuNTROk0b1i5VuHdIoUDVaPApz2VziUyq1CBUIZlUtmEJIUoAGVAAAAAAgNIVl+prfGptCozWfSog4GJUJpVdWEKIUkAGFAAAAADA3YzIpPKMBoLCPUMKBf2qr6m0LPvIsGLsgIkIQAEAAAAAMJZJlQzY5Bl8snMJXMkvIYQrsAQPAAAAAIAi2L0ErtSXEMIdyIACAAAAnMbGpTwA8mf7EjiDirEDZiIABQAAADgJu1kBJccRS+CKWUIIWIAleAAAAICD2L2UB3A9jxTpj6qzq1eRgWHJM/WvsAQOmBoZUAAAAICD2L6UB3CzQjMQWQIHTIkMKAAAAMBBEkt5UrGbFcpaARlHZikqA3FsCVxrU2A0WEzwCRiHDCgAAADAQRJLeSZmYJBNgbLksJpnjshAZBMClCkCUAAAAICTsJQHLpIp42jD2qW2LDm1vZi4wwJygJFYggcAAAA4DUt54BLZMo7sYHcxcTYhQDkjAwoAAAAAkD8DlorZnnE0kc0ZiI5YAgiYhAAUAAAAACA/Bi0Vc2TNs7EMxGTAx8J2JAJywYBPKxa3Sh7J6/GoIei3rhEJ1KKCwQhAAQAAAADyYljtpkIzjso0OFJfU6nrLj9Tb3b1aOv2V5JBuXkz6qytA0UtKpiAGlAAAAAAgLwYWrsp35pnY8GRdZt3av3dz2rdpqe1t/O45Mn/rR0nLjWHqpPBJ8meOlDUooIZCEABAAAAAPKSWCqWyqraTeUeHHFCYXYntAHlhwAUAAAAACAvdu4WV+7BETuDe05qA8oPNaAAAAAAAPmxcbc4x+2cZzAnFGZ3QhtQfghAAQAAAADyZ9NucWUfHLExuOeoNqDsEIACAAAA4FzpdjuDfZyw+5wbgiM2BfcmtWFsSWW4Z0jyeMpmt0HYg9EbAAAAgDN5pT+8HtZdD7w4LtOlsaHO7pa509jucxMzjzpap9kShLI9QFPunPR9oyxQhBwAAACA83ik/V19yeCTdHK3s4NHem1unDuV++5zGI/vG0YjAAUAAADAcSJ9Ue19/Wja3c6Onui3qVXuVu67z2E8vm8YjQAUAAAAAMcJ9wwpFlfareAbgjU2tcrdErvPpSqn3ecwHt83jEYACgAAAIDjhIJ+/WbPm7psZXvyJtjvq9BVl5yhlqaAza1zp8Tuc6nfR3L3OZQdvm8YjSLkAAAAABynvqZSV1zQofse36tVy9vk9UodCxo0r7lWXq/H7ua5kxt2n8NJfN8wGAEoAAAAAM4zdvO77orF429+Y3Y3zOXYfc5d+L5hIAJQAAAAAJyJm1938YwWnw/3DCkU9Ku+ppLvHCgjBKAAAAAAZEZQAFbwSHs7j+vO+1/QYHQkWW+oo3Ua/Q0oEwSgAAAAAKRHUAAWifRFk/1MkgajI7rz/he0Ye3SkxlwAEoau+ABAAAASCtTUCDSF7W5ZSg34Z6hZD9LGIyOKNw7ZFOLABiNABQAAACAtAgKwCqhoF9+X8W4n/l9FQoFqmxqEQCj5RSAGhkZ0bp16zQ0xIUGAAAAcAuCArBKfU2lrrl0UbK/JZZ71tey/C4vHinSH1VnV68iA8OSx+4GASflVAOqoqJCTz/9tDweei8AAABQ9sYKj/f0R3XVJWforgdeHFcDqr7WRw0oGCsudbRO04a1SxXuHVIoUEU/yxc12+BwORch/8QnPqF//dd/1dVXXy2fjyg0AAAA
UJYm3MS2NNbqhk+erXg8TlAA5opL9TW+k0XH6We580jdJ4Yo5A5HyzkAdd999+nIkSO655571NDQMC4baseOHWa0DQAAAIDFJhYeP9jdp298//mTN7EEBQBnGQsa7z98ImPNNgJQcIKcA1AbN240sx0AAABws7ElX+GeIYWCftXXVBLosEm2wuPcxALOkwgar3p/m/y+inHnLzXb4CQ5B6De8573mNkOAAAAuFQsFqduiYMkCo9zEwuUhkTQ+MldnbpsZbt+tP0V+2u28VABaeQcgPrOd76T8bUvfOELhjQGAAAA7nPwSC91SxwksRvZxIAgtZ8AZ0oEjY+EB/TYzte0anmbvF5p8akz1BissiX4xEMFpJNzAOrQoUPj/rurq0vPP/+8zj///Jx+f8OGDXriiSf01ltv6ZFHHlF7e7skacWKFaqqqpLf75ckXXfddVq2bJkk6YUXXtCNN96owcFBzZkzRxs3blRjY+OUrwEAAKB0HI30s+TLSdiNDCgpqUHjI+EBbXtqn665dJE9wSdNriPHQwUk5ByAuu222yb97KmnntKjjz6a0++fd955+vjHP66Pfexjk1678847kwGphFgspuuvv1633XabzjrrLG3evFl33HGHbrvttqyvAQAAoLQ01New5Mtp2I0MKB0OCxpTRw6ZeIv55fe973365S9/mdO/Peuss9TS0pLz33755Zfl9/t11llnSZJWr16txx9/fMrXAAAAUFpamgK65tJF8vsqJGn8ki8380iR/qg6u3oVGRiWPFP/CgCXGgsatzYFbN+tMrEkMBUPFSDlkQG1f//+cf/d39+vn/3sZ3kFlTK57rrrFI/HtXjxYn3xi19UfX29Dh48qNmzZyf/TUNDg2KxmMLhcNbXQqFQ0e0BAACAdbxej6Oe3jsCNVQAlCjqyCGTnANQK1eulMfjUTw+2mNqamrU0dGhb37zm0U14Ac/+IFaWlo0NDSkr3/967r55pt1xx13FPU3c9XYWGfJ+5ituTlodxNQYugzyBd9BvmizyBfzU1BNdvdCAd563BP2hoq3/niuZozozzmsMVinEG+6DPWaWyoU9vckI6e6FdDsEYtTQF5vaWXxkmfMVbOAag//elPpjQgkUFVVVWlyy+/XJ/73OeSPz9w4EDy3x09elRer1ehUCjra/no7u5RLFbaIdjm5qC6uk7Y3QyUEPoM8kWfQb7oM8gXfWayQ0d609ZQOdTdoypPac9fjUCfQb7oM9ar8kiz6qslxdXd3WN3c/JGnymM1+vJmOyTVw2o4eFhPf/88/rZz36mXbt2aXh4uKiG9fX16cSJ0S80Ho/rscceU0dHhyTptNNO08DAgHbt2iVJ2rp1qy644IIpXwMAAABKHTVUAADlJucMqH379ulzn/ucBgYG1NLSooMHD8rv92vLli1qa2ub8vdvvfVW/eIXv9CRI0f0j//4jwqFQtqyZYuuvvpqjYyMKBaLqa2tTTfddJMkyev16vbbb9dNN92kwcFBzZkzRxs3bpzyNQAAAKDUUUMFAFBuPPFEUacpfPzjH9fy5cv16U9/Wh7P6NrNu+++Wzt27NC9995raiPNwhI8uBF9BvmizyBf9Bnkiz6TgUeK9EUpzJ4GfQaTJM6XniGFgn7V11SOO1/oM8gXfaYw2Zbg5VUD6p577kkGnyTpE5/4hLZs2VJ8CwEAAACMN7aten2NL/nfJW+KIAFQEHaNBEpCzjWgZsyYoeeee27cz3bt2qUZM2YY3igAAAAAZWYsSLBu806tv/tZrdv0tPZ2HpdKb2MsOEykL5p218hIX9TmlgFIlXMG1LXXXqu1a9fq3HPP1ezZs3XgwAHt2LGD2ksAAAAAppQpSLBh7dKTWV5AAcI9Q2l3jQz3DtG3rEJ2I3KQcwDqvPPO009+8hP9/Oc/1+HDh/XOd75T11xzjRYuXGhm+wAAAACUAYIEMEti18jU/mXbrpFuDMSwBBI5yjkAJUkLFy7U2rVrzWoL
AAAAgDLlqCAByopjdo10aSCG7EbkKucAVDgc1r/9279p79696uvrG/faD37wA8MbBgAAAKB8OCZIgPITlzpap2nD2qW27hrp1kAM2Y3IVc4BqH/6p3/S0NCQPvCBD6impsbMNgEAAAAoNw4JEqBMOWDXSLcGYshuRK5yDkDt2bNHzzzzjKqq6EQAAAAACuCAIAFgFrcGYshuRK5yDkCdeuqpOnTokFpbW81sDwAAAIBy5MbizHAV1wZiyG5EjrIGoH784x8n///f/u3fas2aNbrooovU1NQ07t9dfPHF5rQOAAAAQOlzaXFmuIybAzFkNyIHWQNQ27ZtG/ffM2fO1NNPPz3uZx6PhwAUAAAAgIzcWpwZLkQgBsgoawDq3nvvzeuP/e53v9PixYuLahAAAAAAAzlg6ZtbizMDsIADxjjkJucaULn4zGc+o927dxv5JwEAAAAUyiFL39xanLkkcPOOUuaQMQ658Rr5x+JxvmEAAADAKTItfYv0RS1tR6I4s99XIUnjizPDPmM37xvu+53++PoxPfOHQ9p/pM/gu0TAPE4Z45AbQzOgPB6PkX8OAAAAQBFSl741haq1YnGr5JF6B0esLY7s5uLMDhbpi+q+x/dq5ZL5+tH2V5IZJFddcobePT/E9wPHS4xxqeObJPUMRFne60CGBqAAAAAAOEdi6Vsw4NOFSxcmgwzbfr3P+mUqFGd2nHDPkJYtmpvsF9JoBsldD7xIgXiUhFDQr5bG2klB1Hkz6jS7oZZxxmFIrgQAAACcwiNF+qPq7OpVZGA4+TS/UImlb+efPX9SkIFlKggF/fJ6lbFAPEqYwWOJU9XXVOqzF52eNojK+OY8OWVAxeNxvfnmm5o9e7YqKiqy/jsAAAAABTCjmO7Y0jd/VQW70GGS+ppKdSxooEB8uXFDYe6x4vk9/VENDccY30pEThlQHo9Hf//3fz9ljac9e/YY0igAAADAbUwrphuXmqZVJwuAJxBkgOLSvOZaXXXJGRSILyNlX5g7pXj+K28e1743jzO+lYica0B1dHTotddeU1tbm5ntAQAAAFwptWB4glFP8RNL8SZmRFAIHIpJ754fokB8GTFzLHGCRIBt1fI2/Wj7KwoGfLpsZfu4GlC2jG9jWVnhniGFgn7V11RyHk2QcwDqPe95jz7zmc/owx/+sGbNmjUuG+riiy82pXEAAACAWyQKhpuyFIpd6JANBeLLiqljiQMkA2ye0cDaYHhEj+18TauWt0ke6fR3NGn29GrLg09lv+zRADkHoHbv3q05c+boueeeG/dzj8dDAAoAAAAokulZSqUeZEjJLhiKe1TlVel9BsAC5Z7xmAiwSUoG2o6EB3T/r15RS2OtlvzVTB04NqCBwWE1Tau2JBMp07JHdpMcL+cA1L333mtmOwAAAAB3I0spM7ILgNyV+ViSCLDd9/jecUvvWhprdcUH3qWXXj2irROW45k9VpT7skej5ByAkqRjx47p17/+tY4cOaI1a9bo7bffVjwe16xZs8xqHwAAAOAepZ6lZBKyC4A8lfNYMhZgW3fFYvUMRHXrleeodyCqQLVPv/ufLj2041XLx4pyX/ZolJx2wZOk5557ThdccIEeeeQRbdq0SZL0xhtvaP369Wa1DQAAAACyZhcAcKGxANvs6bVqrKtSa1NAvf1RxeJxW8aKRFYWu0lml3MG1De+8Q39y7/8i8455xydffbZkqQzzjhDL730kmmNAwAAAACyCwBMJRT0y+vx2DNWlPmyR6PknAH11ltv6ZxzzpGk5A54Pp9PIyMj2X4NAAAAAIpCdgGAqdTXVKptdr1Wr2yX31ehplC1Vq88VddctkjyeCSPyQ0Yy8pqbQqMLvcj+DRJzhlQbW1t+s1vfqNly5Ylf7Zz5061t7eb0jAAAAAAkDQpu2BWY52qvHFu8ACcFJfaZgc1s6FG71rYoLe7+/TdB19i4wIHyTkA9eUvf1lXXnmlzj33XA0MDOjGG2/Uk08+qc2b
N5vZPgAAAAAYV1S5ublOXV0n7G4RAKeJS3X+SsVi8WTwSWLjAqfIOQC1aNEiPfzww3r44Yf1kY98RC0tLfrxj3/MDngAAAAAAMAxEhsXNIWqtWJxa3L5Xc9AlACUjXIOQEnSzJkztWbNGh07dkzTp09P1oICAAAAAABwglDQr5bGWq1cMl8/2v5KchnevBl1mt1QyzI8m+RchDwSiej666/X6aefrve+9706/fTTdf311yscDpvZPgAAALiFR4r0R9XZ1avIwLD5BWMBAGWpvqZSn73o9GTwSRpdhnfXAy8q0he1uXXulXMA6p//+Z81ODion/70p9q9e7d++tOfamhoSDfccIOZ7QMAAIAbeKS9nce1bvNOrb/7Wa3b9LT2dh4nCAUAyF9cisfjyeBTwmB0ROHeIZsahZwDUM8884xuv/12tbW1qaamRm1tbfrmN7+p5557zsz2AQAAwAUifVHdef8LkwrG8qQaAFCIUJ1ffl/FuJ/5fRUKBapsahFyDkCdcsopeuutt8b97MCBA1q4cKHhjQIAAIC7JArGpir5J9UsKSwcxw5AkeprKnXNpYuSQSi/r0LXXLpI9bUUIbdLzkXIzznnHH3qU5/SqlWrNGvWLB06dEgPP/ywVq1apR//+MfJf3fxxReb0lAAAACUr1Bw9El1ahCqpJ9Ujy0pTGR1JW58OlqnUfx2Khw7wPk8o5mr4Z4hhYJ+1ddUOu/8jEsdrdO0Ye1ShXuHFApUjQafnNZOF8k5ALVnzx61trZqz549yZ/NmzdPu3fv1u7duyVJHo+HABQAAE5UChNFuFriSfXEoEOp3iykLilMbAO+//AJzZheo8ZgVUl+JqtkWo65Ye1Stk8HnKCUgsRxqb7Gd3LscFr7XCbnANS999475b/53e9+V1RjAACACUppogj3KrMn1YklhU2hal24dGFyJ6aHduzj/JtCtuWYBKAA+xEknmCqh3w8BEzKuQZULj7zmc8Y+ecAAIABKO6MkjH2pLq1KTB6E1PsBN3GOkKJJYUrFrdO2gac8y+7xLFLVdLLMYEyU5Y1+wo11Q6u7PA6jqEBqHjcpWE8wI0oDgqUDCaKcCWv9Ic3wrZN+hNLCr1ecf7licLBU2AOBpsRJD5pqlxWT3sAACAASURBVId8PAQcL+cleLnweBj9AFdgOQ9QUsquuDMwFY+0v6tPdz3won1LRMaWFM6YXqOHduwr/fPPyiUkZbYc0zAeqWdgWH85eELfffAl5mCwzcSafS2NtfrsRaePBtY9HlctMZtqyTBLisczNAAFwB1Y9w2UlnIr7gxMJdIX1d7Xj9o/6Y9LjcGq0j//7HjwROHg8ca+g/2He/TQjledNQejvo37pASJewaiOnZiSN/4/vOuDIpO9ZCPh4DjEYACkDci+UCJIZsATuaR3jrco0NHeg27eQ33DCkWlzMm/WVw/vHgyX6J72DV+9ucNQdzc1a82wNvY0FiSbrl35537fgw1UM+HgKOZ2gAihpQgDsQyQdKENkEcCKTbl5DQb9+s+dNXbayPVkA3O+r0FWXnGHPpL/Ezz8ePNkv9Ttw0hzMtcFJNwfeJnD9+DDVQ4YyeAhhpJyLkB89elS9vb2SpJGRET344IN66KGHFIvFkv9mz549xrcQgONQHBQAYASzirPW11Tqigs6tP3ZN7RqeZtWr2zXDZ88W+9eEHLtpL8YFBy2X+I7eHJXpy5b2e6YOZhtm1zYXIjdUYWlbT4WtowPTivEP9UOrnGpvtanUKBK4Z4hRfod0Gab5JwBdeWVV+r//J//o7/6q7/St7/9bf3Xf/2XKisrtXfvXt1www1mthGA0xDJBwAYwLQn52PXqXVXLB5/nYpN/auYjCUkNpi4vKv25Hfw2M7X9OFz36F5M+s0p7HW1u/Blqx4B2QfOSbrxwHHwvLxwQGfOWdj53FPf1THeoaSG2M4us0m88RzXDd39tln67nnnpPH49Hy5cu1detW1dbW6kMf+pD+
+7//2+x2mqK7u0exWGl/483NQXV1nbC7GSgh9Bnkiz6DfNFnkKvIwLDWbXp60s1r2S/fyYdT6swk2uGQB09lPc5kusGeP02RXud8B1nbauKNdaQ/qnWbd+Y9bhjZZ6Ycuyw6bws9FoZJCbD4qyrVOxA1vW9a+ZmL6jMp58aq5W3a9tTk3VDL9Vrn9XrU2FiX9rWcM6C8Xq+i0ahee+01BYNBzZ49W7FYLLksDwAAAMgHmTVTcNKT/hKvY1VKJi7vCgZ82n+4R/6qCjVNq1Zrc2D0+DvhO7AhK37K7CMLgj9Zxy5Zd97amomVYXxK9k+TOCb7bArjzmOPSqLNVsg5ALV8+XJ94QtfUDgc1oUXXihJevXVVzVz5kzTGgcgDac8CQUAoFhjN6/f+eK5OtTd45ysDodwbYFnl0u9wW4KVevCpQvHFdN33NIdi4OTWZf9ZQvaGilL4C3Sb915a+fGQHaNT47bDCnDvdnEQJmj2myjnIuQf/3rX9e5556riy++WFdeeaUk6dixY7r66qtNaxyACcYuqus279T6u5/Vuk1Pa2/ncdcWsQMAlIG4NGdGXebirS5mW4Fn2Cq1qPOKxa3J4JNkc7Frh8i2GY6lxcEzFJ628ry1c2Mgu8YnR22GlOXeLPU8dtrmAXbKOQOqqqpKl112meLxuI4dO6bp06dryZIlZrYNwAQ8CQVKEFmLAArkuCf9sETq8i6W7qSRJfvI1KBIjtdzS89bOzYGGjsOPl+FPeOTgzZDynpvVutLnsdHwgPa/uwbuuGTZysej7s62zfnAFQkEtGtt96qn//854pGo/L5fLrgggv0la98RaFQKOvvbtiwQU888YTeeustPfLII2pvb5ckvfbaa/ryl7+scDisUCikDRs2aMGCBUW9BpSzUlnzDGCMk+q3ACg51MhyqZQb7N7BEW379eTixa4PQmZY9mda8CeP67nl562VSyBTjkMw4NPqle3aOmF5qCXjk0Nq0k11b5Y1UObSMTznXfA+//nPq6KiQl/4whc0e/ZsHThwQHfeeaei0ag2b96c9Xd37dqlOXPm6GMf+5i2bNmSDEB9/OMf10c+8hGtWrVK27Zt04MPPqj/+I//KOq1fLALHkqNEbsF0WeQL/pM4WzfncYm9Bnkiz6ThcN2n3MKy/uMHdmsHqlnYFh/OXhCP9r+P1q2aK68XqljQYPmNddKMZPfvxRlCRQ1NxXeZ/K+npfpeTvxODSFqnX+2fP1jrnT1FTvL5vPmTDVOMNOrukZsgveM888o6efflrV1dWSpLa2Nn3zm9/UsmXLpvzds846a9LPuru79cc//lH33HOPJOlDH/qQbrnlFh09elTxeLyg1xoaGnL9OEBJ4kkoUFrIWgRQNIc86XellC3mj/UM6a4HXrQumzUlkNI6q04X/a936v9ue5ls2qmYtDwr7+t5mZ63E4/DkfCAtm7/H61fs8SVNfy4N8tfzgGoU045RW+99Zba2tqSPztw4IAWLlxY0BsfPHhQM2fOVEXFaCGuiooKzZgxQwcPHlQ8Hi/oNQJQKHsOWvMMYGrUbwGAEjKWcdQ7NKJ4PK7DR/v13Qdf0qrlbdr21L70dV5MepiQWlvmb9pnJoNPVr1/STMh+MP1fBTHYQLuzfKWcwDqnHPO0ac+9SmtWrVKs2bN0qFDh/Twww9r1apV+vGPf5z8dxdffLEpDTVDprSwUtPcHLS7CbBYc7G/T59BnugzhWmMxXXtR8/Ut3+4O/lk7NqPnqmFc6fL6y3v7SvpM8gXfQb5MrLPxGJx7dr7tg4f7dXA0IgGozE9tOPV0RvtDIXA+6Ijams15wH4oVe7Tr6nDe9frgrtM26+nqdy43HIpc8Ue2/mJjkHoPbs2aPW1lbt2bMn+bN58+Zp9+7d2r17tyTJ4/HkHIBqaWnR22+/rZGREVVUVGhkZESHDx9WS0uL4vF4Qa/lixpQcCP6DPJFnylO+5zgpCdj3d09djfLVPQZ5Is+g3wZ
3Wci/VH9eX9YkvTQjle16v1tGoyOqClUrQWzgmmzPmp9Fcb327Flf/G4xr2nZe9fxortM268nqfjpuPAtakwhtSAuvfeew1rkCQ1Njaqo6NDP/vZz7Rq1Sr97Gc/U0dHR3IZXaGvAQDgKGVaBwIAykm4Z0ixsb2ZEoGelsZarVwyX/f+fK8uW9muH5m921eGHcae3NU5abex6y4/U5LU2dVrXVF0t0u9nttRlN4pmNegCFl3wYvH4/J4RlPpYrHM2yx4vd6sb3LrrbfqF7/4hY4cOaLp06crFArp0Ucf1b59+/TlL39ZkUhE9fX12rBhg0455RRJKvi1fJABBTeizyBf9Bnkiz6DfG/O6DNIyrHvGJ4BNTCsZ/7wtqTRDKhgwKc1/89fJ5caNYWq9cGlCzWzoVa+Sq+aQzWGBx2y7TA2I1StWCyucO+QGoJ+7T/cm3anN4IBmRnWZ7LstMfxLy9cmwqTLQMqawDqzDPPTC6ve9e73pUMRiUkAlR79+41sLnWIQAFN6LPIF/0GeSLPuNyXukPr4fz2jGMPgNJyRv7+x7fq2WL5srrlToWNGhec6004Vm44X3GI+07cEJHjvdrYGhEW7e/okvOe6fue/xPkkaDQRcuXTgpC8rIoENnV6/W3/3spJ+vX7NErU2B5H9PDFRJbP2eC6P6DMffPcwYZ9yQOVfwErxHH300+f9/9atfGdsqAAAAlBePtL+rLxl8ktixC7mL9EV13+N7tXLJ/HGBnqsuOUPvnh8y90YtLrXNDmpmQ416h0b01U+9Rx6drL20YnGrfrT9FQUDPq1a3Dba1w/3aN6MgOr8OVc1ySrXHcbCPUNpi5KHe4c4xyzA8UdByJyTJGVdO5co7D0yMqIvf/nLam5u1pw5cyb9DwAAICee0afHnV29igwMS+W5aY5rRfqi2vv60Yw3Z0A24Z4hLVs0Nxl8kkb7zl0PvKhIX3T0H42NIb9/tcv4MSQu1fkrNTPo15zpNZrdUKNrLl0kv69C8kjBgE8XLl2obU/t0/2/fEUP7XhVfzl4wrA21NdUnnw/aXytqRSJQFWqdIEqmIPjX6Jsnn9E+qLJ4JN08uFMcmxziZzC9RUVFXrzzTez1oECAADIiqd/ZW+0kHP6Hbu4OcNUQkG/vF5lzi6p9Vk7hsSljtZp2rB2qXoHR+T1eCYFx7774EvGZfelvF/qDmMTP1siUDXxOBheFN0qpbAsKaWNDdOqnXP8S+HYOYFZ8488jj+Zc6Nyzhf9/Oc/r/Xr1+vqq6/WrFmzxtWDmqoIOQAAQKanfyzNKh+hoF+/2fPmpB3DrrrkjNK9OYZl6msq1bGgIWMA05YxJLHjV61P4Zl15t9A5rLDWI6BqpJQCg8m0rTxusvPtO74Twh+xUZiyf+//+0eZx87hzBl7MjUd+dPU6R3clAq1yW25S7nANRXv/pVSdK2bduSPyv1IuQAAMA6PP0rf/U1lbrigg7d9/herVrelrWINDDO2E12VaVHn7/kDG2aUMS+vtanzsO99o0hcWlOU8A5N5C5BKpKQCk8mEjXxjv+c7c2rF16sji8icGnRJAjGPDpg0sXautYcH/1ylP10I5XrTt2JZxtZcb8I12/uO/xvbr8796VdhOOsstcLFDOAajrrrtOH/jAB8b9LB6P6xe/+IXhjQIAAOWHp38uMJaZse6KxeMzAwg+IZsJmQQtjbW64ZNnKx6Pj8susXsM4QZShgchHPVgIsNns7yNKe0I1PiS/W3V4rZk8EmSYvG4de0qhUy1LMwYO9L1i2WL5mbdhKNsMheLkHMAavPmzVqzZs2kn2/ZskWf+tSnDG0UAAAoP9y8uUSZZGbAOhMzCQ529+kb33/+ZCbHWB+yfQwpp6VvhTAhCJFTYMCKzJsMy+yaQ9Xy+SqsC3xOaMfqle0n39czuT6aVe0qhUy1bMwYO9L13aw17MbGMrdfH6cMQP32t7+VNLoT3jPPPKN4/ORRevPNNxUIBMxrHQAA
KB9uv3kDkFbOGSYpY0hfdES1vgrrxxAX30CaEYSYMjBgUebNxM8WDPj0ZleP7vjP3QoGfFq9sj2ZfWRm4HNiOyZu6pD6/5/c1WlZuxyTqVZoMNKE+Ue6vputhh1GTRmA+spXviJJGhoa0g033JD8ucfjUXNzc7I2FAAAcCCn1Wxw8c0bgPTyWh4zNoa0tTaoq+sEY4iFTAlCTBEYsCrzZuJnW7G4NRnYGQyP6NGdr+nD575D75g7TTNC1YrF4uo83Gv4dXViO57c1Znc1GFiwOlEb1Rzm+sseahj9/JXScUHI42ef6TruwEfmd5TmDIA9eSTT0qSvvSlL+n22283vUEAAMAgJV6zAYA72L60zmmBeocyLQiRJTBgVebNpM82YbnbkfCAtm7/H9165d9q/+Fe066rE9txJDyg7c++oVuvPEe9A1E1BP36m3c2TQo4Gf5QZ+I5UVvEOWrQ+eXIZYAT+26MTO+p5FwDiuATAAAFsunmxpGTNQDOZ/WYZefyXAL1ObMjUGhV5s3Ez+b1eNK+r99XqTvvf96062q6Y3zFBR1qDFapsa4q5d+ZmEWc6ZyYX8A5auD55ZhlgFMh0zurnANQAACgADbe3JTMZM1qZDsAmdk1Ztl000agPg9WBwo9UtexfmvqHE34bA1Bv+bNqJt0HvQORM29rjqgVuJU50Q+56iR55cjlgGiaASgAAAwkZ03N0zW0iDbAcjKbQEZAvV5sjBQGOmLJouAr1reJnkkr8ejeTMC5rzvhM+WLhAU6R82/7pqcwaNkeeEkX/L9qW6MAQBKAAATGTKzU2OGTxM1iZz2801kC9bAjI2ZiU6KlBPduY4ib44GB7R/b96Jfnzd80Pqc5vwW1smkBQ3tdVs75TE/tK1nMiz/c19PxyQHYYikcACgAAExl+c5NPBg+TtUnIdgCyszwgY3NWomMC9eWQnWlwUGRiX2wKVevv33eKKiq8+svbPWqaVm19kC6f66pZ36nJfSXjORHwae8b+b2v4ecX9ZVKnicej7v2a+vu7lEsVtofv7k5OLoFLZAjQ/qMk57QOaktZYpxpkgGTxQj/VGt27xz0s2hkzJ4nNxnIgPDWrfpaUcfPzdycp9xHYsDIYWOaYb2mcRcwsZAfSmM7VmZ0W9S/mYw4NNHzn2HBoZGJtWDyvU9rB5nzPpOLekrac6JSF+B7+uA86tQXJsK4/V61NhYl/Y1MqAA5MdJT+ic1BYgE4OzkMjgKY5jsh0Ap7I4c9IRY5oDsioccRyKYMry5pS+2DM4opf3deuhHa+WzBJqs75TS/pKmnOi4Pd1wPkF5/Da3QAApSXTBCPSF3V1W4CsxiZfrU2B0QmYAUsSUuW8PMYz+uS0s6tXkYFhyVN4O0pWyg3N+jVLtGHtUoLWTkRftZeBY9ZUihrTykipH4dswYmijPXF4eiIYvG4Oe+RTRFjkVnfqV19pdT7KJyBABSAvJg2wSjxtgBWSWTwJCaB4zJ4shnLGFy3eafW3/2s1m16Wns7j7vzxt7Cm2sUgL5qHY/UMzisA+EB/eXtHluCfQWPaWWm1I+D2cGJUNAvr8djbQCkyLHIrO+0vrZSV11yhuV9pdT7KJyBGlDUgILLFNtnnFQ/xUltKWeMMw7jkXoGhhXpH9bA0LCagv6clsdYWV+EPoN8pfaZkq+FUyo80r4DJ/RmV0/BNXWMbEu+NWLKcpwp4Vo5ppdFMKC/5ttnDBmLjP5Ox47zfY/v1bJFc+X1Sh0LGjSvuVaKFfF383j/ku2jBShqnHFxnVpqQAHlyKZBzUn1U5zUFsASGSb49a3TpvzVUq8vYhgXTwhLBX3VGpG+qPYdiDijpg41YkaV8nEwsnZYhnG6bXZQMxtq9M7W6RoYHFZTfW4PYAplyFhk8HeaWn7i/l+9IsniAH0p91ErUac2IwJQQCmyc1BLLQg5EJXfV6negagi/cPO3goXKAPFFHm1fGt1J2JCWBLoq9YI9wxlramTcUwhiItM
jAhOTDFO1/krVedPuYW1oD6Zk8YiAvSlwZSi/GWCGlBAKRkrhHjg2IC9xbfjUn2tT8d7ovrq936r9f/Xxhod1HKBixRT94zaDWxcUCroq9YoqKYO9bmcp5iC/bn8rsUbAjhpnHbiWEQh8NJAndrMyIACSkXKE6FV728z9ulHAU8zTYns81QVyKqop7FkDPLkuFTQVy1RX1Opttn1Wr2yXVu3v6JgwKfzz56veTPrJI9nNNAw4ZjzVN9hisnqzOV3bcgaNWycTjenzJcDxyLKT5QGJ2bPOQUBKMAueQZbJk76DBvUCpxcGH4jx9IYYEpFTzzLtXZDjuMpE8ISUq59dSpWPohJqanzroUNeru7T9998KWs12CCuM5STEAwl9+1I+BoyDidYU7Z2JC+KHJWThuLHBgUy4nLHjITKMyMABRgh5QLY+oTxzlNgYwDcuqk78ldnbpsZbt+NGEXkEIGtUInF0bfyPFUFchBqU48JfMmn3kErydOCFsaa/XZi04fTYn3eMp+QgyHm6ovm3EOjdXUicXiyeCTlPkaTBDXWYoJCObyu3YEHI24cc80p2ybG1JVOSwXdVpQbCpufMhcyvM1kxGAAmyQuDAGAz5duHThpEBSugE5ddJ3JDygx3a+pg+f+w69Y+60onYBKXRyYXRkn6eqQI5KbeIpmTr5zCt4PWEThWMnhvSN7z/vngkxHC1rX671mXoDl+s1mKf6zlJMQDCX37Ul4GjAjXum/nz0RL9m1Vcb3GBMxbUPmUtxvmYBipADNkhcGFcsbk0Gn6TshRYnFkI80RvVvBl1OmVWXVHFtwsuZpgyQVi/Zok2rF1a1ESYoopA+TKtqKxHOhIZzK/Q59iEsK7ap7seeNH8QrcWF/BFCZnQN8K9mYNAZhdmzvkabPC1v6Q54Nwupkh2Lr+b+m+aQtVavfJUXXPZopM1wsxS5AYzmfpzQ7DGwEZazAH9rVAU5EYqMqAAGyQvjB7lnvVjUipnUU8zDYzs81QVKF+mZDiOZVXtP9xT0BN6S7Iu3bjsALlJ0zdu+OTZGfuy2f01r2swT/Wdc24XMzfM5XfH/s3Gq96rvxw8MWWNMKfI1J9bmgLq7u6xu3n5c0p/K5Cjlu4atZTZZTWtjEQACrBB4sKY942TGZM+p6xRdko7ABjOjMln6lLmQmriWTEhdu2yA0wpXd/Y8pOXdNUlZyQz88b1ZY/H3P6aEmiI9A9rYHBYTdNMXqpUwjdwjjq3i5kb5vK7ceVcI8wxMswpvd4SShtK4aj+VgDHPGQ2KpBX4gFBuxGAAuwwdmGcNyOgWY21k54oWT4gO+FpZglPRAFkZ8bkM5ERMhge0WM7X9Oq5W2SRzr9HU2aPb16yr9rxYSY2nbIJF3fONjdp+nBqrQPYqy6gdt/uNeam6oSv4Fz27ldkp/XCXNbg5Tk8U/lkIfMRgXySj0gaDcCUIBdxnaeOX3hdNsHZNuV+EQUwBRMmHxO3Jjh/l+9Ir+vQkvfPdO45SdFctSyA4lAv4Nk6ht11b70N80W9Fcrb6pK/QbOcee2ydz2eZ2mLI6/AwKCRgXySj4gaDOKkAN2K7LQYjkwu7gqYJgSLgJqO4PHumKK75rVpqna2NJYqxs+ebbCvUPqGRqxti95pT+8Eda6zTt114Mv6pk/vK29+4/Tj21SUP81ub+aXig4ZfzMe/MAMxUwrhsy/pQQt31ep+H4G8OoDY/YOKk4ZEABsB1PElASyNRzFoek9GeV0saegaiOnRjSN77/vIIBnz64dKG2TqhbZVpf8kj7u/p01wMvKhjw6cKlCyfVzKIfW8yB/dfULIsJ4+fqlacW914e6a3DPTp0pLfoQsIFjesO/P4MNyFjsmN+mX9eJ3NDf7OAUUuZDfk7Ls5I9sTjcZd81Mm6u3sUi5X2x29uDqqr64TdzUAJmdRnHDAARgaGtW7T05MmoqWSil/uGGdGRfqjWrd5Z3n0U5PPe/rMZKn959Lz2rXtqX2W9aVI
f1TP/PFtbd3+iuXvnSvX9xkzz8mJf7u2UpHeNO9lYpB94vjZFKouPAhrYDvLalw3Upk+cHH9OIOT42GOgbyMfSbPvzPxd8vx/Erl9XrU2FiX9jUyoAA3c8gA6JjdMYAsyiZTzyHnvat4pLe6+072H48s7UvhniHF4qM31la/N3Jg5jk54W+3NNbq0vPbteUnv0/7XmZlWUwcP4+EB/Toztf0tU8v0fDwSF7vZWT9qLIZ1w028RgHA77RnZurKtQ0rdpV2RooM0bVoiri75R6DbxiUQMKcDHH1F5KmfSuX7NEG9Yu5WYYjlMua/4dc967SKQvqv1v94zrP1b2pVDQr9/seVOXrWyX1+Mpi35cTsw8Jyf+7WWL5iaDT2nfy6Q6U+nGzxO9UdX5K/J+LyNrVZXLuG6YsXpYh471j8tWu3DpQj2041Xdes9zWrfpae3tPG5N7TjqLqIMmV5vz+EIQAEu5qgBkGLscDKP1HWsX6tXtpd8EVBHnfcuEe4Z0i+ff0OXrWxXS2OtaqoqtGbVuy3rS/U1lbrigg5tf/YN+X1eS98bGVhUkHvS+Z4lA85MRhZRNjJoRHHnFGPZcus279RfDkSSx2TF4tZkzTjJwocWKe1Zf/ez1ga+4ExlEpB0e+CbJXiAi5XFtq6ABSJ9Ud3xn7sVDPi0anmb5JG8Ho/mzQhkD5Y6oMbaRJz3Fhr7/n2+Cp3ojeq3vz+gD5/7Dt398B8UDPh0+d+dqgUt9YoOx9Q0rdq8doxlma67YrHCvUNqCPq18ar3KtI/rIHBYXPfG5NNWBb3iQ92mHZOZjrfLT//My3v0+gNZT5jpKHL9inunJSaLffkrk5dtrJdP9r+im3Ldt2+TAkTlFH5ALeXHiEABbhYSQ+ADryxR/lKZBEMhkd0/69eSf78XfNDqvNnuJQ6dLJU0ud9sawcN1K+/2DAp9Ur2zUYjenuh/+gweiIgvJpJBbXN77/vDX9I7VexYS2nX/2fM2bWac5TQHGUguk3lg3harl91Vo9cr2SQW5jTgnJ57vv9nzpj570V9PqgFlyfk/sWaKituB7jtfPFeHunuKDxoZVROmxKVmyx0JD+ixna9p1fI2nXZKg7bZELSkPleZKvA6XFYBSZcHvglAAW5WqgOgQ2/sLeHGwJsDPnMhWUOOnSyV6nlfLIvHjdTvfzA8okd3vqbL//e7kv0h07IWK/pHom3BgE8XLl2YbIerxlIbpd5Yr1jcqn9/dG/+2ZW5Sne+B3yOOP+LGiPj0pwZdaryxJP/jeJMvM4dCQ9o21P79L7TZ9ny0IJs3TJUxHXY0ICkA+aVbg58E4AC3K4EB0DH3tibzY2BN4d85kKyhtJNloIBn3oGR+wPIJbgeV8sq8eNdLt+HTrad/KGKmVZS1OoWisWt0oeqXdwxPQbu0TbVi1usy0I5mbjbqzH+kFe2ZX5mni+x5xx/pPhMkExN8UG3FBnus7VVVfa8tDC1dm6ZaqY67BhAUmHzCvdjAAUgJLj1kmrGwNvjvnMBWQNTZwsNYWq9cGlC3XL3c8y6bGB1eNGusnyb/a8qasuOUN3PfCipNHJ88QspG2/3md6v0gWQLWptovbpd5YSzbVZHIAMlxSFHNTbNQN9RTXOcuDlm7N1i1jxVyHjQpIOmZe6WLsggeg5Lh19wg37l7mqM+c506NE3dXOv/s+ckaL5KFOwlBkvXjRrrdta64oEPvXhDShrVLdXpbg6665Aydf/Z8y3eYSrTN6/G4ciy1TWIHp8O9mjezblw/sHwXNgfsJsUOdCdluinOZRwo5ncncdqOxE5rD4pS1HU4JSC5fs0SbVi7tKAHNY6aV7oUGVAASo5b07Ld+LS4pD/zhKe3wyNxsk1sZPm4kenpfcryp9kNtarxV1rfL8baNm9GQLMaa/Wj7f+jZYvmyuuVOhY0qD4w2k4YKEuWyuyGWmuzPIzImDGihgoZLknFZIa4NSscpafo67AB5QNKel5ZJghAASg9Lp20
TrxwtzTW6rMXnT761MbjKcuC5CUfbEyZLEUGhpn02MmOcWOqyXJcappWbU+/iEt1/kqdfsp0VfzdwYjQ+AAAIABJREFUu3TXAy+6a2moxUVop1r2YeXypqKXoBhZQ8WF9ejSKeam2HE31E4o8AxncsD8veTnlWXAE4/HXXuou7t7FIuV9sdvbg6qq+uE3c1wlxK/sNJnStxY/+sZiOrYiSFLbhpt7zOJc67Ug40uKnxpe58pJVb0iyzXrUh/VOs275x082p1PQxL+4xV52LKcR+OxXXrPc9N+ifr1yxRa1PAwDedWmdXr9bf/WzBbXFln8nEqDmhE2pAGcFJbUnDEX0G9stjXkmfKYzX61FjY13a18iAAvLhtAtriQfDUICxp8WSdMu/Pe+OIorl8oTcAU/+4EBm94sprluJ5TupO/FJUs9A1JyxxAHXLUuK0E447qtXnuqYLJViM2ZY8jXG4EywgscBB11bKPCcgQPGPaQol3lliSIABeTBURdWJwXDuLBayyO91d1n2xbuKAKTHqRjYr+Y6roVCvrV0lirlUtOFkP3+yo0b0adZjfUGp4RlPG6ZaRs1ySPdCQyaHoAZeJx/+Xzb2j1yvbkRgR2LvsodgmK45Z82cTwOWEx44BDri0EJ9Nw0nwdcAACUEAenHRhdUwwjAurdRLL7wZHtP/tHtu2cB/Xlv6o/FWV6h2IqqG+WrGRGIFINyH4nJ6DjstU1636mkp99qLT9Y3vj8+ovOuBFw2/nmS7bjUb9SZTBLn2dh7X/sM9pgdQJh73I+EBPbrzNX3t00s0PDxSeJaKA4p/U0NllJPmhE5BcHIyx8zXncRB10hYzxEBqBUrVqiqqkp+v1+SdN1112nZsmV64YUXdOONN2pwcFBz5szRxo0b1djYKElZXwPM4qQLq1MmPlxYLZJyU7Xq/W16clenLlvZrqFoLO0W7qYe/7G23Pf43mTWRDDg0weXLpz0dJ9AZBmzKpulWFNNdI2eCDssKD/ldSsuxePW7NBoxfbX2a5JknTn/S8oGPDpspXt4zK+jA6gpDvuJ3qjqvNXqD5UPfqDAoJPjij+7aAlX3Zy0pzQKQhOTuaU+bpjOOwaCet57W5Awp133qlt27Zp27ZtWrZsmWKxmK6//nrdeOONeuKJJ3TWWWfpjjvukKSsrwFmSlxY/b4KSRp/YbVYYuKTyo6JjxU3FJh8U3WiN6rHdr6m5lCN5cc/0ZZli+Ymg0//+KF3a+vY/7/0vHaten+b9h/uUc/AsGntgL0y3ehH+qI2tyzF2ER33eadWn/3s1q36Wnt7TyerHM05esFcNpxyeW6Faqz5npixXUr2zUp8dqR8IAe2/maVi1v06Xnt+trn15i+M1P1uPuGS3k3dnVq8jAcM79zVF9ayyA1doUGL2JduGNo5PmhI6REpxcv2aJNqxd6vrAglPm607hqHEMtnBEBlQ6L7/8svx+v8466yxJ0urVq3Xeeefptttuy/oaYCoHPfVzylMmngBaI/WmKpH99KPtr+jtY32WH/9kWzxKLgHc//aJScsB/b4KzWqs1ekLp7t68lmuSiH4nJjoBgM+rVrcJnmk/Yd7NG9GQHX+SlMyOB31tHssu2tawKdbrzxHvQPRtNctq64n6d7nusvPlCT9/tUu1fori85Ay3pN8niSrx0JD+jJ33Xq/LPna2BwWJH+YWOXgWSaL6jwp/+O6ltw1JzQURxSj8opnDJfdwrGMTgmAHXdddcpHo9r8eLF+uIXv6iDBw9q9uzZydcbGhoUi8UUDoezvhYKhexoPtzEKRdWh0x8HHFhdcFa8tSbqsTT+w+f+w51LAhp3owz9J9P/EnLFs2V1yt1LGhQfcAnxcxtiySdf/boErxV729L/v/Um/nvPvgSyzHLVCkEn8M9Q1kDo2ZMhB1zXDIsc2htDkweH626noy9z8ar3qtI/7BGRkYU6Y1q3ead6YMxBYztWa9JUvK1dMuGr7v8TDWHqo27lqSZ
L0T6Cw96OqZv4SSnzAmN5oJ5lWUcMl93CsaxFC49zxwRgPrBD36glpYWDQ0N6etf/7puvvlmrVy50vT3bWysM/09rNDcHLS7CTBYLBbXwSO9OhrpV0N9jVqaAvJ60+foF1K41eg+09hQp7a5IR090a+GYPb2Gi0Wi+u3vz+ob/9wd/Im4tqPnqlz/rrFsjZYoTEW17UfPTP5OU/0RrWgpV6L2mcqFovL4/HoX1NuuIw+Bql9JtGWf3/0D7r0/FM1GB3Rk7s69bG/69BgdPJ27oOxGONUGZrYJxP9buHc6ZKccW0ainsyBka/88VzNaupLu1EeFZjnZqbC5sjZDsuVo5Jbx3uSRvo+M4Xz9WcGek/W7NOXn8Ohae+/hQiFovrlbExe9XyNm17al/aNrY0BQoe2ydek2Y21Orto306GunXO+aF9O1r36/jPYNa//89k3zvYMCnN7t6dMd/mnstOfRqV9qgZ190RG2tDdk/l0P6lpMUM87kM9dyk3KfV9l1bTJso4US56RxLNcxwIw+U+7nWTaOCEC1tLRIkqqqqnT55Zfrc5/7nD7+8Y/rwIEDyX9z9OhReb1ehUIhtbS0ZHwtH93dPYrFSjvM2NwcVFfXCbubASOZXJzPrD5T5ZFm1VdLiqu7u8fwv59JpD+aHLyl0Yn8t3+4W7Oml1/WTfuc4KQnaN3dPYr0R5PBp0Tw5/WDx9VY71djsKrofpOuz7TPCWrdFYvVN9ZHj4QHdOxEf8bt3JvrTrjiqY7bZOqTjrg2eaShoWHNnVGX9ob/UHePWpsDabNlqrzxotqf6bhY6dCR3oyfu8qT4WS0oDjsuDHbo4xtHBgYKmpsT16TPHHtfOnApM8UqKkc994rFrcms6EKeb9c1for0wY9a30VOfU5J/Qtp2huDqrryInCMggohJxROc+rHHFtgjPGsRzHALP6TDmfZ5Lk9XoyJvvYXoS8r69PJ06MfqnxeFyPPfaYOjo6dNppp2lgYEC7du2SJG3dulUXXHCBJGV9DSh1FOfLTynUoTFMhqKviWPQFKrWhUsXattT+7R1+yv66vd+W3RB5anaMmtadbII62M7X9cnPvjuSdkmdz3wYvn33wKLCpvRjp7BYR0ID+gvb/eY3xanFiIem1hef9fTOnysN3MBWLMK5jrguBRS+NbU68/YOXLoWP+k4Eu6Nho1tmf6TIFq3/j3zhAMM/paUnThagf0LaeIxeIFbyLAXCszV82rYA8HjGNFjQEGzPncfJ7ZngHV3d2tq6++WiMjI4rFYmpra9NNN90kr9er22+/XTfddJMGBwc1Z84cbdy4UZKyvgaUOorz5Ye15CePwYrFrZOCP8UWVJ7ShNoGIzFrtnN3FKc8SfdI+w6c0JtdPclMjpbGWn32otMVj8cVqnNPfYHUieVjO19PFu1PW6euTGu4FFKfr+jrT6Z6FinnyKr3tyXH7NQNFSa1MaVgeEIhY3umzzQYHR53fLwGvd+UqAdjmINHeguup8VcKzPmVXCDgscAg+Z8bj7PbA9AzZs3Tz/96U/TvnbmmWfqkUceyfs1wBQWFYpz84BUCEcUQbdClv6XOAb7D5+wZkKdpi2JG/jIwLDr+q8ZO6kV2o59ByJ6aMeryYy4lUvm6xvff96wAs+lInVimSjav2p5m06ZU69Z02vKb3xIp4BAR1HXnyyT8tRzJDXodCQ8oO3PvqEbPnm2Kiu9ClRVJNto1Nie6TPVVfs0u6E2eXwagn7Nm1FnzbWkTIOeVjsa6S/4mmf4XKuMxtOymldN+F4aS7z0CoxT6Bhg1JyvrM6zPNkegAJSLw4N06qleFyR/mENDA6raVq1My7iEybWZmYVuHlAKogbniZP9bRl7BjMmF6jh3bsMzf4M0Vb3Nh/HfEk3SMdiQwqFj+ZgZY1I67W54ysLZNMnFgeCQ9o21P7Tk4QzfyMhd6ImnEDm2ego5jzN9ukPJeAYHPTWJ2NxPsYNLZP9ZkSS9+ORgY1b2Zd
eV9LykxDfU3BQSRDr1VOyYI1SrnMq9J8L9d+9Ey1zwmW3meB4QodAwyb85XLeVYATzwed8HHTI8i5A6QcnFonVWnv3/fKeo+PjBuW2QnXMQj/Se3iU7U2Zm4ZMDQNiZuREwYkEq+z7hQav9L8PsqJj9tMWkSnNpncmqLif3XiSIDw1q36empvx+zjH3v+w+PFvBMZEBdccG7dN/jf5r0z9evWaJQoCq3PlUg28cZu24IC31fJ93AFnj+dnb1av3dz076+fo1SxSq8095jpjaZzJ9Jicdd+StsbFO//3CW4V/fwZdq3K+RsNSfC9jyig7z3A5jAETr022z/lKhKOLkMPdEk9MgwGfPvK/2nXgSN+kXWicUBQyNdqdKavA0DY6oDgfnCPnQoVjT1M2XvVe3frZpbr+isWa0xywvi2J/jv23p2HbS7MbbKiiwoXKTGO/vL5N1Qf8Gn1yna1NNZq7syg6QWenWzejIC+9ukl+uqn3mNccfEpFFrU1FEFkQu8/mQrem73OZLpMznquCNvXq+nuE0EDJpruWE8LUV5fS9O2UjEaGNB9kIK9btCAWOA7dezMsASPNgaGU9cHFYtbtPrB48rFs+8C42dUeVxyzmy7JRD5BtmyHed+v7DvaY90c+5LW7KLLA5jToxjg6GR/STHa/qg0sX6rMXna4tP3lpUoHnqy45w9ACzwUz87qToe/Vt04z6A0yKzQ13xHLOIuUcTlDwKdIb1TTAj7deuU56h2IOiYzshyOu+s5oJ4WtTudifmKc2pUlhUXL50zChlQbjc26G6473f64+vH9MwfDmn/kT7LekbyialHisWV3IVGkppC1br0vHatXtmuQLXP1mh9umh3KlsnGmY+tZn4t702PSEq1ydTOcrnaYvZT/RzbYvrMgtszFpMzTw5Eh7Qvz+2V3tfP6qD3X3JWjuXnt+uVcvbND1YNa5Wly1P8Ex+Imtn38uWBWTG7zlKyqQ8mY0yf5r2vjH6XX/1/31GX/3eb9XbP+yYyXpZHHfY7v9v79zj26qufP87kmXJetiy/E78ShwMIQRMXAgkTQCXkAx5GEqBkAtDZ6AzLaW3hUsHprRDSkJpSqeUV0s/t7QUKIRCCPQWmiYlCXSgTfOAhIDzbB5O4vgZRZZlybJ07h/WOT6SJflIOo8ta33/giS2lvZee+21115rbcqIYJN483LPLbPY9lcU9ncpO08lqFIlIygDKsfx+IJ4aUMbFsyuG3NLPqPOqfqCGn29y4v3drVj6bypWL6gERu3HcOC2XXYtO0Y5jVVY+f+Lkyvd6GmzAqE1ZUpEUI5RygcQk35RXj6td36N1lW89YmTuP1m65uxLNvfKJ7T5X7VsxCmdOSO/XsKdy2qH6jL1MWyizQjniZJ9PrXTCbjOhx+/G7dw8AGHG+51xQMfJDOt7gqX0jq6fupdvUdMI074/JRvEMsH37PmHGndAXyojIDLUyYuPMy5TqYvT2eqP+GTP+igo+PWXnyYB6ZGkOBaByHLd3CPOaqsf0NHr6td2aOYg15Ta4nBZUlljx6qb9uGZ2nVg+oldgLIoEARAWHA01D3LS/lytzQ2oq3LgyVe1P0jEfkeHzYQT3V78+OVdEy5VOikyyww0cTZkyEJOj4bEO/zYTOMfrHUqXVHb2ddV99I9iE7QAywzB7tETNBxJ3SAgVLArETt8reYeTEYxqYUseKvqOHTT6gguxqBoglcfskyVIKX4zgdZhgMiXsaqUpk0X/76Q/wn898gFc37cdXv3ghpkwqRHA4nDAwpnVKbLwN4ccv7wIA7VIvE6Tkqpla6/YOwWEz4do5U/DW+4dx7HS/LnoS+x1bmmuZbFTPCqyUArAiR84Qmw4ejlMOxYhDpXbZk+66l25q/gRM6WeuxC3eXjoBx53IMnK4zQAL5W+67xkRVPHp45VGy/EFWNNJlUr3WdC/XIQyoHIZDjBwwLl1xWLkv9RpQUtzLQwGjPZdUskZi130Hb0+/OD5
7Vhz1xyA43BIEhgrdVqweM4UlLus6HT7AY7TLEVS9xvcJNF5NW9tnA4zrr6kLioImPSzVEphHfMdqQl8ctS80U9ljimzQBuSzQmjN/Kq38iS7jEDU7fvdNNNsIiSepmFpUS6+9gAM3uGaj691BeQoyMM2kq1Kj6Y0L8chAJQuYrEuNRW2nFn6wV4Y8tBseTNYTMB4FBTYcfkUpsqm1iyRV9bZhN7mDhsJnzxymkIDIXEEjAtjaHmqbmSzcFVZBn53omMrlVGmU2aFBbkoabCLn7u5h3Hx7yoJb0hUnSzihmD/31TE17a0IZ5TdWor3IwkSrNNGoEHtJxSBgNgOiOUocEBp1EWWjh7JPusXEYZeRgB9BrUASbKKaXWbofsFL+xsKeoXrAXqaOaGYrU9ij1AoUMaN/OYZx5cqVK/UWQi8GB4fAM2yU5WCzmeHzpZ6a6RkM4pHntyMQDKHPE8CRU2fx5cUz8NzvPxXLrl7ffBDvf3wSW3aeQEN1McqcFkVl54wGbNl5AqHw6CSYTUYsnVsPc54RRXYTpkx2osxpRSAYxuubD4oGIhTmsXNfF+Y3TR6T3q80ZpMBDdXF2LmvC6EwLxrs6lKrch8SSXXt7Q/gePcAfvCbHdh9qBv2gny0d3mx+2BP1D8PhXlcNrMKRdZ8lDktmN80GZfNrMLSufUjciXRa9k6wwEGowHvf3QSoTAPn38YJ7u9WDx3Kq67ogGtn58ifpZUnwT50p6fyAb5yPPbsftQN0x5eZhUakVjnQsv/rENHT1e3PiFRuw52KPefBBR2GxmnO4ZUG6OcxmJfr+7oz0j+6roulOYMXYmYuM63X5wRgPMJsOInbfm6y6rSBwZs5ZM9UzhsTCbxp/rpHuTAvJ0uv14d0d71J9J99KksKgbLMqkMaLOZPFYZKSXEpjcD2TMiyY+toR0z01akapPnwpydUQpnUxKintU0jNjBvotR/9Y1xlW4TgO1gT6QhlQOUpsJLnH7cfR0x4EgiG0NjeM6b2kRuR73Eh/GJhR50SBOQ8H2t36pUiqfYMruZFond+At94/LM7D2k0H0HpFQ/LovIrZLi9taBOznhw2E66+pA41FXaUFlmibiqUvJmQNj+/ds4UvLrpgDguDpsJl8+chDe2HETr/AYYDND9dcRcgdKUUyTBzZ6SN4u6z4nc20utbuYzyfjJ0uyBRGSkZ6yNhULypH3TbQA+Peoe8/KtrrrB4Bzplm3H0likMQ5KZWDovh/EIndeGMqSZAIVM7Hk6ogWWUGp7lGqZYeR/ukCBaBylHjGxcBxI1FkLrr3UktzLcABA4GQsotSzqLngdIiCw6d4PRNkVR6Q5A4KbYC06gRlvY3ivx30tI3pQ1kRC5vYLTs750Pj2D5gkbYrSb88q1P4zoSSm5WwgYZFQiNjIX0z6RPy0/oEop0HXuFDwSUppyE2LG25qHtWHzHW8lDgjAnDptJtNMGjoPLYVby28UnhUOfJun8GR5CJ1p5ViZ6xtpYKCVPWgcYDmjv9onBp0w+X0mYmiOdA0DMjEWa46DUwZq1PTqleWGg/C0XkKsjWvTuS3mPUjNQRPqnORSAylHiGZeGSYX43zc1ob3LKx5qhAyUQDCEt947rLxTIWPRFxbkoWFSIZYvaBRfP8v2Z0SlTsryBY1jNgPh/80mI3rcfrzz4REx46f53HKUOPJVCT6JmVhXNIgy9Lj9GAyE4r48JzgSSm5W4qtJMc3G4/2ZIMuEzcJJ17FX4UCQ9hyz0IdGTSRjLWQJTq8vTuh4Z3RIiBPoum/FLJzo9kbZxppyu/y5TnN+UjlcaHEzL5VHuDhp7+pHeXGBLHvJXPZAhmSiZ6yNhWLypHGA8fiCaDvax9R4AAzNEQf09g/pGgBiZSzSDoQpdLBmquE/2JkXYhTZOqJBVlBaexQFiiYMFIDKYWrKbfjeHbPhHxpGqcMsNpOurbSjqtSKjh6fJqV448IDDZMcqHAV4JzaYvgDwygtNGdn
8AljnZQwPxp0kmY7bd5xHMsXNGLjtmOY11QtlpuVFOarUm4WK1dKL88puFkJG6QQCJWOy1AwzNQNn9qk69CqciOczhyzVBqhNDHZgtKAfVgSwBUQ1kttmS3tQF68sZxcZsWPX96V3lxnMD/jHS7CYR6ewUiWp9Wk+roV5Cl1WqIuTtZvlXdxolj2ACMB10wOo6pkUmQwLorKk+IBxu0dEvdoXTINE8BEtkvEfrR39esaaGBiLJBhwEWJgzVjpUSszAshIRUdUTnYw1rAlNAWCkDlIgkOHYW1RQCA46e9WLtxP77U0hh1mwxu5Me9/qD2txc8YDfnwW7JE51YcJw2zr3CB4pYJ0UadOpx+7Fp2zF858uXgOd5lBRZUOos0KT3hFSu2LI/oTwzqSOh1GYV2SBrym2oLLHi5+v2iOPy9RsvQmXJhfj5uj05sWGl69CqdvOY4hwrGghj5GAvyBKbLRjbOy/heknzkJBoLL99a7MupVZJDxcc8NdPOvD4K7vgsJmw9PNT8dUvzsSzb3yi2roV5Glprk3r4kQRZ1iJgKtSep7BYVTxg0GG46LnQcXpMOMvH53A7YunIzAUSj/TUGFYOLwJ9mPcPpUqw8JYAIwEXFL1w1TcV1mZFyIGVrKIGAuYEtpCAagcJNmhA4D4d51nfKgqsWLB7Lqo/kM15XZMcin3KoNs9MimUOEzY50UIbiy+t8vx4A/GGWEPYNBzXpPSOUSyv6uv3IaplUXodxpQU25XR1HIoEDZDfn4cIpxWM2p0nFBTmzYaXr0DLhCEPBQBhjmVTxsgWlWYLj9m1LwwFMNJYF5jxdSq2SHS48viAef2UXaivtWHBpnfi66vVXTkNNhR2TS6yKr9vRzMk0szEUcIYzDrgqrefpHjQUPhhkPC46HlQKC/Jw66Lp6DzDSEa4AAOHN8F+aNqnMh4MjAWQhQEXtfdVRuaFYBhWgmEswNIlqwZQACoHSXboAB99iLpz2Uw8/kp0ecfTr+3WxenStNFknGbcSn1mPCfl1kXTUeLIR4l99GU7QIMaeonBcxVZouTqHwiiptyOqZV29RyJ8RygBJtTrmxY6Tq0rDjCSgXCmGkyG0Fa7lWQb8Qdy2agzxNIGMBVomQ40Vg69Cq1SmIT3N4hOGwm3HBVo7h/BNwhrN20X3w0QHE9jMhTXlyA9VsPp/2dMrEtmdprpvRcwYNB0nGJBCzd3iEM8RzyDQk+S6+DSkSvzPlGdffiNGXTcy8U7IdmfSqTIYxFRJ+Odw1of4iL6Mpjd8+FZ3B4pF1EUfwn5VVHxmFWE3tDAQaCGB/GLlm1gAJQOUjy0gku6hB1QnKbrOqLeDLQrKFhgmbcin1mCsEcVTNZ4hi8+1bMSiyXCo4EUwcuFkk38Cf5Oa8/CLMpDwP+IDyDw5o65EoFwlhrZup0mMXs0LWbDsQtM4sN4GY65onG0m7J06/UKoFNcDrMuPqSOhztOKvtvPFAiSNf13KtTOw1a3quFInGxeUwZ4fTzY+8xstCVilLSO1Hj9uPt94f6bemefBJINVDnEoZB+1dA/rqtMxxmKj2hiCyjVw8CxlXrly5Um8h9GJwcAg8S05OGthsZvh8Qyn9jNlkQEN1MXbu60IozIubU3WpdczfnT+lBAfb3SguNOPaOVOwbssh7DnUgw92n0JDdTHKnNrd7nBGA7bsPIFQeHTSzCYjls6tHymBUQjPYBCPPL8dgWAIM6aW4FC7O/FnciP/vtPtB2c0wGwyyP4cs8mIImv+qOxxfleyuUoXQWek3xMAQmEef/+sEwsuqUF5xNlWm063H+/uaI/6s1CYx2Uzq1BkzV3HPpYxuiL35/KNONHtw6Mv7MC7O9qxZeeJtNatzWaGb3AoLV0vc1owv2kyLptZhaVz60d0N0W7q9Xal4vZZEBjnUsMNvn8w/j4QDeOn/bgwS9fiisunoRrLq0DOIAzpGYXkpFsLNPVESXmJxazyYBgGDjTH0huP1VCje8kh0zt
tVTPS50WLJk7FTOnlaCushBWs/Z6rhSJxsVRYBqzB+3c14X5TZN1WdfJUGMvngjotdYEpD5wPJ8moT5FgjSPPL89o70xlpRkUAm5MrC2r2pFOucmIrdRW2cm6lmI4zhYE8hPGVC5yDhZFdK/cznMqCm3o73Lq3v/A63KipI14476TCiYMpnkxkqtGnoWbr9Y6VU0UVHqViUc5tPXdQUy53QvKYxzU87z/Jj109HrQyA4jLPeIJ783fbxxyrVG3g1yhlU+p1Tqgrx4jufjbGfd994kfrzlsl3yiQrIsNSZUHPX9rQFtV7Ue5LfsySYFyOdw3ovgclJI4eZGU/G7X7ijBUYpWKT6NWxgELfpVcGXTfVwmCAJCbZyEKQOUi4zkkMQ4FM/0P1G5oyAFe/zDyJU19k/Vy8Qwq58CM5wyp4eAlK4sQnk9Xu4eCGq8t5VITv/FQyhnu6BnQLz04MqdFNlPcRv2qkyA4PLnMFnf9mE15YvAJSDJWE7zmf1KZHbcumo6XNrSJvWGm17tQU2YFwnpLlwAl5iSTA3lkj7v3lln47i/+CofNhNbmBoAD2ru8qCm3wW7OUrctzrgw63RH9OClDW2Y11QdpbusBFtkMcFtTCyp6JNagSIWdFq2DNQkPPdg1UdmVa54qCBrLgaDs9STIdImHYdEjf4H6S5gtW7bOODwqX6c6PZi47ZjUbf2iXq5KOnA6HFrFs/g3bdilrb9C5I5QKnqSI4523JQyhnu8wzqE4BOMKe1ZTbN5jRRcPixu+fGdRgG/EFZYzXRa/4NBg7Ta4tw/63N0Wub1eATGJkTHhgYDMJhM+HaOVOisscqS6y4cErx+LqfJc48q063xxcck4UmZO/NqHMyOZbxYEKfNSQVfVIrUMSCTqckA0MZbITKsOojsyTXeHunWrLmYDCYAlA5RroOiaKbKkvGJoLHF8ThUx6s33oIgWAI73x4BMsXNKLCZYUpz4AyZ8GYn1HSgdHl1iyOwQOA+3/2YWYOqxJlRWloahHiAAAgAElEQVToSK4523JQat26Cgt0udVlYU4TBYf7+gNxHQbP4LCssWKhVEN1suxww8qcCE3cY8vef75uz/i6z+D+mpCYPaiyxI58A5+5nBkG4NzeIcxrqh4z/nq9AJwurOizZqRwiFMtUMTCQZIFGQjmYMGfYlquZHunFrJmmb+UKRSAyjHSdkgU3NCYMTYS3N4hhGN6uvAAnng1sROvpAOj261ZjME73p1hTw6FDj/p6MiEdLYzzWRQaN1Wldp00U8W5jRpcDiOwyB3LbNQqpE2WZJhkyqszElhQR5qKuxp6T6L+2tSJGuorMyO7u7+zH5fJntQRK9NJiMMBuhuezKFFX0GoJ3NkHuIUzNIw8JBkgUZCKZgwZ+KBytyJds7yxiTdSJAAagcIyOHRKENjcUF7HSYYeA4cWxammvHb7qupAOjxO9SwMHL1GFV6vCTjo4w5WwrgVKZDAqsW6GcSusbVRbmNOXgsMy1zEKpRlqMp5cRO3T6UDes5jx1g1MKH2qlTcClvX8KbRqXDvLA5NL4PcbG033F99csCzamvQdJ9NphM+ErrTN1tz2ZwoyNYTUrT+0gjdy1k2VrjMgQneabBX8qHqzIlWzvFGBF1omAceXKlSv1FkIvBgeHwGe5kU/1aUgWnhJm8enXkWfSOdRWOrDv6BmcV+/CnkM9Uf8m0ZOY6T5/Hl+ONH9XCk8KJ9OZTPVDqadE09ERFnRbSVh4zlnAZjPDNzCUma5zI9+p0+0HZzRE1lxyMp7TND4zHuk8NS5nrPR+wjwdkuplvlG0Qxu3HVfsafO4qPSMelmxBZWldrz4xzbsPtiDv37Sod53SEK6uq/o/qrSGCdiXH9GxnpOdw+S6rXPP4zOvgHc+IVG7DnYk/5+opD9yQQWbIyae5naz6Onjdy1o/EaI3TWGR3nm1UfmRW5ku2drqIC+HxDzMiaLXAcB2uCfZfj+WwPwaRPb68X4XB2f/2y
MkfqKetC9F2v2nBWb8Mir+B5BocBnseqX/19TJSb1TIGz2AwqncTkFjecXUmA/3w+Idx/zMfZD5u6eqI3rqtIMe7B7DyuW1j/nzlnbNRW2rTVJa07IwUBcpiUp5TVu1MlpNML522fNl2KFNSsXks/N60SEf3FdR7rcciqZ2R+b3S3YPi6XWp04JvLZ+F4eFQ6vsJ2R8RNfeyjPcmlZC7dpiyNzmCnjqj+3yz6iOzIFcSm11WKtEZFmTNEgwGDiUl9rh/RyV4uURM2qf4kpTWC4fVBok8YDfnjTxzzYGN1HWZKFp2kUFaumIp/+nqyATqezCRUn0zKs1Mc06zrheO3sgsC0iml1qWV6v1WWn/XjXKKtLRfQX3V5bK5eWu53T3oHh63T8QhN1sRKGQnaByH8OJykTay+Qid+2wtMYI9dF9vln1kVmQS+7eyYKsEwAKQOUKrN3GZbqA1a6hZjVIlgBmHDyF+2LlspFX+uVJPXtM6OF06e7oZRMp7A9J9VLSR09ALTukls1L6/dOtP01AjP7ClJYz2nuQUr3SyL7Mwozvag0RO7aYWmNTXi07E+YAKbmm3qPjSXHzx1aQj2gsly55NYys9RPJuO+CGrUUMeTiZfZk4mBPg+p1CVrUf+uZF+sXEaR/h0KrJeCgnx09g2kreNK96WRs95Y7DXHKqnuD4n0Usv+CGp9Vjq/l6n9VUG07neRbG9KdT2nswcp2S+J7E80avWiYrUHlNy1kxM9ZRjwkaV+kOr9CZPAzHxT77HExNFXm5VNO8M61AMqAbnUA0qxGvxMI+YK3BQrXkOdYX8aZm6+JT2s/IFhlBZZ4s6PJvXvdLPCDBmvFw44cLIfj7+yK30dV2qdpPJ7WFqbjJPx/iBZ764iC8KhMHzBEKwmo7pZDmr1Ykjx96rer01Pe5rKWGQopxI9oDRBzvfUSt4c32tZ7QEFQP7amcg9ZRhZt7r3XpLCwHwzNR4skUBfP980Gb29Xr2lyzqoBxShTNpnZGHGPlFdU2aV/US1En0RlE5tz0Qm1vo8tHcN6L7Rs+JwECNkul48vqAYfBJ+NmUdV6g0M6X1lmVltHqS0f4wnsOm5nirlS6f4u9VtaxCb3sqdyzUlpOV9Sz3e2ohr966QSRH7tqZwGU/rPjITJXEMjDfTI2HHDQKtCfS14ZqJ/I55T8vl9EhD5LQA6EGX0j9jqrBl4nHF8RLG9qwYHYd3nr/MNZuOoAfPL8dnx51AzIXZjKjJxfB2S91WnDTFxpx09WNWL7gXLgcZtm/QymZlPg+SpHIcHp8wZyUgxhBWC8AxDWzfEEjbBaTrHWrmI7zQKHVJDar9gwOy7YbacsScfRqS20jTtUEcuyVJJP9IdF67+gZUFVmTYik4h/vHoDHn1hfldhfE8GEPZUxDprIycB6Tul7qiwvE7pBEElgxUeW+kECudxri8nxSLTPRALt9//sQ6x8bhvuf+YDtB0/m7L/KIdE+trXP6j8h+U4lAGVKyhwG+f2DmFeUzVe3XQAgWAIpU4LWpprcbLbi0qXFSWOfE1uigsL8nDfilk40e3F2ogsZpMRNeX21J51j0TSM5GJpYaCrNxosCIHMYJwOBaCx5u2HcO8pmrs3N8lK4NRMR1X4LaepfWmGCyU0GSwPyRz2CoLs7ifRCr6qmK2i+72VOY46C6nRrD0PVmShSDiwcqenYuN8JPB3Hgk2We0zKJLpK8uRwFyUlFUhDKgcgXpIcduTsvIOB1mGAwQg0/XzpkiZkJ99xd/lRWRVuSmmB9paCkEn4AUbv7iRNK7zwymLZOaN9+pwsqNhu5yyMxayBkih+N7b5mFTduOpZzBWFiQh3tumZWxjitxW5/SessGPUjlZk/t75Nmtkai9T7isGUvKeurStkuettTueOgt5xawdL3ZEkWpsgG258jMOMjSy4JfnDXHKy5a05ul6pKxmPlnbN1H49k+4yWWXSJ9LVKiV6ORBSUAZULKNQnoLAgD9PrXTCb
jGhprhUzoYAUItIK3RSne/MXz8j9+OVdeOzuuenJlOr3UTHbgZUbDV3lmCg9MZTWEx4YGAxGZTACI/r/9Gu7k69bHrh8ZhUqi/VZs7GyyFpvWaIHsm/2GP4+idZ7Vaktq5t2spJdorddlzsOesupFSx9T5ZkYQY5tpKFrFMtYOF7stK7LSJLYYEJDbWukcb1E3HOU4GBXlQCyfYZTbPoEuirwUBRbKWhAFQOoFj6Ig/UlFlx940X4WS3N33nXAGjl65BSlwuEhi9uU5VpkhvG+H3g+P0eRWHlY1eRzlYaXiZESrpiTSDUYqcdWswcLqt2THIsB/ZogdyD/dMf58J6rCxUjoijO9jd8+NeuFUK2SPAyv7j9qw9D1ZkoURxrWVWgfz9QoCsXRpwVCgg4iBhSAlku8zmgfak+krI+M1EaAAVA6g6E1uGJhR50Sly4r1Ww/DYTOhpbkW4AADx6XdCDxV0jVIqhwqZG70mhwiWdnodZKDlayFTFBLT6QZjHocqrV0IrJFD+TaI+a/Dyt2R0FYyy6R/cKpwg5ySuMwAfUgLnIvnbSSJRfGXCaCrRR6lArld15/EIUFJm2D+ToGgZi+tCDYgKEg5Xj7DBOBdobGayJAAagcQPGgCw+UOPIzawSeKekYJA7oPjOI5Qsao2TO9FAhd6Nn/hCpBwoflpjJWsgA1fREksH49Gu75el/ZH5OH+qG1ZyX2fxo6EQoqgcMlM1qptfZdLuntqysOL3QuVSToXFgBjqIxIcB++F0mFFVYsWC2XViubngm05yWTX1w+RkY6k1XuRvMg4Da4WpIOV4+4zWgfaY+SkJ82yN1wSAAlA5gCo3uZFG4D9+eZd+izFFg+TxBfHjl3fBYTOhdX4DwAE2Sx7KXQU43jWQ9iYgd6NnKjjCwOanhhPPWtZCOqiqJ5EMRlmHSZUOs7IzBzLQUcX0gJGyWU30Wu53naC2Iy6MZLroXqpJWTZR0EEkDowE5QoL8vDVL16IHzy/PWp+hF6HWvphSdet1aTqeDHlbxLRMLJWmAtSCvuMdSRTMZNzWUbEmZ97bpkFS76BrfHKcowrV65cqbcQejE4OAQ+Wx2pyCsfJ7u9CAMwm5I/aFjmtGB+02RcNrMKS+fWo7rUmvGC7nT78e6O9qg/C4V5XDazCkXWBJtcRO5Otx+c0TCu3EoiyOvzD+PTI73o7BvAtGonnvrdbry7ox1bdp5AQ3Uxypyp9dfgjAZs2XkCofDogJpNRiydWx/1Qo3ZZEBDdTF27utCKMyLm051qVWx7yiHgoJ87NrfjUee357R984Uz2AQj0icxFCYx859XZjfNHnMyz6poIauy0Ih3dZCT8z5RnAcB7d3CJwhvqyqzE9kYx9X9+T+uyRI9aB13hQ4CkzoPJPa3Kilo7HImQ+19VrWd01hXmw2M3w+5V+pkS2rEiigh4qIIXOPSWtPZghZOqOjDyHA3DgzMCaarckY4unMWd9QwvkpL7Jo5oclW7eBYFjV8WLF32QODnB7gzjZM5BzayUWufuKpjCw58abn+2fdWLRZXV4/6OTbI0X43AcB2uCPZEyoLIRSXTWYTPh6kvqUFNhx+RSW+JIsQo3mCnfsOgc9Y+VN+2X/GKQnZ2gZSlDkiyFjp4BJm5v1Sw10zpV1+sfxj86+vHzdXsy12219UTmOlRjfuRmDqSdYRCr91ZTRjfNmtwQyrWLKuu1nO/KSuaHVje3qZS+qZkVxlyppl4wkjnA1DgzMiYsZVM47UnmR0M/LNm6Pd41oO54UensWLJhrUSyf7TIMGaxYiDenvvShjbce8ssDAwGNcmISjQ/geAwc+OVzVAAKgsRFqjDZsK1c6ZE1blraUxTNV56H15i5U33RbAxpLLRaxEcGWeT7fMMaucoJjmYqe7Ep3sojASUpC9OJSsTazt+Fu1dXqzfekg53VZRT+SuQzXmR+4hJaXDTGSeB4eG0X02gGdi+ltNLrWm7dBocdDU2y4KyPmuih8y
01yjWgUApN9X2tR4IBAatfFaHGoke4zXH4TZlIcBfxCeweGoMWPxQKEkrKwVlsaZlTFhKSgnnR/pJS04bqQpuVaXVEnWrSbjRaWzUSRdKxoGfhLNvcth1jZAxmCQMvYRgQKLETaLCd/9xV81O+cmmh+7xYRJLitT45XNUAleFiqOkP69ZO5UrNsyeujVI41TKAv5fNMkXHNpHcAhYQkJC2nr0jKWuspCRdMpzSYjiqz5uqdijpfem282YdPfj6ufRjpOKq2qKeLppvFywOFT/dh/3I2fvb4HW3aeSPqzwlifW1eMPYd6ov6O1dIXuetQjfmRpnyXOi1YMncqZk4rQV1lIaxmY9x/NypPHB2NzPPP39iD8+pL8eSrH4/R+4vPLceWnSfEHyl1WrDo8npZpbdalDGwYBcBed81lZT9ccupMki116q8RPi+xYVmXDtnCtZtOYQ9h3rwwe5Toqwen3blFOZ8I050+/DoCzvijxkHFJjzcNnMSZhzYRVaPz9F2VJNlUu9xtMZVtYKoGOpdwysjIl0TRYXmrFsXgOumV2LQlu+qmVOiXSmzGnBlbOqUVNZiBf/2Ib3Pz6pWwmtsG5/8eYnsFrycbLHC2ehGU3nlOVOiVwi26Fh+WiitfL5pkk40e3TrOwr0f7lKDDpUpoX9+yiU1kvZzRg98FuLLq8Huu2HELDZCde2bhf0zGJZ8sWXlaHIrsZ/qFQVIY9BZ+SQyV4EwwhOgtORgaPBi8FFVpNONnjw5O/2540Qs3EDZn0RsiA1F4EkwMDDXrHy1KoKrVpcns77s2sircv0izB1uaRhvPtXV7UlNtgNyc2ex5fEIdPebB+66HEPyuZ4+EwL34/xXRbZR2SvQ4l8+MLhmA1GTOeH+Fm+qUNbVGvFK3fejjKZsjNMBDmuXV+A452nI26OROe37YW5MFsMsJhM2HxnCmYOrloTIPahBkDGtwQMmEXAVnfddx5kejuEM8h34CEY5VR5oZGN7fC923v8iYs12bmVa0EpaaFtUXKfDgD5SvMrBWAmeySpGOipT8SWZOP3T13TDn6fStmocxp0dYv4oFwmBflAPQp5wFG5iB2z6vaZcXXv3QRvnfHbPiHhlGq9YFWS91IZDvqitB2TDubkmitmE154vkF0CCLMEFmXI8nwEYZq462XvqIgMNmQoXLqv2YxLFlgv8Y+4J6zr98mgEUgMpCpE6xcLASDlwGjoPLYR75hxoZEVmHfQ4wcMDXbrhwTJ8czTZdyYbrKrKgvdOLl/+0D63zG2AwANPrXagpswLh9H+/3g46ML6TbjBwmhzeZB3MVHLi3d6huCWqlSVWXDilOOHnuL1DCPN81M8KKfzHuwZQW2FHe6cXL21ow7ymajRUF8FsMmLzjuO4eUHjmHJY5l5dg7zyBEEWYb1UltqRb+AzlyGysd97yywxpRpIPzgp6hgHhHnEfX67YVIh7lsxCz1nB+EfCqHtaF/cQJXXH4zv0Kh80GSpnGfcV2iSzUuKuptx4EaLAEDk+5rzjQllZeVVLQCqlmKxUOo1ru2CDhdA8Q7yGsqR0H7YTJoe7gHEDfo4bCac6PaKLyZr6RexUM4jyDGvqVrcl0qdFiyYXYdVv/q7OsHiWMb0RszTVDektkOYi/aufpQ4LZralET24+yADv3L4lzeL19wrj4B9hj9AM/rZ+t5gJf44N3uQX3GRGLLHDYT/mXJjKgMe71KnScSFIDKRiJOcU25DXVVdpzu8UVFZWvK7ZheW6SZwzjuYR/RTdOvv3LaSNP0EqumwSfp4Wj5gnPFnj2/e/cAgBGjlsnYpJt1ozTSLJN5TdVicK3QZhoNrmlweNPzttrpMOPqS+rGZC38fN2exHPMAbYCEwwcJ/6soNebth3DvKZqmPIM4k3mpm3H4HTkY/mCRqzddADvfHgEKxaei/qqQgSHwygtSj19W5M1m+SmWnRCoWIgjAcGBoOKBCfFbFAAf/noBG5fPAOPv7Iravx+8spH
WP3Vy8XMttYrGuIGqmrK7Zjk0qGMhrU+DOMFkhLMS6q6K7UPpU4LFs+ZggqXFcHhMDz+YW2CB3LggdIiS0JbpmUAURizeJdOfSrfnsce5scEbrXIqBgny2ZoOKxtwCXOWtFcjgTZFL2eIV0OkbFB0pbmWtE/1VIOYGS9SG196/wGrN2ovSxOhzmq56hSD+DIIo6OfufLl2iqG1LbIT0nCJ8tRdXAT2StPPl/rkTb0T7RfugV+IndM/+8/ZjoT2p2GRVvv7+5SddMLKfdHOWDK3K5mwbSs217Z782Y8JAFY1WUAAqW+EBuzkPdZVF+Okr8TcSt0ZR/fEO+8DozWzAHcLaTfvFYI8iC0vGgo019GGeV3xs5ATiNDEsPDC9rggrFp43prxwem0RwmEenkH15Rg3EKaioS0syENNhV1+pktkE35pQxuWzpuKArMJgWAIrc0N2LTtmBhwqiq1iTeZrfMb8Ju320YCjvMbUGAxwmoxianDsl6njEGzcp6Y253YgGk4rO4N2LjBSTm6wQHdZwaxfEEjNkbm6GR3fCeh96xfXPObdxzHnctmjglUPf3abs1fNxNJlnkEbR2SRIGkx+6ei3CYTyhHqrortQ9L501FYCiEJyI3jFUlVnz1ixeC53k47fo7YQltmd0EjzeIIpsJq//9cgz4g6q/qnXfilk40e2NOqQ0TCpUPeAfe5iPCtyWWLXLqEhQWiUEmLUMLsRbK3rIET+bolGXQ+QYPZTTKkIlpOU8gWBIN1kKC/Iwvd4lBo+1LCuKdzHae9av6TgIOhEbeAvzCrYukAsP8Dyi7IcugR8kCOpzwEN3zsZQMKTJZVRsdtriOVNgLzDpWuos9d8D7hDe+fAIWueP6O6F00oxqdiiiT8gPdu2XtGg/pgwUkWjFRSAynLc/aMbidSI+YfD4DhOEyMS77BfYDGivNiKTvcg8owG9Ta7DJ6UV3pskgXiHrt7Ltq7BrRLeR4IisEnQQ7hEHngkw7x8K32QSFhIEzt2n8emFxqizowCUGhswNB2AuiMyykm/AbWw/hK60zxT5r0oBTZ59v9CYz4swG3CNZdDd9oRFrN+4dkzXV3tkvu7xT63KeeKWGp/sGwansqCcNTvIy1jQH9PYP4ccv7xKzQTgOaJjsjDt+JYUWHO3oh9lkRI/bjxNd0YEqwW6dPjMIcJwY+NHUGWAhmwLxneICsxFHTvfjZ6/HyZaLyJGy7krKMXfu7xYP7efUFGHBpXXioXHMZ+lxQxhjyxw2EwrMeejz+PHsG59EyVlbZlO1xKnMaRH1vrW5AcWF+fD6h/GrP3yq6k3xmMM8RgO3q//9ct0yKgQdLXMWaB5cYEUOYGwwLN9k0OUQGVvmVF9VqN9hNlLOI3y2JV+fMQEP1JRZce8tF+NUz4CmZUXSfV7wSWwaBxhG24ZE77sf7e/EHctm4Lnff6pp4CfeS9AA8OC/XIKh4bBm/bikQX1hbgJDIQwFw6gtt434ixqWqX7xymkIDIXw7Bt7dMs6AiD674KO9rj9+N27B2A2GTFnRoVmwRjp2VaxNhtJYKHMXUvoFbwsjyoKL5oJr/Rs2dmOusqRDf9X/+9TLJvfgH1Hz4gvLdx940WoKbMpLoc5P098uWDLznZMnezEr37/Kfb+oxcXN5Zj+2edqry6JrxC5rCZcGPLOWi5pAaBYAimfGPSV7U6+wbwxSunRY1Npi+QmE0GDIV4vP/xSQAQX/k6r94FV1EBfvTiDtGwqP2Sg/S1j1g5fviCdq9seHxB/OA3O+CwmUQZ+n1BOAstePQ36o6H2WRAY50Lz77xieiAbd5xHEV2C072eJGfn4ci+8jm0Xl2dLx8/mHsO9aH66+chnB4pL3I7oM9mNFQgnd3HEfLJbXY/lknzqt34VC7W9SpGQ0l2HOoB0vmTsWWne1YMLsOW3a2o7aiED7/EEwmE854AwlfiQQHnOgeQG2lQ1G9TARnNCDflIfXNx8Ux2fXvtOoKrXjVI8P
ByXfbWQ8lX0psazYgspSO178Yxt2H+zBXz/pwPT6EvDg8OgLSXQjEqjZd/wMdh/sgc8/jE+P9GL3wR7s/UcP7lh2AT7a3x01frUVNgCcOLbn1BaLc3dOTRGWzWvAKxv3472PRl9KMpuMmr5II7VlwlopLrTgp2s/0kwGYOwrNHsO9WB6fQlefKctqRzpvkzXc9aPLvcg9hzqQanTglsXnY+fSW6noz4r35j2y3mZIrVl186ZAp4HfvN28jFRg063H7sPdYv7/WUXTMLTr30Mt3cIJ7u9+KfLp2DmtBLcuvC8kUOMgj7OWd8Q3t3RLu4pnzu/HC2fq4EvMIz3PzoZ9W/TeYlt3JcTI8TV0SnFot0UUNpmsSoHMLrnlzotuLHlHJQUWtBQ7dRkL4mlzGnBVZ+rQW1lIV7esG+MH6qkHOPpDGcYfclyZkMppte7dBkT8IApz4CfvvoxTnb343qF/c9ECPv85h3HRZ+kpsKh+TiUOS1wOix4/6OTKC4048aWc3BevQvrtx7CNbPrMXNaCb545TQ0THKoHmCQnptubDkHFzeW48U/7sOft7dj297TOH9KCcrSaKGQKoKP+rs/H8CC2XVYt+WQ6AtNmexEuQb7mnA2+qfLp2BoOIzXNx/UZC8ZD61euh1Xjvw8bNl5Av2+oOpjwsqLpkoyYV/BO3LkCB544AG43W44nU6sWbMG9fX1eoulKcKLZu1d3qhSoQpXIzp6fVGpi+CBYke+aqUBwu1o6/wGMaPiX5bMwPNxbmbvvvEiRSLHwu2OELl/8tWRW7eln5+K+qpCADzsBfkYCoWjGqD3DwRRXWZXtu+KJGofe+PUdcanS8qzVI5rZtfhTL+2qdfxsmyWfn6qNingkttPaSmdIEeBOQ9GIweDgQOP6Iy4Hrcfm7Ydw33/qxke35B4cOgfCGLd5gO4Y9kMrN96KEqvDZGMQyFrSroeF8yuwzOvfRy5fTTi3DoXeADDw2GEw2HYC/IRCIZHMxsia9bAcahRaeOX3u4I4yP0UNKi7l6apVfqtOCGK6fB6w/i8KmzSXVDfPkuTkp0/0AQU6scY9d1GGiY5ECFqwDn1BYjFAqhpvwivPynfbjhqkbxO7c2j5RSmk1GdLr1XSu1lXbUlNs1z6aIfYWmtblBXslITC+ryhJ5jeudDrO4dlqaa8e8ZlhgMaK6zA7PYBADgZBuN4TCTbGwVm66+lzdSpzEsgDJ648AxJtiADivrhgldmWdVqc9+sZ+wew6vPTHNtz2T9M1z6iQZmOdU1MEl8OseRlN7FpZPqdRFzmA6GyKQDCMn637RLO9JB59noCYMSn4oQYD0HxuOUpU8kPjIX20R1oyr8eYCDZEy7IiYZ9P1DpAs3HggRJHvlhCHAiGsXbTSOaTUn1Y5VJVaosrB6BPw21pg3pBhrgtAVRAmp0mbU2i9l4yLoz0xpRmdPa4/Xjr/ZEXm9WwYUy98qoBWR2Aeuihh7BixQq0trbirbfewn/913/hhRde0FssTRFeNDPnGxEO86JTKKT4So2I2WTEnAsq1BFEmurMIapxm5qBMMEZ9wwEsX7roahg1LNv7In0FvFgbSToEK8BupKNuKXOjjQAcfuSGTqlPI/IIfRYae/0aiqH9LAknRuhHEptOZx285hSOkEODsDB424EgmG8t6t9TMBlxcLzUGwzodhuwt03jgQrhH/zu3cPjPR4KreLvV9cDjNqyu1o7/ICHB/l8El1Yem8qThy8iz8QyFsFOfGM+KYSkr6BM6rc6rTxF4SMBXGRzjMxjrIFzeWoaLIrOiGG5v6XVxYgMdf2TVurb3wc4lSou2WvPjrOtI3TxzLUkTKv7qiArVL502NlAqEdV0rN1zViKMdHp16ZPBRYyLbfkkalJeV2dHd3T/uxxUW5KFhUqHYs0b6mqEwHye6vNioY9AHkDS8j6yVzj6fbiVOQuBYeLLZpzcAABKSSURBVP1RKzliL5oEu/biH9u0LdmQ+BulTkt0
EFnjQ7V0regZ+ImamysatN1LYvD4guJro4DOh1k++iVLvcYEiD5galZWFNnnxSbKMa0DBDQZB0kJsaCjUrSy5QYDx4QcwIiPKm1Qr7kMkfVRXlyAnfu72QqASPwJ4f/1kGF6bRGeuPdKnO71qt7jkZkXkTUgawNQvb29+Oyzz/DrX/8aALBkyRKsWrUKfX19cLlcOkunMfzIKz2HDNGHbK1reMXDPjCmcZtagTDBGT92ul+8nRaCUa3zR/9b2HAVb4Aei8TZCYf50RsnFbPA5MghDdBpqRexWTZayyENxIHjo+QAIL6KljRIygMz6py4/9ZmeP3BuM2GBYdaeJ2ytz+A/cfOiA6fNBgV9dkSHdWkyeE44yM9zEod5JbmGihdryxtTOoZCMIzkDywJIy18HM9bn9mN+v8yGt8YR5RGSWsrJWjHWfx5+1jx+JrN1yofjaFPTrLRlX7xY9mp3kDITz16kdiJl7U+oj0YNPLQY5dK3/efkyXuYkKHGPk9UfN9paYiybpzb3WWS6CvyHNmtPjUB21VvQM/MT0O9LzIOn2DunTYDoRfPKXLLVCrwOmtAk6oL9usKKjLMghnRvddDOSnSZcBGmdvck8PDC53I58jhf/X63PYSHrSyuyNgDV0dGBiooKGI0jBtVoNKK8vBwdHR2yA1AlJXY1RdSMsjIHSsI8PL4gPjvSmzCD4fwpJTAYONXkKAnzuOeWWfjN25+KN9XxDpP33DILU6qLFZMlEIKY2TNyIzzqICd67c4XDKGhVr1A5VCYw6ETblGOeAGO8pIClJU6VJNBKocwDnroRSCEMXOjpRwlLjtO9w6goydaRwBEOSCxQdKWz9WgrGzURpTJ/LwyADXD4dGSPCCqcXnUZ0vGRIu1Eg/p+Pxi/SdjZPjGTU2oKrUpLoNgL452nBXHJF5g6ZLzKzGt2il+vvBzj7+yS0yJvueWWWisc6Us4xDP4RdvfjKaWaOTjgpEr5WRkkKpDAaOw/R6F8pK1d27SsI8TvX6MrZfZWXy7VsZRl43u33xjNHXDGPsebygz11fulD1NSIgXSvrtx7WZW6A6L1WyBQT1sv5U0ows6EUeXlx+swpwBAfx64hOsvlwnNK097b5OpMtP2If4isLLFH2XA1iFor0E8OYHRu9NpLpHL84s34e4kaMsjRGem+oceYiHK47GiodqKvfxAuR4Eqe2s8ip02fOOmJvx2w9hsRa11gwUdrSy1MyEHMDo3T0kCk3rpZm3vAM6rd8EfGEZFiQ2Ty+yaysAyqfgzGX2OJp+iPxzPZ2cb7r179+L+++/H22+/Lf7Ztddei8ceewwzZsyQ9Tt6e70Ih7Py64uUlTlGyxwMwPEuHx79zfYxDpBmXfQjLxT5giF8//9uE1PkW5pr1bsZ5YDDp/rFem5gNLPFwHFRTyIDGo0HB7T3+MR0+LfeO6zPnETkaDt6Rp9xiMjQdvzsSOYAoJ8cBuDTo26c7B4Q/0jIdBH67khvfTJ+aSzyeS//aR9uXzIDj0fSvQ0cJ352rI6qvlZkyit9ma6mzIqyEoescqqUibxmt3N/N97b1T7mife7b7wIM+qcY8dAeAkt01uiiG52uX1Yu/GAvjZDIk97lze18VABj38Y9z/zQdr2K2pvSoWITnz3F3+Nmg9BDuHVQyHoM+eCCs1KaEQia2XM655aPpccWQNefxBmU96YrEw1P7ft+Fm8tKFNtGtKrZWUdSaiKz95ZRcTa0W1vUQukpc0hVdNY1sOaIJER2L3kvFegk2VlHRGqX0jW9HLZsTIoLeOlpU50N3Tr7scUeS6bjJO2v5MjmMwcAmTfbI2ANXb24uFCxdi27ZtMBqNCIVCmD17NjZu3Cg7A2rCBaAANhxjIO6T4mo/Ye71D2NgKIQzHj86e32S/jqhMSmlmoyH5EAf6xxrOicG4OAJD071DOgzDoA4Pyd6BtDZ62NCjrP9AfgjuqGaAxJxKgaDw+h2B/DKn/Zh6byp4ICYHlA6
6WgCeWOdIFU3X0kAeWOkab+aB5Z4n+/1D+MfHf14ddN+/ecjRh7Nx0Mih3CITMd+ZaQzks8W5mOj5BEB3ddJRMacPTDE2LVnFPI30tIZDYMd48nAxGGWFb3USA46GGYhOuuoqDOsrBWCecjOpMeEDEABwG233YYvfelLYhPy119/HS+++KLsn5+QASiAHaOqlxyRA5xncBihUAj2gnwMDoXgDwyjtNCs+W2P7jdOETmEMQkGQyi266QXEjl0mY8YOQaGQhgeDiMwFFJfFokuFJhNGAqFo17B001HZaD65suCXsTMj+7zwYIdz8B+ZawzceYjFArBajbpZ0eJsSiop5lkzbGyVnT3u3IMOhgSqUI6Q6QK6Ux6TNgA1OHDh/HAAw/A4/GgsLAQa9aswdSpU2X//IQNQBFEEkhniFQhnSFShXSGSBXSGSJVSGeIVCGdIVKFdCY9kgWgsrYJOQA0NDTgtdde01sMgiAIgiAIgiAIgiAIIgnqPJNCEARBEARBEARBEARBEBEoAEUQBEEQBEEQBEEQBEGoCgWgCIIgCIIgCIIgCIIgCFWhABRBEARBEARBEARBEAShKhSAIgiCIAiCIAiCIAiCIFSFAlAEQRAEQRAEQRAEQRCEqlAAiiAIgiAIgiAIgiAIglAVCkARBEEQBEEQBEEQBEEQqpKntwB6YjBweougCBPlexDaQTpDpArpDJEqpDNEqpDOEKlCOkOkCukMkSqkM6mTbMw4nud5DWUhCIIgCIIgCIIgCIIgcgwqwSMIgiAIgiAIgiAIgiBUhQJQBEEQBEEQBEEQBEEQhKpQAIogCIIgCIIgCIIgCIJQFQpAEQRBEARBEARBEARBEKpCASiCIAiCIAiCIAiCIAhCVSgARRAEQRAEQRAEQRAEQagKBaAIgiAIgiAIgiAIgiAIVaEAFEEQBEEQBEEQBEEQBKEqFIAiCIIgCIIgCIIgCIIgVIUCUFnMkSNHcPPNN2PhwoW4+eabcfToUb1FInRmzZo1aGlpwbnnnosDBw6If55MV0iPcpszZ87gK1/5ChYuXIilS5fi7rvvRl9fHwDg448/xrJly7Bw4UL867/+K3p7e8WfS/Z3xMTnrrvuwrJly3DddddhxYoVaGtrA0C2hkjO008/HbU/kY0hEtHS0oJFixahtbUVra2t+Mtf/gKAdIZITCAQwEMPPYRrrrkGS5cuxfe+9z0AtC8R8Tlx4oRoX1pbW9HS0oJLL70UAOmM6vBE1nLbbbfxb775Js/zPP/mm2/yt912m84SEXqzfft2/tSpU/xVV13F79+/X/zzZLpCepTbnDlzhv/b3/4m/v8Pf/hD/j//8z/5UCjEX3311fz27dt5nuf5Z555hn/ggQd4nueT/h2RG3g8HvG/N23axF933XU8z5OtIRKzd+9e/o477hD3J7IxRDJi/RieT64XpDPEqlWr+EceeYQPh8M8z/N8d3c3z/O0LxHyWL16Nf/973+f53nSGbWhAFSW0tPTwzc3N/PDw8M8z/P88PAw39zczPf29uosGcECUsctma6QHhGxbNiwgb/99tv53bt384sXLxb/vLe3l29qauJ5nk/6d0TusX79ev76668nW0MkJBAI8DfddBPf3t4u7k9kY4hkxAtAkc4QifB6vXxzczPv9Xqj/pz2JUIOgUCAnz17Nr93717SGQ3I0zsDi0iPjo4OVFRUwGg0AgCMRiPKy8vR0dEBl8uls3QESyTTFZ7nSY8IkXA4jFdeeQUtLS3o6OjApEmTxL9zuVwIh8Nwu91J/87pdOohOqEDDz74ID744APwPI9f/vKXZGuIhDzxxBNYtmwZqqurxT8jG0OMx3333Qee59Hc3Ix7772XdIZISHt7O5xOJ55++mls27YNNpsN3/zmN2GxWGhfIsZl8+bNqKiowIwZM7B3717SGZWhHlAEQRAEAGDVqlWwWq249dZb9RaFyAIeeeQRbN26Fffccw9+9KMf6S0OwSgfffQR9u7d
ixUrVugtCpFF/Pa3v8Xvf/97rFu3DjzP4+GHH9ZbJIJhQqEQ2tvbcf755+ONN97Afffdh2984xvw+Xx6i0ZkAevWrcMNN9ygtxg5AwWgspSqqip0dnYiFAoBGDG8XV1dqKqq0lkygjWS6QrpESGwZs0aHDt2DD/96U9hMBhQVVWFU6dOiX/f19cHg8EAp9OZ9O+I3OO6667Dtm3bUFlZSbaGGMP27dtx+PBhfOELX0BLSwtOnz6NO+64A8eOHSMbQyREsA35+flYsWIFdu3aRfsSkZCqqirk5eVhyZIlAICLLroIxcXFsFgstC8RSens7MT27duxdOlSAHRu0gIKQGUpJSUlmD59Ov7whz8AAP7whz9g+vTplP5HjCGZrpAeEQDwk5/8BHv37sUzzzyD/Px8AMAFF1wAv9+PHTt2AADWrl2LRYsWjft3xMRnYGAAHR0d4v9v3rwZRUVFZGuIuPzbv/0b/ud//gebN2/G5s2bUVlZieeeew533nkn2RgiLj6fD/39/QAAnufxzjvvYPr06bQvEQlxuVyYPXs2PvjgAwAjL5X19vaivr6e9iUiKevXr8cVV1yB4uJiAHRu0gKO53lebyGI9Dh8+DAeeOABeDweFBYWYs2aNZg6dareYhE6snr1amzcuBE9PT0oLi6G0+nE22+/nVRXSI9ym4MHD2LJkiWor6+HxWIBAFRXV+OZZ57Brl278NBDDyEQCGDy5Ml47LHHUFpaCgBJ/46Y2PT09OCuu+7C4OAgDAYDioqKcP/992PGjBlka4hxaWlpwbPPPovGxkayMURc2tvb8Y1vfAOhUAjhcBgNDQ347ne/i/LyctIZIiHt7e34zne+A7fbjby8PHzrW9/CFVdcQfsSkZSFCxfiwQcfxPz588U/I51RFwpAEQRBEARBEARBEARBEKpCJXgEQRAEQRAEQRAEQRCEqlAAiiAIgiAIgiAIgiAIglAVCkARBEEQBEEQBEEQBEEQqkIBKIIgCIIgCIIgCIIgCEJVKABFEARBEARBEARBEARBqAoFoAiCIAiCIFRm8eLF2LZtW8o/98ADD+Dxxx9XQSKCIAiCIAhtydNbAIIgCIIgiInO22+/rbcIBEEQBEEQukIZUARBEARBEARBEARBEISqUACKIAiCIAhCZVpaWvDhhx/iqaeewje/+U38x3/8By6++GIsXrwYn3zyifjvPvvsM1x//fW4+OKL8a1vfQuBQCDq92zZsgWtra343Oc+h+XLl2Pfvn0AgHfeeQctLS3wer0AgPfeew9z585FX1+fdl+SIAiCIAgiCRSAIgiCIAiC0JDNmzdj8eLF2LFjB1paWrBq1SoAwNDQEL7+9a+jtbUVf//737Fo0SJs3LhR/LnPPvsM3/nOd/Dwww9j27ZtuPnmm3HXXXdhaGgI1157LS6++GKsXr0aZ86cwYMPPojVq1fD5XLp9TUJgiAIgiCioAAUQRAEQRCEhjQ3N+OKK66A0WhEa2urmMW0e/duBINB3H777TCZTFi0aBFmzpwp/tyrr76Km2++GRdddBGMRiOuv/56mEwmfPzxxwCAhx56CH/729/wz//8z2hpacFVV12ly/cjCIIgCIKIBzUhJwiCIAiC0JDS0lLxvy0WCwKBAIaHh9HV1YWKigpwHCf+/aRJk8T/PnXqFN5880289NJL4p8Fg0F0dXUBAAoLC7Fo0SL8+te/xpNPPqnBNyEIgiAIgpAPBaAIgiAIgiAYoKysDJ2dneB5XgxCnTp1CjU1NQCAqqoqfPWrX8XXvva1uD/f1taGdevWYcmSJVi9ejWee+45zWQnCIIgCIIYDyrBIwiCIAiCYICmpibk5eXhhRdeQDAYxMaNG6MalN94441Yu3Ytdu/eDZ7n4fP5sHXrVni9XgQCAXz729/GPffcg0cffRRdXV347W9/q+O3IQiCIAiCiIYCUARBEARBEAyQn5+Pp556CuvXr8ell16Kd955BwsWLBD/fubMmVi1ahUe
fvhhXHLJJbjmmmvwxhtvAAD++7//G5WVlVixYgXy8/Px2GOP4YknnsDRo0d1+jYEQRAEQRDRcDzP83oLQRAEQRAEQRAEQRAEQUxcKAOKIAiCIAiCIAiCIAiCUBUKQBEEQRAEQRAEQRAEQRCqQgEogiAIgiAIgiAIgiAIQlUoAEUQBEEQBEEQBEEQBEGoCgWgCIIgCIIgCIIgCIIgCFWhABRBEARBEARBEARBEAShKhSAIgiCIAiCIAiCIAiCIFSFAlAEQRAEQRAEQRAEQRCEqlAAiiAIgiAIgiAIgiAIglCV/w+RsRrsfAHgPAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 1440x432 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from datetime import datetime, timedelta\n",
    "import seaborn as sns\n",
    "sns.set(rc={'figure.figsize':(20,6)})\n",
    "\n",
    "# Convert columns to datetime\n",
    "trips[\"starttime\"] = pd.to_datetime(trips[\"starttime\"], format=\"%Y-%m-%d %H:%M:%S\")\n",
    "trips[\"stoptime\"] = pd.to_datetime(trips[\"stoptime\"], format=\"%Y-%m-%d %H:%M:%S\")\n",
    "\n",
    "start_date = datetime.strptime(\"2013-06-01 00:00:01\", \"%Y-%m-%d %H:%M:%S\")\n",
    "end_date = datetime.strptime(\"2013-07-01 00:10:34\", \"%Y-%m-%d %H:%M:%S\")\n",
    "interval = timedelta(minutes=60)\n",
    "bucket_elements = []\n",
    "while start_date <= end_date:\n",
    "    # Check how many trips fall into this interval\n",
    "    bucket_elements.append(trips[((start_date + interval) >= trips[\"stoptime\"])\n",
    "                                  & (start_date <= trips[\"stoptime\"])].shape[0])\n",
    "    # Increment\n",
    "    start_date += interval\n",
    "\n",
    "sns.scatterplot(x=\"index\", y=\"trips_per_hour\", data=pd.DataFrame(bucket_elements, columns=[\"trips_per_hour\"]).reset_index())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MzozQUKJtJCq"
   },
   "source": [
    "Chosing the right interval-size is an important design decision in temporal models. If it's too large, we will aggregate to much information and lose the fine grained information over time. If it's too small, we have a lot of noise in the data and the model might not be able to identify clear patterns. \n",
    "\n",
    "We can see a clear trend here - at night there are way less bikers on the streets. Also, it seems that there is a global trend, which results in more and more bikers in total over time (probably because the weather is getting warmer). \n",
    "\n",
    "So is this the right step-size? We can't really say yet - we have to try it out. But what we can see is that the number of bikers slowly changes over the day and therefore we probably don't aggregate too much information. Feel free to play around with the stepsize above - enter for example 600 (= 10 hours) or 10 (= 10 mins). Eventually it's also important to consider how far in the future we want to predict later."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1M_X8MMq-0hu"
   },
   "source": [
    "`Step 4`: Extract the node features\n",
    "\n",
    "\n",
    "**What type of temporal GNN dataset do I have?**\n",
    "\n",
    "Another important point is how our graph changes over time. It can happen, that not all nodes are present in the next time-step, which makes it a bit difficult as our edge indices are shifted. Below I have some hints on how to cope with each of the graph types. \n",
    "\n",
    "`Static Graph Temporal Signal` \n",
    "\n",
    "This is simply a graph that always stays the same and only the label information changes. Common examples are road networks (the nodes/edges do not suddently disappear in a long-range time horizon) or electricity networks. \n",
    "For this we can either use [pytorch geometric temporal](https://pytorch-geometric-temporal.readthedocs.io/en/latest/notes/installation.html) or also only [pytorch geometric](https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#torch_geometric.data.TemporalData), which recently added temporal graph support (but only dynamic edges).\n",
    "\n",
    "This is by far the easiest graph type, because the node feature matrix and edge_index stay untouched. We only have to adjust the labels in each graph snapshot.\n",
    "\n",
    "`Dynamic Graph Temporal/Static Signal`\n",
    "\n",
    "The more difficult graph type is a dynamic graph (with regards to nodes/edges). \n",
    "This typically happens in social networks, that quickly grow/shrink over time, but also transaction systems like crypto networks.\n",
    "Previously I mentioned that the node ordering is implicitly defined by the node feature matrix - but what if this matrix changes. \n",
    "\n",
    "`Option 1` 😸\n",
    "\n",
    "The easiest solution is just to append new nodes (in it's temporal order at preprocessing-time) to this matrix and always use the full matrix (all nodes) even if not all of them are part of the current snapshot. This matrix can (depending on your nodes) get very big. Therefore I suggest to check what your maximum number of nodes is and if it's feasible to store your total node_feature matrix in memory. \n",
    "If your node_features change over time, that's also no problem - you can simply update the node feature matrix for each snapshot. The important point for this option is that the ordering of the nodes always stays the same! This means index 0 is always the same node (e.g. location), even if it's not really used in this snapshot.\n",
    "\n",
    "\n",
    "`Option 2` 😵\n",
    "\n",
    "Option 1 will lead to a very big matrix with a lot of redundancies - but has one advantage: You always have the same index for a specific node. It can happen that a single node only occurs once and then is carried through all other snapshots, even if it's never used again. If you want to make your graph *really* dynamic, you should also be able to remove nodes. This however will affect the edge_index, because the indices of the node_feature matrix change!\n",
    "Therefore we would need to re-index the edge_index to point to the correct nodes. But besides this there is another issue.\n",
    "\n",
    "In spatio-temporal GNN models, you typically update the node feature embeddings over time (with a recurrent unit). The problem really is that you need to update nodes between two snapshots, based on their indices -> But what if a node is not present anymore. So as you can see it's not that trivial to learn on a dynamic graph. Therefore I'd suggest to make the first Option work. :)\n",
    "\n",
    "\n",
    "---\n",
    "\n",
    "In our case we have a `Static Graph Temporal Signal`, because our locations/edges do not really change over time. But as we will see later, when incorporating the labels as edge_features (for historical snapshots), we will end up with a `Dynamic Graph Temporal Signal`. \n",
    "\n",
    "\n",
    "> Coming back to the node features...\n",
    "\n",
    "We don't really have information about the nodes (locations) here but we need them in order to apply message passing. If really nothing comes to your mind, I would probably assign random values (gaussian samples) to the node feature matrix.\n",
    "\n",
    "Here we can model the outgoing / incoming traffic as an attribute for the nodes. This allows us to say \"this is a heavily frequented location\" or not. \n",
    "We can calculate this information based on the dataset, but here it's always important to consider that we should apply a train/test split in advance (which I didn't do here) to leak no information into the train set. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "qdznJZjz-0hu",
    "outputId": "34019ead-21f5-480b-eab3-184a5d31960a"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Full shape:  (337, 2)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([[0.76524844, 0.79829787],\n",
       "       [0.39452544, 0.36964539],\n",
       "       [0.16691461, 0.18751773],\n",
       "       [0.92740256, 0.86297872],\n",
       "       [0.6155906 , 0.62014184],\n",
       "       [0.27700089, 0.28028369],\n",
       "       [0.62600417, 0.62099291],\n",
       "       [0.27848855, 0.22978723],\n",
       "       [0.12734305, 0.11177305],\n",
       "       [0.2692651 , 0.24879433]])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "# Find out how many outgoing bikers we have\n",
    "outgoing_trips = trips.groupby(\"start station id\").count()[\"bikeid\"].values\n",
    "incoming_trips = trips.groupby(\"end station id\").count()[\"bikeid\"].values\n",
    "\n",
    "# Normalize features between 0 and 1\n",
    "outgoing_trips = (outgoing_trips - np.min(outgoing_trips)) / (np.max(outgoing_trips) - np.min(outgoing_trips))\n",
    "incoming_trips = (incoming_trips - np.min(incoming_trips)) / (np.max(incoming_trips) - np.min(incoming_trips))\n",
    "\n",
    "# Build node features\n",
    "node_features = np.stack([outgoing_trips, incoming_trips]).transpose()\n",
    "print(\"Full shape: \", node_features.shape)\n",
    "node_features[:10] # [num_nodes x num_features]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "rlF8gl4r-0hv"
   },
   "source": [
    "That's a simple example for a possible node feature matrix. As already mentioned - you can be creative here. :)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ML5jcafQTjxk"
   },
   "source": [
    "`Step 5`: Extract the edges\n",
    "\n",
    "As mentioned previously, we want to create the edges based on the distances of the locations, as this is all information we have in this dataset. The hope is, that the GNN can later identify \"crowded\" places, where the tripduration will most likely take longer (due to traffic).\n",
    "\n",
    "The topology of our graph is therefore **static** in this dataset, because the edges don't change and also the nodes are always the same. Therefore we can pre-compute this part of the graph. In fact, the only temporal part here are the labels (tripdurations), as we will see further below. \n",
    "\n",
    "Therefore, we only calculate the edges once here for all snapshots. This can be efficiently done by building all combinations (the cartesian product) and applying the distance calculation on each row. **Remember: when dealing with dataframes there is almost always a better way than using a for-loop** 😀 \n",
    "\n",
    "But there was one problem I encountered during modelling the labels. I planned to use past labels (aka historical trip durations) as edge features, as this will add a lot of information to the graph. This basically tells us \"this is the current trip duration for some parts of the graph (edge features), please predict 1 hour into the future for other parts of the graph\" (labels).\n",
    "Edge features however can only \"live\" on existing edges and unfortunately the static edges here (based on the location) are different from the bike-trips (which are not distance-based). Because of that I will extend the edge index later by the edges for the bike trips and will use edge features to signal that these added edges have different properties than the once I add here.\n",
    "Consequentely, we will use a `Dynamic Graph Temporal Signal` later.\n",
    "Note that this situation comes from the fact that we have two edges types - one for the distance-based edges and one for the trips. We could also model this as a heterogeneous graph instead of providing the information as edge features.\n",
    "\n",
    "\n",
    "> Edge feats: [distance, edge_type, historical trip duration]\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "1pLlmClpWrp-",
    "outputId": "e02507bd-8d22-4689-e2cf-da5bdba1a51b"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-2afccbec-875e-48ad-be6e-53ff49cfeff0\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>end station longitude</th>\n",
       "      <th>end station latitude</th>\n",
       "      <th>end station id</th>\n",
       "      <th>start station longitude</th>\n",
       "      <th>start station latitude</th>\n",
       "      <th>start station id</th>\n",
       "      <th>distance</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>0</td>\n",
       "      <td>1229.224446</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.987586</td>\n",
       "      <td>40.735243</td>\n",
       "      <td>1</td>\n",
       "      <td>1618.732623</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-74.016777</td>\n",
       "      <td>40.705693</td>\n",
       "      <td>2</td>\n",
       "      <td>4307.215223</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.990741</td>\n",
       "      <td>40.734546</td>\n",
       "      <td>3</td>\n",
       "      <td>1452.477060</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.983799</td>\n",
       "      <td>40.726218</td>\n",
       "      <td>118</td>\n",
       "      <td>2522.253218</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2afccbec-875e-48ad-be6e-53ff49cfeff0')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-2afccbec-875e-48ad-be6e-53ff49cfeff0 button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-2afccbec-875e-48ad-be6e-53ff49cfeff0');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   end station longitude  end station latitude  end station id  \\\n",
       "0             -74.003664             40.743174             299   \n",
       "1             -74.003664             40.743174             299   \n",
       "2             -74.003664             40.743174             299   \n",
       "3             -74.003664             40.743174             299   \n",
       "4             -74.003664             40.743174             299   \n",
       "\n",
       "   start station longitude  start station latitude  start station id  \\\n",
       "0               -73.989151               40.742354                 0   \n",
       "1               -73.987586               40.735243                 1   \n",
       "2               -74.016777               40.705693                 2   \n",
       "3               -73.990741               40.734546                 3   \n",
       "4               -73.983799               40.726218               118   \n",
       "\n",
       "      distance  \n",
       "0  1229.224446  \n",
       "1  1618.732623  \n",
       "2  4307.215223  \n",
       "3  1452.477060  \n",
       "4  2522.253218  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.utils.extmath import cartesian\n",
    "from geopy.distance import geodesic\n",
    "\n",
    "# Get all possible start locations and their geo info\n",
    "subset = [\"start station longitude\", \"start station latitude\", \"start station id\"]\n",
    "all_starts = trips.drop_duplicates(subset=\"start station id\", keep=\"first\")[subset]\n",
    "# Get all possible end locations and their geo info\n",
    "subset = [\"end station longitude\", \"end station latitude\", \"end station id\"]\n",
    "all_ends = trips.drop_duplicates(subset=\"end station id\", keep=\"first\")[subset]\n",
    "# Combine all combinations in one dataframe\n",
    "distance_matrix = all_ends.merge(all_starts, how=\"cross\")\n",
    "distance_matrix[\"distance\"] = distance_matrix.apply(lambda x: geodesic((x[\"start station latitude\"], x[\"start station longitude\"]), \n",
    "                                                          (x[\"end station latitude\"], x[\"end station longitude\"])).meters, axis=1)\n",
    "distance_matrix.head()"
   ]
  },
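  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The row-wise `apply` above calls `geodesic` once per station pair, which can get slow as the number of stations grows. Below is a minimal sketch of a vectorized alternative, assuming a spherical-earth haversine approximation is accurate enough for a distance cutoff. It is not the ellipsoidal geodesic used above, so the values differ slightly. The function name `haversine_m` and the mean earth radius of 6371 km are assumptions for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "EARTH_RADIUS_M = 6_371_000.0  # assumed mean earth radius in meters\n",
    "\n",
    "def haversine_m(lat1, lon1, lat2, lon2):\n",
    "    # Great-circle distance in meters; inputs in degrees, scalars or arrays\n",
    "    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))\n",
    "    a = (np.sin((lat2 - lat1) / 2) ** 2\n",
    "         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)\n",
    "    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))\n",
    "\n",
    "# Coordinates of the first row in the table above (start station 0, end station 299)\n",
    "d = haversine_m(40.742354, -73.989151, 40.743174, -74.003664)\n",
    "print(round(float(d)))  # roughly 1.2 km, close to the geodesic value\n",
    "```\n",
    "\n",
    "Because the function works on whole arrays, the four coordinate columns of the cross-joined dataframe could be passed in directly instead of using `apply`."
   ]
  },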
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Sxp2HR7hblXE"
   },
   "source": [
    "Based on the new \"distance\" column we can now create edges. For this we can select a threshold to only connect nodes that are close to each other. Alternatively, you can also take all edges and assign a weight to each edge in the graph.\n",
    "\n",
    "The distance column is based on meters and here I'll just use 500 meteres as a cutoff (simply plot a histogram to find a good threshold). This will also generate self-loops (if you dont want that, simply also add > X)."
   ]
  },
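  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here is a small sketch of these options on a toy dataframe. The station ids and distances are made up for illustration, and the inverse-distance weighting is just one possible choice that is not used elsewhere in this notebook.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "toy = pd.DataFrame({\n",
    "    'start station id': [0, 0, 1, 1],\n",
    "    'end station id':   [0, 1, 0, 1],\n",
    "    'distance':         [0.0, 320.0, 320.0, 0.0],\n",
    "})\n",
    "cutoff = 500  # meters\n",
    "\n",
    "# Plain cutoff: keeps the zero-distance self-loops\n",
    "with_loops = toy[toy['distance'] < cutoff]\n",
    "\n",
    "# Cutoff plus a lower bound: drops the self-loops\n",
    "no_loops = toy[(toy['distance'] > 0) & (toy['distance'] < cutoff)]\n",
    "\n",
    "# Alternative: keep every pair and use an inverse-distance edge weight\n",
    "toy['weight'] = 1.0 / (1.0 + toy['distance'])\n",
    "\n",
    "print(len(with_loops), len(no_loops))  # 4 2\n",
    "```\n",
    "\n",
    "Note that `distance > 0` would also drop two distinct stations that share the exact same coordinates. Comparing the two id columns is the stricter way to remove only self-loops."
   ]
  },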
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 206
    },
    "id": "ZvbyPPa-ZwoW",
    "outputId": "afceb6b0-57f2-4047-a95c-3d655b841e00"
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "  <div id=\"df-62ac1ef6-2638-4dbe-ae2f-7297d6e02e4e\">\n",
       "    <div class=\"colab-df-container\">\n",
       "      <div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>end station longitude</th>\n",
       "      <th>end station latitude</th>\n",
       "      <th>end station id</th>\n",
       "      <th>start station longitude</th>\n",
       "      <th>start station latitude</th>\n",
       "      <th>start station id</th>\n",
       "      <th>distance</th>\n",
       "      <th>edge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.989151</td>\n",
       "      <td>40.742354</td>\n",
       "      <td>0</td>\n",
       "      <td>1229.224446</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.987586</td>\n",
       "      <td>40.735243</td>\n",
       "      <td>1</td>\n",
       "      <td>1618.732623</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-74.016777</td>\n",
       "      <td>40.705693</td>\n",
       "      <td>2</td>\n",
       "      <td>4307.215223</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.990741</td>\n",
       "      <td>40.734546</td>\n",
       "      <td>3</td>\n",
       "      <td>1452.477060</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-74.003664</td>\n",
       "      <td>40.743174</td>\n",
       "      <td>299</td>\n",
       "      <td>-73.983799</td>\n",
       "      <td>40.726218</td>\n",
       "      <td>118</td>\n",
       "      <td>2522.253218</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>\n",
       "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-62ac1ef6-2638-4dbe-ae2f-7297d6e02e4e')\"\n",
       "              title=\"Convert this dataframe to an interactive table.\"\n",
       "              style=\"display:none;\">\n",
       "        \n",
       "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
       "       width=\"24px\">\n",
       "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
       "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
       "  </svg>\n",
       "      </button>\n",
       "      \n",
       "  <style>\n",
       "    .colab-df-container {\n",
       "      display:flex;\n",
       "      flex-wrap:wrap;\n",
       "      gap: 12px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert {\n",
       "      background-color: #E8F0FE;\n",
       "      border: none;\n",
       "      border-radius: 50%;\n",
       "      cursor: pointer;\n",
       "      display: none;\n",
       "      fill: #1967D2;\n",
       "      height: 32px;\n",
       "      padding: 0 0 0 0;\n",
       "      width: 32px;\n",
       "    }\n",
       "\n",
       "    .colab-df-convert:hover {\n",
       "      background-color: #E2EBFA;\n",
       "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
       "      fill: #174EA6;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert {\n",
       "      background-color: #3B4455;\n",
       "      fill: #D2E3FC;\n",
       "    }\n",
       "\n",
       "    [theme=dark] .colab-df-convert:hover {\n",
       "      background-color: #434B5C;\n",
       "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
       "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
       "      fill: #FFFFFF;\n",
       "    }\n",
       "  </style>\n",
       "\n",
       "      <script>\n",
       "        const buttonEl =\n",
       "          document.querySelector('#df-62ac1ef6-2638-4dbe-ae2f-7297d6e02e4e button.colab-df-convert');\n",
       "        buttonEl.style.display =\n",
       "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
       "\n",
       "        async function convertToInteractive(key) {\n",
       "          const element = document.querySelector('#df-62ac1ef6-2638-4dbe-ae2f-7297d6e02e4e');\n",
       "          const dataTable =\n",
       "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
       "                                                     [key], {});\n",
       "          if (!dataTable) return;\n",
       "\n",
       "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
       "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
       "            + ' to learn more about interactive tables.';\n",
       "          element.innerHTML = '';\n",
       "          dataTable['output_type'] = 'display_data';\n",
       "          await google.colab.output.renderOutput(dataTable, element);\n",
       "          const docLink = document.createElement('div');\n",
       "          docLink.innerHTML = docLinkHtml;\n",
       "          element.appendChild(docLink);\n",
       "        }\n",
       "      </script>\n",
       "    </div>\n",
       "  </div>\n",
       "  "
      ],
      "text/plain": [
       "   end station longitude  end station latitude  end station id  \\\n",
       "0             -74.003664             40.743174             299   \n",
       "1             -74.003664             40.743174             299   \n",
       "2             -74.003664             40.743174             299   \n",
       "3             -74.003664             40.743174             299   \n",
       "4             -74.003664             40.743174             299   \n",
       "\n",
       "   start station longitude  start station latitude  start station id  \\\n",
       "0               -73.989151               40.742354                 0   \n",
       "1               -73.987586               40.735243                 1   \n",
       "2               -74.016777               40.705693                 2   \n",
       "3               -73.990741               40.734546                 3   \n",
       "4               -73.983799               40.726218               118   \n",
       "\n",
       "      distance   edge  \n",
       "0  1229.224446  False  \n",
       "1  1618.732623  False  \n",
       "2  4307.215223  False  \n",
       "3  1452.477060  False  \n",
       "4  2522.253218  False  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "distance_matrix[\"edge\"] = distance_matrix[\"distance\"] < 500\n",
    "distance_matrix.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "fcZLqJ4tf5ut"
   },
   "source": [
    "Now we are almost there! We just need a way to build the edge_index. For this, we need to consider the original ordering in the node feature matrix. Remember when we did the mapping of the indices before? - Because of that we already have the edge indices set."
   ]
  },
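  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a reminder of why the ordering matters, here is a sketch of the idea on toy data: raw station ids are mapped to contiguous row positions of the node feature matrix, and the edge index then stores those positions. The raw ids below are made up, and the dictionary-based mapping is an assumption for illustration that may differ from the mapping done earlier in this notebook.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "edges = pd.DataFrame({\n",
    "    'start station id': [72, 79, 72, 116],\n",
    "    'end station id':   [79, 116, 116, 72],\n",
    "})\n",
    "\n",
    "# np.unique returns the sorted distinct ids; enumerate gives positions 0..num_nodes-1\n",
    "raw_ids = np.unique(edges[['start station id', 'end station id']].values)\n",
    "id_to_idx = {raw: idx for idx, raw in enumerate(raw_ids)}\n",
    "\n",
    "# Row 0 holds the source positions, row 1 the target positions\n",
    "edge_index = np.stack([\n",
    "    edges['start station id'].map(id_to_idx).values,\n",
    "    edges['end station id'].map(id_to_idx).values,\n",
    "])\n",
    "print(edge_index.shape)  # [2 x num_edges]\n",
    "```\n",
    "\n",
    "Row `i` of the node feature matrix must then describe the station that was mapped to position `i`."
   ]
  },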
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "MurNb-gyW1bi",
    "outputId": "f6aab419-2c9d-4ef5-d9b5-25d01178759e"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  4,   6,  15, ..., 315, 317, 327],\n",
       "       [299, 299, 299, ..., 272, 272, 272]])"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use mask to extract static edges\n",
    "edge_index = distance_matrix[distance_matrix[\"edge\"] == True][[\"start station id\", \"end station id\"]].values\n",
    "edge_index = edge_index.transpose()\n",
    "edge_index # [2 x num_edges]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "j0ScfTg0psOP",
    "outputId": "253b8311-caec-456d-d021-7c59af4f589e"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[294.92884499,   0.        ,   0.        ],\n",
       "       [327.27388739,   0.        ,   0.        ],\n",
       "       [419.30404213,   0.        ,   0.        ],\n",
       "       ...,\n",
       "       [379.44132958,   0.        ,   0.        ],\n",
       "       [349.85678078,   0.        ,   0.        ],\n",
       "       [332.80260533,   0.        ,   0.        ]])"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Add edge features to indicate edge type\n",
    "distance_feature = distance_matrix[distance_matrix[\"edge\"] == True][\"distance\"].values\n",
    "edge_type_feature = np.zeros_like(distance_feature) # 0 = static edge\n",
    "trip_duration_feature = np.zeros_like(distance_feature) # 0 = no information\n",
    "static_edge_features = np.stack([distance_feature, edge_type_feature, trip_duration_feature]).transpose()\n",
    "static_edge_features # [num_edges x num_features]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Tg-XuPNu-0hw"
   },
   "source": [
    "`Step 6`: Extract the labels and build the dataset\n",
    "\n",
    "Now we get in touch with the temporal aspect of our dataset. Because the edge_index and the node features were static, we could pre-compute them for each snapshot. The labels however change over time, because we have different trip durations for different bikers for each 60 min interval. In addition to that we can also use historical labels as edge features. This is a special attribute of time-series datasets - the labels become features for past timesteps. If you have a node-level prediction task you can add the historical labels as node features to each of the nodes. But this also means that you cannot pre-compute the node feature matrix and need to do it in the following loop instead.\n",
    "\n",
    "> In most of the time-series libraries this is called a \"lag\" or \"offset\". In our dataset we want to predict 1 hour into the future, based on the current situation (trip durations). Therefore, we can use the current trip durations as edge features and the trip durations of the next snapshot as targets. Of course you could also define a larger offset, for example 12 hours into the future.\n",
    "\n",
    "As mentioned previously, we have to types of edges now - static edges based on the location and edges for historical trips (for which we need to insert new edges to use the edge features). It can happen that we have multiple bikers on the same \"edge\" (=route) - here we simply average all of the trip durations between two nodes.\n",
    "\n",
    "Generally, everything that is temporal needs to be computed in a loop over the time-series. Because of that, we loop over the start and end time of our dataset and store the labels of each subsequent snapshot in a list and the current trip durations as edge features in another list.\n",
    "We also stack the pre-computed node_features and edge_index, so that the first entry in each list corresponds to the first snapshot and so on. That's at least how it is typically done in [pytorch geometric temporal](https://pytorch-geometric-temporal.readthedocs.io/en/latest/index.html). Note that there is also a TemporalData Object in plain PyG available now. \n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "-yLsE_o9-0hw"
   },
   "outputs": [],
   "source": [
    "def extract_dynamic_edges(s):\n",
    "    # Extract dynamic edges and their features\n",
    "    trip_indices = s[[\"start station id\", \"end station id\"]].values\n",
    "    trip_durations = s[\"tripduration\"]\n",
    "\n",
    "    # Build edge features\n",
    "    distance_feature  = pd.DataFrame(trip_indices, \n",
    "                                    columns=[\"start station id\", \"end station id\"]).merge(\n",
    "                                        distance_matrix, on=[\"start station id\", \"end station id\"], \n",
    "                                        how=\"left\")[\"distance\"].values\n",
    "    edge_type_feature = np.ones_like(distance_feature) # 1 = dynamic\n",
    "    trip_duration_feature = trip_durations\n",
    "    edge_features = np.stack([distance_feature, edge_type_feature, trip_duration_feature]).transpose()\n",
    "    return edge_features, trip_indices.transpose()\n",
    "\n",
    "\n",
    "\n",
    "start_date = datetime.strptime(\"2013-06-01 00:00:01\", \"%Y-%m-%d %H:%M:%S\")\n",
    "end_date = datetime.strptime(\"2013-07-01 00:10:34\", \"%Y-%m-%d %H:%M:%S\")\n",
    "interval = timedelta(minutes=60)\n",
    "\n",
    "xs = []\n",
    "edge_indices = []\n",
    "ys = []\n",
    "y_indices = []\n",
    "edge_features = []\n",
    "\n",
    "\n",
    "while start_date <= end_date:\n",
    "    # 0 - 60 min \n",
    "    current_snapshot = trips[((start_date + interval) >= trips[\"stoptime\"])\n",
    "                                  & (start_date <= trips[\"stoptime\"])]\n",
    "    # 60 - 120 min\n",
    "    subsequent_snapshot = trips[((start_date + 2*interval) >= trips[\"stoptime\"])\n",
    "                                  & (start_date + interval <= trips[\"stoptime\"])]\n",
    "    # Average duplicate trips\n",
    "    current_snapshot = current_snapshot.groupby([\"start station id\", \"end station id\"]).mean().reset_index()\n",
    "    subsequent_snapshot = subsequent_snapshot.groupby([\"start station id\", \"end station id\"]).mean().reset_index()\n",
    "\n",
    "    # Extract dynamic trip edges\n",
    "    edge_feats, additional_edge_index = extract_dynamic_edges(current_snapshot)\n",
    "    exteneded_edge_index = np.concatenate([edge_index, additional_edge_index], axis=1)\n",
    "    extended_edge_feats = np.concatenate([edge_feats, static_edge_features], axis=0)\n",
    "\n",
    "    # Labels\n",
    "    y = subsequent_snapshot[\"tripduration\"].values\n",
    "    y_index = subsequent_snapshot[[\"start station id\", \"end station id\"]].values\n",
    "\n",
    "    # Append everything\n",
    "    xs.append(node_features) # static\n",
    "    edge_indices.append(exteneded_edge_index) # static + dynamic\n",
    "    edge_features.append(extended_edge_feats) # static + dynamic\n",
    "    ys.append(y) # dynamic\n",
    "    y_indices.append(y_index.transpose()) # dynamic\n",
    "\n",
    "    # Increment\n",
    "    start_date += interval"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "paERmkOXqH2n"
   },
   "source": [
    "Ok so what did we just do here? \n",
    "\n",
    "After each 60 min interval, we get a subset of our dataframe for this time range. Based on that, we extract all the available edge labels between two locations. Those are the ones we want to predict later in the model.\n",
    "In order to calculate the loss only based on the edges for which we have labels, we store some sort of mask (y_index) that tells us for which source/target pairs we have labels. \n",
    "\n",
    "> Important: Here we take the label of each snapshot as target value. \n",
    "\n",
    "We could of course also normalize the labels to ensure smoother training for the regression setup. \n",
    "\n",
    "\n",
    "**Now we have everything we need! 🎉** "
   ]
  },
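  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate how such a label mask can be used, here is a sketch with toy NumPy arrays. It assumes the model produces one prediction per (source, target) pair as a dense `num_nodes x num_nodes` matrix, which is only one possible design. The names `pred`, `y` and `y_index` mirror the lists built above, but all values are made up. The labels are min-max scaled first, as suggested above.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "num_nodes = 4\n",
    "# Hypothetical dense predictions for every (source, target) pair\n",
    "pred = np.full((num_nodes, num_nodes), 0.5)\n",
    "\n",
    "# Labels exist only for three pairs: (0 -> 1), (2 -> 3), (3 -> 0)\n",
    "y_index = np.array([[0, 2, 3],\n",
    "                    [1, 3, 0]])\n",
    "y = np.array([300.0, 900.0, 600.0])  # made-up trip durations\n",
    "\n",
    "# Min-max scale the labels to [0, 1]\n",
    "y_scaled = (y - y.min()) / (y.max() - y.min())\n",
    "\n",
    "# Select only the predictions that have a label, then compute the MSE\n",
    "masked_pred = pred[y_index[0], y_index[1]]\n",
    "mse = np.mean((masked_pred - y_scaled) ** 2)\n",
    "print(mse)\n",
    "```\n",
    "\n",
    "In a real training loop the same indexing would be applied to the model output tensor, so that pairs without a label do not contribute to the loss."
   ]
  },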
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "KOz6FequvHBG",
    "outputId": "e8b7767f-e70c-46b4-8bca-fa96af45e906"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Example of graph snapshot 2: \n",
      "\n",
      "      Node feature shape: (337, 2) \n",
      "\n",
      "      Edge index shape: (2, 2773) \n",
      "\n",
      "      Edge feature shape: (2773, 3) \n",
      " \n",
      "      Labels shape: (25,) \n",
      "\n",
      "      Labels mask shape: (2, 25)\n",
      "      \n"
     ]
    }
   ],
   "source": [
    "i = 2\n",
    "print(f\"\"\"Example of graph snapshot {i}: \\n\n",
    "      Node feature shape: {xs[i].shape} \\n\n",
    "      Edge index shape: {edge_indices[i].shape} \\n\n",
    "      Edge feature shape: {edge_features[i].shape} \\n \n",
    "      Labels shape: {ys[i].shape} \\n\n",
    "      Labels mask shape: {y_indices[i].shape}\n",
    "      \"\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3my7QsYp-0hy"
   },
   "source": [
    "Just like before, I won't install Pytorch Geometric Temporal here, as this will make the notebook too heavy, but here are some code snippets for the final steps.\n",
    "\n",
    "\n",
    "We need to pass the lists of numpy arrays to the Data Structures, as for example done [here](https://pytorch-geometric-temporal.readthedocs.io/en/latest/_modules/torch_geometric_temporal/dataset/wikimath.html#WikiMathsDatasetLoader).\n",
    "\n",
    "```\n",
    "from torch_geometric_temporal.signal import DynamicGraphTemporalSignal\n",
    "dataset = DynamicGraphTemporalSignal(\n",
    "            edge_indices, edge_features, xs, ys, y_indices=y_indices\n",
    "        )\n",
    "\n",
    "```\n",
    "\n",
    "For more details please have a look [at the documentation](https://pytorch-geometric-temporal.readthedocs.io/en/latest/modules/signal.html).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2LV9tCNdImzk"
   },
   "source": [
    "# 3. Other questions / Final Remarks\n",
    "\n",
    "-  `Note: I've seen that PyG has also added a helpful tutorial`\n",
    "[You can find it here.](https://https://pytorch-geometric.readthedocs.io/en/latest/notes/load_csv.html)\n",
    "\n",
    "\n",
    "- I would always put each of the calculations of node_features, labels, edge_indices ect. into separate functions\n",
    "\n",
    "- What if I have pairs of graphs (graph matching ect.) --> use Pytorch Geometrics `PairData` as described [here](https://pytorch-geometric.readthedocs.io/en/latest/notes/batching.html#pairs-of-graphs).\n",
    "\n",
    "- What if I want to represent images as graphs? --> Generally I think it's not a good idea to represent each pixle as a node, as this would require a lot of Message passing layers to learn the full image information. Instead, I'd suggest to divide the image into smaller patches (like 3x3 kernels) and represent those as nodes. The edges are then simply calculated based on the neighboring patches.\n",
    "\n",
    "- How do I use images as nodes? --> In this case I would either convert the 2D images to a 1D feature vector with a pretrained image model such as InceptionNet or would build a custom layer that applies a transformation to the images before doing message passing"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [
    "Mw8dzPy3-UnJ",
    "C4CFR0Ye_xNJ",
    "R7OBoSXFQQsJ",
    "nLCEck1Q_WRM"
   ],
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
