{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction\n",
    "\n",
    "In this tutorial, we demonstrate how to use TrajectoryNet [1] and PHATE [2] (Potential of Heat-diffusion for Affinity-based Transition Embedding) to analyze a 27-day time course of embryoid body (EB) differentiation comprising approximately 31,000 cells.\n",
    "\n",
    "We review the following steps:\n",
    "\n",
    "[1. Loading 10X data](#loading)  \n",
    "[2. Preprocessing: Filtering, Normalizing, and Transforming](#preprocessing)  \n",
    "[3. Embedding Data Using PHATE](#embedding)  \n",
    "[4. Evaluating Dynamics with TrajectoryNet](#trajectory)\n",
    "\n",
    "References:\n",
    "\n",
    "1. Tong, A., Huang, J., Wolf, G., van Dijk, D. & Krishnaswamy, S. TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics. in Proceedings of the 37th International Conference on Machine Learning (2020). [url](http://proceedings.mlr.press/v119/tong20a/tong20a.pdf)\n",
    "2. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nature Biotechnology 37, 1482–1492 (2019). [url](https://doi.org/10.1038/s41587-019-0336-3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Time course of human embryoid body differentiation\n",
    "\n",
    "Low passage H1 hESCs were maintained on Matrigel-coated dishes in DMEM/F12-N2B27 media supplemented with FGF2. For EB formation, cells were treated with Dispase, dissociated into small clumps and plated in non-adherent plates in media supplemented with 20% FBS, which was prescreened for EB differentiation. Samples were collected at 3-day intervals during a 27-day differentiation time course. An undifferentiated hESC sample was also included (Figure S7D). Induction of key germ layer markers in these EB cultures was validated by qPCR (data not shown). For single-cell analyses, EB cultures were dissociated, FACS sorted to remove doublets and dead cells, and processed on a 10x Genomics instrument to generate cDNA libraries, which were then sequenced. Small-scale sequencing determined that we had successfully collected data on approximately 31,000 cells equally distributed throughout the time course.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Install PHATE\n",
    "\n",
    "This tutorial assumes you have already cloned and installed TrajectoryNet. If you have not yet installed PHATE and `scprep`, you can install them from the notebook. You may need to restart the kernel/runtime after installation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install --user --upgrade --quiet phate scprep"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='loading'></a>\n",
    "## 1. Loading 10X data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Downloading Data from Mendeley Datasets\n",
    "\n",
    "The EB dataset is publicly available as `scRNAseq.zip` from Mendeley Datasets at <https://data.mendeley.com/datasets/v6n743h5ng/>.\n",
    "\n",
    "Inside the scRNAseq folder, there are five subdirectories, and in each subdirectory are three files: `barcodes.tsv`, `genes.tsv`, and `matrix.mtx`. For more information about how CellRanger produces these files, check out the [Gene-Barcode Matrices Documentation](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/matrices).\n",
    "\n",
    "Here's the directory structure:\n",
    "```\n",
    "download_path\n",
    "└── scRNAseq\n",
    "    ├── scRNAseq.zip\n",
    "    ├── T0_1A\n",
    "    │   ├── barcodes.tsv\n",
    "    │   ├── genes.tsv\n",
    "    │   └── matrix.mtx\n",
    "    ├── T2_3B\n",
    "    │   ├── barcodes.tsv\n",
    "    │   ├── genes.tsv\n",
    "    │   └── matrix.mtx\n",
    "    ├── T4_5C\n",
    "    │   ├── barcodes.tsv\n",
    "    │   ├── genes.tsv\n",
    "    │   └── matrix.mtx\n",
    "    ├── T6_7D\n",
    "    │   ├── barcodes.tsv\n",
    "    │   ├── genes.tsv\n",
    "    │   └── matrix.mtx\n",
    "    └── T8_9E\n",
    "        ├── barcodes.tsv\n",
    "        ├── genes.tsv\n",
    "        └── matrix.mtx\n",
    "```\n",
    "\n",
    "If you have downloaded the files already, set the `download_path` below to the directory where you saved the files. If not, uncomment the download code below to fetch the data for you. Note that the download is 746 MB, so make sure you have sufficient disk space."
   ]
  },
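  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before loading, it can help to confirm that the layout above is complete. The following is a minimal sketch using only the standard library; `missing_10x_files` is a hypothetical helper written for this illustration (it is not part of `scprep`), and the sample and file names are copied from the directory tree above.\n",
    "\n",
    "```python\n",
    "from pathlib import Path\n",
    "\n",
    "# Sample folders and per-sample files, as listed in the directory tree above\n",
    "SAMPLES = ['T0_1A', 'T2_3B', 'T4_5C', 'T6_7D', 'T8_9E']\n",
    "FILES = ['barcodes.tsv', 'genes.tsv', 'matrix.mtx']\n",
    "\n",
    "\n",
    "def missing_10x_files(download_path):\n",
    "    # Hypothetical helper: list expected files absent under <download_path>/scRNAseq\n",
    "    root = Path(download_path) / 'scRNAseq'\n",
    "    return [\n",
    "        str(Path(sample) / name)\n",
    "        for sample in SAMPLES\n",
    "        for name in FILES\n",
    "        if not (root / sample / name).is_file()\n",
    "    ]\n",
    "```\n",
    "\n",
    "An empty list means all fifteen files are in place; any entries name the files that still need to be downloaded or extracted."
   ]
  },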
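  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/home/azweig/projects/zebrafish/data\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import scprep\n",
    "\n",
    "# Directory that holds (or will receive) the scRNAseq folder.\n",
    "# This path is specific to the original author's machine; change it to a location on yours.\n",
    "download_path = os.path.expanduser(\"/home/azweig/projects/zebrafish/data\")\n",
    "print(download_path)"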
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# if not os.path.isdir(os.path.join(download_path, \"scRNAseq\", \"T0_1A\")):\n",
    "#     # need to download the data\n",
    "#     scprep.io.download.download_and_extract_zip(\n",
    "#         \"https://md-datasets-public-files-prod.s3.eu-west-1.amazonaws.com/\"\n",
    "#         \"5739738f-d4dd-49f7-b8d1-5841abdbeb1e\",\n",
    "#         download_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using `scprep` to import data into Pandas DataFrames\n",
    "\n",
    "\n",
    "We use a toolkit for loading and manipulating single-cell data called `scprep`. The function `load_10X` will automatically load 10X scRNAseq datasets (and others) into a Pandas DataFrame. DataFrames are incredibly useful tools for data analysis in Python. To learn more about them, [check out the Pandas Documentation and Tutorials](https://pandas.pydata.org/pandas-docs/stable/).\n",
    "\n",
    "\n",
    "Let's load the data and create a single matrix that we can use for preprocessing, visualization, and analysis."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1. Standard imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import phate\n",
    "import scprep\n",
    "# import magic\n",
    "import matplotlib.pyplot as plt\n",
    "import sklearn.preprocessing\n",
    "\n",
    "# matplotlib settings for Jupyter notebooks only\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2. Use `scprep.io.load_10X` to import all three matrices into a DataFrame for each sample (this may take a few minutes)\n",
    "\n",
    "Note: By default, `scprep.io.load_10X` loads scRNA-seq data into a sparse Pandas DataFrame [(**see Pandas docs**)](https://pandas.pydata.org/pandas-docs/stable/sparse.html) to maximize memory efficiency. However, operations on sparse data can be slower than on a dense matrix. To load a dense matrix, pass the `sparse=False` argument to `load_10X`. We use `gene_labels='both'` so we can see the gene symbols while still retaining the uniqueness offered by gene IDs."
   ]
  },
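  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the memory trade-off concrete, the sketch below compares a dense and a sparse DataFrame built from a toy count matrix. The matrix is synthetic, and its size and sparsity are assumptions chosen for illustration rather than properties of the EB data; the code uses plain pandas, not `scprep`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "\n",
    "# Toy count matrix: 1000 cells x 200 genes, mostly zeros (assumed sparsity)\n",
    "toy_counts = rng.poisson(0.05, size=(1000, 200)).astype(float)\n",
    "dense_df = pd.DataFrame(toy_counts)\n",
    "\n",
    "# Store only the non-zero entries by converting every column to a sparse dtype\n",
    "sparse_df = dense_df.astype(pd.SparseDtype('float', 0.0))\n",
    "\n",
    "dense_bytes = dense_df.memory_usage(deep=True).sum()\n",
    "sparse_bytes = sparse_df.memory_usage(deep=True).sum()\n",
    "print(f'dense: {dense_bytes} bytes, sparse: {sparse_bytes} bytes')\n",
    "print(f'fraction of stored values: {sparse_df.sparse.density:.3f}')\n",
    "```\n",
    "\n",
    "The sparse frame stores only the non-zero entries plus their positions, which is why it is smaller here; the saving shrinks as the matrix becomes denser."
   ]
  },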
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>RP11-34P13.3 (ENSG00000243485)</th>\n",
       "      <th>FAM138A (ENSG00000237613)</th>\n",
       "      <th>OR4F5 (ENSG00000186092)</th>\n",
       "      <th>RP11-34P13.7 (ENSG00000238009)</th>\n",
       "      <th>RP11-34P13.8 (ENSG00000239945)</th>\n",
       "      <th>RP11-34P13.14 (ENSG00000239906)</th>\n",
       "      <th>RP11-34P13.9 (ENSG00000241599)</th>\n",
       "      <th>FO538757.3 (ENSG00000279928)</th>\n",
       "      <th>FO538757.2 (ENSG00000279457)</th>\n",
       "      <th>AP006222.2 (ENSG00000228463)</th>\n",
       "      <th>...</th>\n",
       "      <th>AC007325.2 (ENSG00000277196)</th>\n",
       "      <th>BX072566.1 (ENSG00000277630)</th>\n",
       "      <th>AL354822.1 (ENSG00000278384)</th>\n",
       "      <th>AC023491.2 (ENSG00000278633)</th>\n",
       "      <th>AC004556.1 (ENSG00000276345)</th>\n",
       "      <th>AC233755.2 (ENSG00000277856)</th>\n",
       "      <th>AC233755.1 (ENSG00000275063)</th>\n",
       "      <th>AC240274.1 (ENSG00000271254)</th>\n",
       "      <th>AC213203.1 (ENSG00000277475)</th>\n",
       "      <th>FAM231B (ENSG00000268674)</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>AAACATACCAGAGG-1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACATTGAAAGCA-1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACATTGAAGTGA-1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACATTGGAGGTG-1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACATTGGTTTCT-1</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 33694 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                  RP11-34P13.3 (ENSG00000243485)  FAM138A (ENSG00000237613)  \\\n",
       "0                                                                             \n",
       "AAACATACCAGAGG-1                             0.0                        0.0   \n",
       "AAACATTGAAAGCA-1                             0.0                        0.0   \n",
       "AAACATTGAAGTGA-1                             0.0                        0.0   \n",
       "AAACATTGGAGGTG-1                             0.0                        0.0   \n",
       "AAACATTGGTTTCT-1                             0.0                        0.0   \n",
       "\n",
       "                  OR4F5 (ENSG00000186092)  RP11-34P13.7 (ENSG00000238009)  \\\n",
       "0                                                                           \n",
       "AAACATACCAGAGG-1                      0.0                             0.0   \n",
       "AAACATTGAAAGCA-1                      0.0                             0.0   \n",
       "AAACATTGAAGTGA-1                      0.0                             0.0   \n",
       "AAACATTGGAGGTG-1                      0.0                             0.0   \n",
       "AAACATTGGTTTCT-1                      0.0                             0.0   \n",
       "\n",
       "                  RP11-34P13.8 (ENSG00000239945)  \\\n",
       "0                                                  \n",
       "AAACATACCAGAGG-1                             0.0   \n",
       "AAACATTGAAAGCA-1                             0.0   \n",
       "AAACATTGAAGTGA-1                             0.0   \n",
       "AAACATTGGAGGTG-1                             0.0   \n",
       "AAACATTGGTTTCT-1                             0.0   \n",
       "\n",
       "                  RP11-34P13.14 (ENSG00000239906)  \\\n",
       "0                                                   \n",
       "AAACATACCAGAGG-1                              0.0   \n",
       "AAACATTGAAAGCA-1                              0.0   \n",
       "AAACATTGAAGTGA-1                              0.0   \n",
       "AAACATTGGAGGTG-1                              0.0   \n",
       "AAACATTGGTTTCT-1                              0.0   \n",
       "\n",
       "                  RP11-34P13.9 (ENSG00000241599)  \\\n",
       "0                                                  \n",
       "AAACATACCAGAGG-1                             0.0   \n",
       "AAACATTGAAAGCA-1                             0.0   \n",
       "AAACATTGAAGTGA-1                             0.0   \n",
       "AAACATTGGAGGTG-1                             0.0   \n",
       "AAACATTGGTTTCT-1                             0.0   \n",
       "\n",
       "                  FO538757.3 (ENSG00000279928)  FO538757.2 (ENSG00000279457)  \\\n",
       "0                                                                              \n",
       "AAACATACCAGAGG-1                           0.0                           1.0   \n",
       "AAACATTGAAAGCA-1                           0.0                           0.0   \n",
       "AAACATTGAAGTGA-1                           0.0                           0.0   \n",
       "AAACATTGGAGGTG-1                           0.0                           0.0   \n",
       "AAACATTGGTTTCT-1                           0.0                           0.0   \n",
       "\n",
       "                  AP006222.2 (ENSG00000228463)  ...  \\\n",
       "0                                               ...   \n",
       "AAACATACCAGAGG-1                           0.0  ...   \n",
       "AAACATTGAAAGCA-1                           0.0  ...   \n",
       "AAACATTGAAGTGA-1                           0.0  ...   \n",
       "AAACATTGGAGGTG-1                           0.0  ...   \n",
       "AAACATTGGTTTCT-1                           0.0  ...   \n",
       "\n",
       "                  AC007325.2 (ENSG00000277196)  BX072566.1 (ENSG00000277630)  \\\n",
       "0                                                                              \n",
       "AAACATACCAGAGG-1                           0.0                           0.0   \n",
       "AAACATTGAAAGCA-1                           0.0                           0.0   \n",
       "AAACATTGAAGTGA-1                           0.0                           0.0   \n",
       "AAACATTGGAGGTG-1                           0.0                           0.0   \n",
       "AAACATTGGTTTCT-1                           0.0                           0.0   \n",
       "\n",
       "                  AL354822.1 (ENSG00000278384)  AC023491.2 (ENSG00000278633)  \\\n",
       "0                                                                              \n",
       "AAACATACCAGAGG-1                           0.0                           0.0   \n",
       "AAACATTGAAAGCA-1                           0.0                           0.0   \n",
       "AAACATTGAAGTGA-1                           0.0                           0.0   \n",
       "AAACATTGGAGGTG-1                           0.0                           0.0   \n",
       "AAACATTGGTTTCT-1                           0.0                           0.0   \n",
       "\n",
       "                  AC004556.1 (ENSG00000276345)  AC233755.2 (ENSG00000277856)  \\\n",
       "0                                                                              \n",
       "AAACATACCAGAGG-1                           0.0                           0.0   \n",
       "AAACATTGAAAGCA-1                           0.0                           0.0   \n",
       "AAACATTGAAGTGA-1                           0.0                           0.0   \n",
       "AAACATTGGAGGTG-1                           0.0                           0.0   \n",
       "AAACATTGGTTTCT-1                           0.0                           0.0   \n",
       "\n",
       "                  AC233755.1 (ENSG00000275063)  AC240274.1 (ENSG00000271254)  \\\n",
       "0                                                                              \n",
       "AAACATACCAGAGG-1                           0.0                           0.0   \n",
       "AAACATTGAAAGCA-1                           0.0                           0.0   \n",
       "AAACATTGAAGTGA-1                           0.0                           0.0   \n",
       "AAACATTGGAGGTG-1                           0.0                           0.0   \n",
       "AAACATTGGTTTCT-1                           0.0                           0.0   \n",
       "\n",
       "                  AC213203.1 (ENSG00000277475)  FAM231B (ENSG00000268674)  \n",
       "0                                                                          \n",
       "AAACATACCAGAGG-1                           0.0                        0.0  \n",
       "AAACATTGAAAGCA-1                           0.0                        0.0  \n",
       "AAACATTGAAGTGA-1                           0.0                        0.0  \n",
       "AAACATTGGAGGTG-1                           0.0                        0.0  \n",
       "AAACATTGGTTTCT-1                           0.0                        0.0  \n",
       "\n",
       "[5 rows x 33694 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sparse=True\n",
    "T1 = scprep.io.load_10X(os.path.join(download_path, \"scRNAseq\", \"T0_1A\"), sparse=sparse, gene_labels='both')\n",
    "T2 = scprep.io.load_10X(os.path.join(download_path, \"scRNAseq\", \"T2_3B\"), sparse=sparse, gene_labels='both')\n",
    "T3 = scprep.io.load_10X(os.path.join(download_path, \"scRNAseq\", \"T4_5C\"), sparse=sparse, gene_labels='both')\n",
    "T4 = scprep.io.load_10X(os.path.join(download_path, \"scRNAseq\", \"T6_7D\"), sparse=sparse, gene_labels='both')\n",
    "T5 = scprep.io.load_10X(os.path.join(download_path, \"scRNAseq\", \"T8_9E\"), sparse=sparse, gene_labels='both')\n",
    "T1.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3. Library size filtering\n",
    "\n",
    "We filter out cells that have either very large or very small library sizes. For this data set, library size correlates somewhat with sample and so we filter on a per-sample basis. In this case, we eliminate the top and bottom 20% of cells for each sample. Similar results are obtained with simpler, less conservative filtering."
   ]
  },
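  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the percentile cutoffs concrete, here is a minimal sketch in plain pandas on a synthetic matrix. `filter_by_library_size` is a hypothetical helper, not an `scprep` function, and the toy data and the strict inequalities are assumptions made for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "\n",
    "def filter_by_library_size(counts, lower_pct=20, upper_pct=80):\n",
    "    # Library size of a cell = total counts across all genes (row sum)\n",
    "    libsize = counts.sum(axis=1)\n",
    "    low, high = np.percentile(libsize, [lower_pct, upper_pct])\n",
    "    # Keep cells strictly between the two percentile cutoffs\n",
    "    return counts.loc[(libsize > low) & (libsize < high)]\n",
    "\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "# Toy data: 100 cells x 50 genes of Poisson counts\n",
    "toy = pd.DataFrame(\n",
    "    rng.poisson(1.0, size=(100, 50)),\n",
    "    index=[f'cell{i}' for i in range(100)],\n",
    ")\n",
    "kept = filter_by_library_size(toy)\n",
    "print(toy.shape, '->', kept.shape)\n",
    "```\n",
    "\n",
    "Applying such a function to each sample's DataFrame separately, rather than to the pooled data, corresponds to the per-sample filtering described above."
   ]
  },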
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Axes: xlabel='Library size', ylabel='Number of cells'>"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAvdElEQVR4nO3de1hVZaLH8d8CFPDCNroBSjGJmVSmFmLlLa3JMrNSO6VmnJrR8dmVOmOadUiw5miDY1puS8fGW5lNRFNoNyvRKFN8RqcSTbuA3BwzhT2YoOI+f/i4zyC3zWLD3iy+n+fxeeRd71784DnHfvOu9a5luFwulwAAANDiBfg6AAAAALyDYgcAAGARFDsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsAiKHQAAgEUE+TpAS3D69GkVFRWpY8eOMgzD13EAAEAr4nK59O9//1tRUVEKCKh7TY5i54GioiJFR0f7OgYAAGjF8vPz1aVLlzrnUOw80LFjR0lnfqFhYWE+TgOgimPHpKioM38vKpLat/dtHgDwMqfTqejoaHcfqQvFzgNnL7+GhYVR7AB/Exj4/38PC6PYAbAsT24HY/MEAACARVDsAAAALIJiBwAAYBEUOwAAAIug2AEAAFgExQ4AAMAiKHYAAAAWQbEDAACwCIodAACARVDsAAAALIJiBwAAYBEUOwAAAIug2AEAAFgExQ4AAMAiKHYAAAAWEeTrAACAliPmiQ3VxnLnDfdBEgA1YcUOAADAIih2AAAAFkGxAwAAsAjusQMAP2b2nrZzP8d9cEDrQLEDAEhiYwRgBRQ7AGgFKG1A68A9dgAAABbBih0AtHA1rcYBaJ0odgDQSlEIAeuh2AEAakX5A1oWih0A+AgbGgB4G8UOAFoYVtEA1IZiBwB+pCWWNh6GDPgPih0AwKs8KaeUP6Bp8Bw7AAAAi6DYAQAAWATFDgAAwCK4xw4AvIANBAD8ASt2AAAAFtEqVuwqKio0efJkffzxxyopKVFcXJyef/55XX/99b6OBqAFMvtIkpb4KJOmwsOZgabRKlbsTp06pZiYGGVlZamkpERTp07ViBEjVFZW5utoAAAAXmO4XC6Xr0P4QlRUlDIyMnTttdfWO9fpdMpms6m0tFRhYWHNkA6Ax44dkzp0OPP3sjKpffsm/5asvDUNT1bsuJcRrVFDeohfrtiVlZVp9uzZGjZsmMLDw2UYhlauXFnj3IqKCs2cOVNRUVEKDQ1VQkKCNm7cWOf59+/fryNHjig2NrYJ0gMAAPiGXxa7w4cPa86cOdqzZ4+uueaaOucmJiZqwYIFGjdunBYtWqTAwEDdfvvtysrKqnH+8ePHNX78eM2aNUs2m60p4gMAAPiEX26eiIyMVHFxsSIiIrRjxw7Fx8fXOG/79u1at26dUlNTNX36dEnShAkTdNVVV2nGjBn64osvqsw/efKkxowZo9jYWD399NNN/nMAAAA0J79csQsODlZERES989LS0hQYGKiJEye6x0JCQvTwww9r69atys/Pd4+fPn1aDzzwgAzD0KpVq2QYRpNkBwAA8BW/XLHz1M6dO3X55ZdXu5Gwb9++kqRdu3YpOjpakjRp0iQVFxfrww8/VFBQi/6xATQzNksAaCladMMpLi5WZGRktfGzY0VFRZKkvLw8LV++XCEhIbrgggvc895//30NGDCg2ucrKipUUVHh/trpdHo7OgCgHhRqoOFadLE7fvy4goODq42HhIS4j0vSpZdeqoY81WXu3LlKSUnxTkgAAIBm4pf32HkqNDS0ysraWeXl5e7jZsyaNUulpaXuP/95rx4AAIC/atErdpGRkSosLKw2XlxcLOnMQ4jNCA4OrnElEAAAwJ+16BW7Xr16ad++fdXugdu2bZv7OAAAQGvRoovd6NGjVVlZqWXLlrnHKioqtGLFCiUkJLh3xAIAALQGfnspdvHixSopKXHvbM3IyFBBQYEk6dFH
H5XNZlNCQoLGjBmjWbNm6dChQ4qNjdWqVauUm5urV155xZfxAQAAmp3fFrv58+crLy/P/XV6errS09MlSePHj3e/Dmz16tVKSkrSmjVrdPToUfXs2VPr16/XwIEDfZIbQMvCS+UBWInfFrvc3FyP5oWEhCg1NVWpqalez+BwOORwOFRZWen1cwMAAHhbi77HrqnZ7Xbl5OQoOzvb11EAAADq5bcrdgCA1oU3TQCNx4odAACARVDsAAAALIJiBwAAYBEUOwAAAItg8wQAy+IZdQBaG1bsAAAALIJiVweHw6G4uDjFx8f7OgoAAEC9uBRbB7vdLrvdLqfT6X6FGQBr41lqAFoyih0Ay+iR9IGOtw3xdQwA8BkuxQIAAFgEK3YAWg0uswKwOlbsAAAALIJiBwAAYBEUOwAAAIug2AEAAFgEmyfq4HA45HA4VFlZ6esoAIAa1LQhhlfHoTVjxa4OdrtdOTk5ys7O9nUUAACAelHsAAAALIJiBwAAYBEUOwAAAIug2AEAAFgExQ4AAMAieNwJgBbp7GMuQk+Ua4+PswCAv6DYAQAs5dxn2/FcO7QmXIoFAACwCFbsAACtHm+wgFVQ7OrAK8UAwJpqKnKAFXAptg68UgwAALQkFDsAAACLoNgBAABYBMUOAADAItg8AcDvcaM7AHiGFTsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsAiKHQAAgEXwuBMAPsWjTADAeyh2dXA4HHI4HKqsrPR1FMDv1FTIcucN90ESAMBZXIqtg91uV05OjrKzs30dBQAAoF6s2AEALI3L/WhNKHYAmgyXawGgeXEpFgAAwCIodgAAABZBsQMAALAIih0AAIBFUOwAAAAsgmIHAABgERQ7AAAAi6DYAQAAWATFDgAAwCJ48wSAZsXrndBSnPt/q7w1BS0BK3YAAAAWwYodAI+w0gYA/o8VOwAAAIug2NXB4XAoLi5O8fHxvo4CAABQLy7F1sFut8tut8vpdMpms/k6DlAvTy6XcgM4AFgXK3YAAAAWQbEDAACwCIodAACARXCPHQCv4ZEoAOBbrNgBAABYBMUOAADAIkwVu6NHjyonJ0cVFRVVxlesWKGRI0dq7Nix2r59u1cCAgAAwDOm7rF78skn9eqrr+rQoUPusRdffFFTp06Vy+WSJP3973/Xjh07FBcX552kQCvDC8gBAA1lasXu888/19ChQxUaGuoemz9/vjp37qwtW7bob3/7myRpwYIF3kkJAACAeplasSssLNTQoUPdX+fk5Cg/P1/PPfec+vfvL0l68803tWXLFu+kBAAAQL1MrdgdP35cISEh7q8///xzGYahm2++2T3WtWtXFRYWNj4hAAAAPGKq2HXu3Fl79+51f/3hhx8qLCxM11xzjXvs6NGjVS7VAgAAoGmZuhR70003adWqVVq8eLFCQkL07rvvatSoUQoI+P+e+P333ys6OtprQQEAAFA3Uyt2s2bNUocOHTRlyhRNnDhRISEhSk5Odh93Op3KysrSDTfc4K2cAAAAqIepFbtf/epX2r17t9LS0iRJd955py655BL38e+++06TJk3S2LFjvZMSQJPi0SoAYA2m3xUbERGhRx55pMZjffr0UZ8+fUyHAgAAQMPxSjEAAACL8GjFbs6cOaZObhiGkpKSTH0WAAAADeNRsfvPjREN0dKLncPhkMPhUGVlpa+jwOLOvcfN1/wtDwDAMx4Vu02bNjV1Dr9kt9tlt9vldDpls9l8HQcAAKBOHhW7QYMGNXUOAAAANJLpXbEAAKCqmm5j4PFBaE7sigUAALAIj1bsAgICZBhGg09uGIZOnTrV4M8BTc3X/6uazQkAgKbgUbEbOHCgqWIHAACA5uNRscvMzGziGAAAAGgsNk8ArQyXgQHAuhq9eeLYsWPauXOnPvvsM2/kAQAAgEmmV+wKCgo0ZcoUZWRkqLKysspGiaysLE2cOFFLlizR4MGDvZUVaNV8veEDaO1Y7UZL
YGrFrri4WAkJCXrnnXd0xx136Prrr5fL5XIfT0hI0KFDh/TGG294LSgAAADqZqrYpaSk6NChQ9q4caPS09N1yy23VDnepk0bDRgwQJ9//rlXQgIAAKB+pi7Fvvfee7rzzjt100031Trnkksu4b47tErnXq7hcikAoLmYWrH717/+pW7dutU5p02bNjp27JipUAAAAGg4U8UuPDxc+fn5dc7Zt2+fIiIiTIUCAABAw5m6FHvjjTfq3Xff1cGDB2ssb/v379cHH3yg8ePHNzog4M882SXXlDvp2KUHAPhPplbsHn/8cZWXl2vQoEF6//339csvv0g680y7999/XyNGjFBAQID+8Ic/eDUsAAAAamdqxS4hIUFLly7V5MmTdccdd7jHw8LCzpw0KEh//etfdeWVV3onJQAAAOpl+gHFDz30kAYMGKAlS5boyy+/1M8//yybzaZ+/frpkUceUffu3b2ZEwAAAPVo1Ltiu3Xrpueff95bWQAAANAIjX5XLAAAAPyDqWL35ptvasiQISoqKqrxeGFhoYYOHar09PRGhQMAAIDnTBW75cuXq6SkRFFRUTUe79y5s0pLS7V8+fJGhQMAAIDnTBW7r7/+Wtddd12dc+Lj4/XVV1+ZCgUAAICGM1Xsjhw5oosuuqjOOeeff74OHz5sKhQAAAAaztSu2AsuuED79++vc87+/fvVqVMnM6cH/EJNb3XInTfcB0kAAPCMqRW7s68U27t3b43H9+zZo3feeUcDBgxoVDgAAAB4zlSxmz59uk6dOqX+/fvrhRde0L59+3Ts2DHt27dPixYt0oABA1RZWanp06d7Oy8AAABqYarYxcfHa8mSJXI6nZo2bZp69OihsLAw9ejRQ7///e/ldDr10ksvKSEhwdt5m5XD4VBcXJzi4+N9HQUAAKBept888dvf/lb9+/fXkiVLtG3bNpWUlKhTp07q16+fJk+erB49engzp0/Y7XbZ7XY5nU7ZbDZfxwEAAKhTo14p1qNHD7344oveygIAAIBG4JViAAAAFtGoFTvASmp6vAkAAC0JK3YAAAAWQbEDAACwCIodAACARXCPHdAA3IcHoKHO/XeDVxOiKXm0YhceHq4//elP7q/nzJmjLVu2NFkoAAAANJxHxa6kpETl5eXur5OTk5WZmdlUmQAAAGCCR8Xu4osvVkFBQVNnAQAAQCN4dI9dv379tGbNGgUGBioyMlKSPFqxMwxDSUlJjQoIAAAAz3hU7FJTU7Vv3z4tXbrUPZaZmVlvuaPYAQAANB+Pil1sbKy+/vpr/fjjjyosLNTgwYOVmJioBx98sKnzAQAAwEMeP+4kICBAXbt2VdeuXSVJMTExGjRoUJMFAwAAQMOYeo7d6dOnvZ0DAAAAjdToBxQXFBRo586dKikpkc1mU58+fdSlSxdvZAMAAEADmC52eXl5mjRpkjZu3Fjt2C233KKXX35ZMTExjckGAACABjBV7A4ePKj+/fursLBQMTExGjhwoCIjI1VcXKzPPvtMH330kfr3768dO3YoIiLC25kBAABQA1PF7plnnlFhYaGee+45/f73v1dgYKD7WGVlpZ5//nnNmDFDzz77rBYvXuy1sAAAAKidR2+eONeGDRv061//Wo8//niVUidJgYGBmj59un79619r/fr1XgkJAACA+pkqdgcPHtS1115b55xrr71WBw8eNBUKAAAADWeq2NlsNuXl5dU558CBA7LZbKZCAQAAoOFMFbv+/fsrLS1NX3zxRY3Ht23bpjfffFP9+/dvVDgAAAB4ztTmiaeeekobNmzQoEGDdN999+mmm25SZGSkDh48qMzMTL3++usKCAjQk08+6e28AAAAqIWpYtenTx+lpaXpwQcf1Guvvaa1a9e6j7lcLoWHh+uvf/1rvffhAQDQ2sQ8saHaWO684T5IAisy/YDiO+64QwcOHNA777yjf/zjHyotLZXNZlPv3r111113qX379t7MCQAAgHo06pVi7du319ix
YzV27Fhv5QEAAIBJpjZPAAAAwP9Q7AAAACyCYgcAAGARFDsAAACLoNgBAABYBMUOAADAIkwVuyFDhigpKcnbWQAAANAIpordl19+qcrKSm9nAQAAQCOYekBxt27dlJ+f7+0sQJOp6RU+AABYjakVu9/85jfasGGDDhw44O08AAAAMMnUit2IESO0ceNG3XjjjZo5c6bi4+MVEREhwzCqzb3kkksaHRIAAAD1M1XsLrvsMhmGIZfLpSlTptQ6zzAMnTp1ynQ4AAAAeM5UsZswYUKNq3P+7KWXXtJf/vIXff3113rqqaeUnJzs60gAAABeZarYrVy50ssxml5kZKSSk5O1du1aX0cBAABoEqaKXUt01113SZLee+893wYBAABoIo1+88TevXv19ttva82aNd7II0kqKyvT7NmzNWzYMIWHh8swjFpXCSsqKjRz5kxFRUUpNDRUCQkJ2rhxo9eyAAAAtBSmi92uXbt03XXX6corr9To0aOVmJjoPrZ582a1a9dOGRkZps59+PBhzZkzR3v27NE111xT59zExEQtWLBA48aN06JFixQYGKjbb79dWVlZpr43AABAS2Wq2O3bt0+DBw/Wt99+qylTpui2226rcnzgwIEKDw9XWlqaqVCRkZEqLi5WXl6eUlNTa523fft2rVu3TnPnzlVqaqomTpyoTz/9VJdeeqlmzJhh6nsDAAC0VKaKXUpKik6cOKFt27ZpwYIFio+Pr3LcMAxdf/31ys7ONhUqODhYERER9c5LS0tTYGCgJk6c6B4LCQnRww8/rK1bt/J2DAAA0KqYKnaffPKJ7rnnHsXFxdU6Jzo6WkVFRaaDeWLnzp26/PLLFRYWVmW8b9++ks5cLj7r1KlTKi8vV2VlZZW/AwAAWIWpYnf06FF16dKlzjkul0snTpwwFcpTxcXFioyMrDZ+duw/i+Wzzz6r0NBQLV++XH/84x8VGhpa64aPiooKOZ3OKn8AAAD8nanHnVx88cX67rvv6pyze/duRUdHmwrlqePHjys4OLjaeEhIiPv4WcnJyR4/lHju3LlKSUnxSkYAAOoT88SGKl/nzhvuoyRo6Uyt2A0ZMkQZGRn69ttvazyenZ2tTz75RLfeemujwtUnNDRUFRUV1cbLy8vdx82YNWuWSktL3X+4Vw8AALQEpordrFmzFBQUpIEDB+qll15yX/LcvXu3XnrpJY0YMUIdO3bU9OnTvRr2XGd3z57r7FhUVJSp8wYHByssLKzKHwAAAH9n6lJs9+7d9dZbb+n+++/XI488IunMPXU9e/aUy+VSp06dlJ6erksuucSrYc/Vq1cvbdq0SU6ns0r52rZtm/s4AABAa2H6AcXDhg3Tjz/+qAULFujee+/VzTffrHvuuUepqan67rvvNGTIEG/mrNHo0aNVWVmpZcuWuccqKiq0YsUKJSQkNPk9fgAAAP6kUe+K7dSpk6ZMmaIpU6Z4K4/b4sWLVVJS4r7Mm5GRoYKCAknSo48+KpvNpoSEBI0ZM0azZs3SoUOHFBsbq1WrVik3N1evvPKK1zMBAAD4s0YVu6Y0f/585eXlub9OT09Xenq6JGn8+PGy2WySpNWrVyspKUlr1qzR0aNH1bNnT61fv14DBw70SW4AAABfMX0pVpJee+01DR06VOHh4QoKClJ4eLiGDh2q1157rdHBcnNz5XK5avwTExPjnhcSEqLU1FQVFxervLxc27dvb/LduAAAAP7I1IrdyZMnNXr0aK1fv14ul0uBgYG68MILdfjwYW3atEmZmZn629/+prS0NLVp08bbmZuNw+GQw+HgDRUtxLnPgQIAK6vp3zyefwdTK3Zz585VRkaGEhIStGnTJpWXl7tXzD799FP17dtX69ev13PPPeftvM3KbrcrJyfH9DtvAQAAmpOpYrd69WrFxsYqMzNTgwYNUmBgoCQpMDBQgwcPVmZmpi677DKtXLnSm1kBAABQB1PFrqCgQCNHjlTbtm1rPB4cHKyRI0eq
sLCwUeEAAADgOVPFLioqSidPnqxzzsmTJ02/+QEAAAANZ6rYjR07VmlpaXI6nTUeLykpUVpamsaNG9eocAAAAPCcqWL39NNP67rrrlPfvn21du1aFRQU6OTJkyooKNBrr72mfv36qW/fvkpKSvJ2XgAAANTCo8edBAQEyDCMauMul0sPPPBAjeP79+9XaGioTp061fiUAAAAqJdHxW7gwIE1Fjur4zl2AABf8NYz6njWXevjUbHLzMxs4hj+yW63y263y+l0ul9hBgAA4K8a9UoxAAAA+A+KHQAAgEWYelfsWRkZGdq1a5d7V+y5DMPQK6+80phvAQAAAA+ZKnZ5eXkaMWKEdu/eLZfLVes8ih0AAEDzMVXsHnvsMX3zzTd66KGHNGHCBHXu3FlBQY1a/AMAAEAjmWpjn376qW699VYtX77c23kAAABgkqnNE23atNHVV1/t7SwAAABoBFPF7sYbb9Q333zj7SwAAABoBFPFbs6cOdqyZYvWrVvn7Tx+xeFwKC4uTvHx8b6OAgAAUC9T99j17t1bn3zyiYYPH66lS5eqT58+Nb6ZwTAMJSUlNTqkr/DmCQAA0JKYKnalpaV68skndeTIEW3evFmbN2+ucV5LL3YAAAAtialiN23aNG3atEk333yzHnjgAUVFRfG4EwAAAB8z1cbWr1+vG264QR999JG38wAAAMAkU5snjh8/rhtuuMHbWQAAANAIpopd79699cMPP3g7CwAAABrBVLFLSkpSRkaGsrKyvJ0HAAAAJpm6x664uFh33HGHhgwZorFjx+raa6+t9XEgEyZMaFRAAAAAeMZUsUtMTJRhGHK5XFq9erVWr14twzCqzHG5XDIMg2IHAADQTEwVuxUrVng7BwAAABrJVLF78MEHvZ3DLzkcDjkcDlVWVvo6CgAAQL1MbZ5oLex2u3JycpSdne3rKAAAAPWi2AEAAFiEqUuxl112mUfzDMPQ999/b+ZbAAAAoIFMFbvTp09X2wUrSSUlJSotLZUkRUVFqU2bNo1LBwAAAI+ZKna5ubm1Hvvuu+/02GOP6dixY/rwww/N5gIAAEADef0eu9jYWKWnp6uwsFApKSnePj0AAABq0SSbJ0JCQnTLLbfo9ddfb4rTAwAAoAZNtis2KChIBw8ebKrTAwAA4BxNUuwOHz6st99+W9HR0U1xegAAANTA1OaJOXPm1Dh+6tQp5efn65133lFpaanmzp3bqHAAAADwnKlil5ycXOfxsLAw/c///I9mzJhh5vQAAAAwwVSx27RpU43jAQEBOu+883TFFVcoKMjUqQEAAGCSqfY1aNAgb+fwSw6HQw6HQ5WVlb6OAgAAUC/eFVsHu92unJwcZWdn+zoKAABAvTxesTt9+rSpbxAQQHcEAABoDh4XOzPvfTUMQ6dOnWrw5wAAANBwHhe76OhoGYbh0dyysjL9/PPPpkMBAACg4Twudrm5ufXOOXnypF588UX98Y9/lCTFxMSYzQUAAIAG8toNcG+++aZ69Oihxx9/XC6XS3/605+0Z88eb50eAAAA9Wj0w+a++OILTZ8+Xdu2bVNQUJAee+wxPf300zrvvPO8kQ8AAAAeMl3svv/+e82cOVNvv/22XC6XRo8erblz56pr167ezAcAAAAPNbjYHTlyRCkpKVq6dKlOnDih66+/Xn/+85/Vr1+/psgHAAAAD3lc7E6cOKGFCxdq3rx5KikpUdeuXTVv3jyNGjWqKfMBAADAQx4Xu+7du+vAgQMKDw/XwoULZbfbFRgY2JTZAAAA0AAeF7u8vDwZhiGXy6X58+dr/vz59X7GMAzl5eU1KiAAAAA806B77Fwul44cOaIjR440VR4AAACY1OTvigUAAEDz8NoDigEAAOBbjX5AsZU5HA45HA5VVlb6OgoAoJWLeWKDryOgBWDFrg52u105OTnKzs72dRQAAIB6UewAAAAsgmIHAABgERQ7AAAAi6DYAQAAWATFDgAAwCIodgAAABZBsQMA
ALAIih0AAIBFUOwAAAAsgmIHAABgERQ7AAAAi6DYAQAAWATFDgAAwCIodgAAABZBsQMAALAIih0AAIBFUOwAAAAsgmIHAABgERQ7AAAAi6DY1cHhcCguLk7x8fG+jgIAAFAvil0d7Ha7cnJylJ2d7esoAAAA9aLYAQAAWATFDgAAwCIodgAAABZBsQMAALAIih0AAIBFUOwAAAAsgmIHAABgERQ7AAAAi6DYAQAAWATFDgAAwCIodgAAABZBsQMAALAIih0AAIBFUOwAAAAsgmIHAABgERQ7AAAAi6DYAQAAWATFDgAAwCIodgAAABZBsQMAALAIih0AAIBFUOwAAAAsgmIHAABgERQ7AAAAi6DYAQAAWATFDgAAwCIodgAAABYR5OsA/szhcMjhcKiystLXUQAAaDYxT2yo8nXuvOE+SoKGYsWuDna7XTk5OcrOzvZ1FAAAgHpR7AAAACyCYgcAAGARFDsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsAiKHQAAgEVQ7AAAACyCYgcAAGARFDsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsAiKHQAAgEVQ7AAAACyCYgcAAGARFDsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsAiKHQAAgEVQ7AAAACyCYgcAAGARFDsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsAiKHQAAgEVQ7AAAACyCYgcAAGARFDsAAACLoNgBAABYBMUOAADAIih2AAAAFkGxAwAAsIhWU+x++uknDR8+XO3bt1f37t31ySef+DoSAACAVwX5OkBzsdvtioiI0E8//aSPP/5Y9957r/bv36/w8HBfRwMAAPCKVrFiV1ZWpr///e9KSUlRu3btdOedd+rqq6/WO++84+toAAAAXuOXxa6srEyzZ8/WsGHDFB4eLsMwtHLlyhrnVlRUaObMmYqKilJoaKgSEhK0cePGKnP279+vDh06qEuXLu6xq6++Wrt3727KHwMAAKBZ+WWxO3z4sObMmaM9e/bommuuqXNuYmKiFixYoHHjxmnRokUKDAzU7bffrqysLPecsrIyhYWFVflcWFiYysrKmiQ/AACAL/jlPXaRkZEqLi5WRESEduzYofj4+Brnbd++XevWrVNqaqqmT58uSZowYYKuuuoqzZgxQ1988YUkqUOHDnI6nVU+63Q61aFDh6b9QQAAAJqRX67YBQcHKyIiot55aWlpCgwM1MSJE91jISEhevjhh7V161bl5+dLkrp166aysjIVFha6533zzTe68sorvR8eAADAR/yy2Hlq586duvzyy6tdZu3bt68kadeuXZLOrNiNHDlSs2fP1vHjx7V+/Xp99dVXGjlyZHNHBgAAaDJ+eSnWU8XFxYqMjKw2fnasqKjIPbZkyRI9+OCDOv/889WlSxe98cYbtT7qpKKiQhUVFe6vz72MCwAA4I9adLE7fvy4goODq42HhIS4j5914YUX6r333vPovHPnzlVKSop3QgIA0ExintjQ4Dm584Z75byenMcsM5lbqxZ9KTY0NLTKytpZ5eXl7uNmzJo1S6Wlpe4/Z+/VAwAA8GctesUuMjKyyoaIs4qLiyVJUVFRps4bHBxc40ogAACAP2vRK3a9evXSvn37qt0Dt23bNvdxAACA1qJFF7vRo0ersrJSy5Ytc49VVFRoxYoVSkhIUHR0tA/TAQAANC+/vRS7ePFilZSUuHe2ZmRkqKCgQJL06KOPymazKSEhQWPGjNGsWbN06NAhxcbGatWqVcrNzdUrr7ziy/gAAADNzm+L3fz585WXl+f+Oj09Xenp6ZKk8ePHy2azSZJWr16tpKQkrVmzRkePHlXPnj21fv16DRw40Ce5AQAAfMVvi11ubq5H80JCQpSamqrU1FSvZ3A4HHI4HKqsrPT6uQEAALytRd9j19TsdrtycnKUnZ3t6ygAAAD1otgBAABYBMUOAADAIih2AAAA
FkGxAwAAsAiKHQAAgEVQ7AAAACyCYgcAAGARFLs6OBwOxcXFKT4+3tdRAAAA6uW3b57wB3a7XXa7XaWlperUqZOcTqevI6EOpyt+8XUE+EDliXKd/f/MyopfdNp12qd5gJampv+2mfn3tCn/G3luntb23+OzP6/L5ap3ruHyZFYrV1BQoOjoaF/HAAAArVh+fr66dOlS5xyKnQdOnz6toqIidezYUYZh+DoOALR4TqdT0dHRys/PV1hYmK/jAH7N5XLp3//+t6KiohQQUPdddBQ7AECzczqdstlsKi0tpdgBXsTmCQAAAIug2AEAAFgExQ4A0OyCg4M1e/ZsBQcH+zoKYCncYwcAAGARrNgBAABYBMUOAODXtm7dqoCAAD377LO+jgL4PYodAMBvnT59WtOmTePVjoCHeKUYAMBvLVu2TAkJCSotLfV1FKBFYMUOANBoZWVlmj17toYNG6bw8HAZhqGVK1fWOLeiokIzZ85UVFSUQkNDlZCQoI0bN1ab9/PPP2vhwoVKSUlp4vSAdVDsAACNdvjwYc2ZM0d79uzRNddcU+fcxMRELViwQOPGjdOiRYsUGBio22+/XVlZWVXmPfXUU5o6dao6derUhMkBa+FSLACg0SIjI1VcXKyIiAjt2LGj1nvitm/frnXr1ik1NVXTp0+XJE2YMEFXXXWVZsyYoS+++EKStHPnTmVnZ8vhcDTbzwBYAcUOANBowcHBioiIqHdeWlqaAgMDNXHiRPdYSEiIHn74YT355JPKz89XdHS0Nm/erG+//VadO3eWJJWWliooKEjff/+9VqxY0WQ/B9DSUewAAM1m586duvzyyxUWFlZlvG/fvpKkXbt2KTo6WhMnTtR9993nPj5lyhT96le/0hNPPNGseYGWhmIHAGg2xcXFioyMrDZ+dqyoqEiS1K5dO7Vr1859PDQ0VB06dOB+O6AeFDsAQLM5fvx4je+HDQkJcR+vSW07bAFUxa5YAECzCQ0NVUVFRbXx8vJy93EA5lHsAADN5uzu2XOdHYuKimruSIClUOwAAM2mV69e2rdvn5xOZ5Xxbdu2uY8DMI9iBwBoNqNHj1ZlZaWWLVvmHquoqNCKFSuUkJCg6OhoH6YDWj42TwAAvGLx4sUqKSlx72zNyMhQQUGBJOnRRx+VzWZTQkKCxowZo1mzZunQoUOKjY3VqlWrlJubq1deecWX8QFLMFwul8vXIQAALV9MTIzy8vJqPPbjjz8qJiZG0pmNEklJSXr11Vd19OhR9ezZU88884xuvfXWZkwLWBPFDgAAwCK4xw4AAMAiKHYAAAAWQbEDAACwCIodAACARVDsAAAALIJiBwAAYBEUOwAAAIug2AEAAFgExQ4AAMAiKHYAAAAWQbED0OIYhqHBgwdXGUtOTpZhGMrMzPRJJn9W0+8LgDVR7AD4BcMwZBiGr2MAQIsW5OsAANBQe/bsUbt27Xwdo8Xg9wW0HhQ7AC3OFVdc4esILQq/L6D14FIsgBanvnvGVq1apd69eys0NFQXXXSRHnroIR08eLDavMGDB8swDJ04cUJz5sxR9+7dFRwcrMTERElSaWmpUlNTNWTIEHXp0kVt27bVhRdeqDvvvFNbt26tM9vBgwf1m9/8Rp07d1ZgYKBWrlyp+++/X4ZhaPPmzTV+9q233pJhGHrkkUfq/R2cOHFCL7zwgvr06aPzzjtP7dq1U0xMjEaOHKmPP/64zt9XZmam+9J3bX/OvVdx7969SkxMVHR0tNq2bauLL75YY8eO1bfffltvVgDNhxU7AJby/PPP66OPPtJ//dd/adiwYcrKytKKFSuUmZmpbdu26cILL6z2mVGjRik7O1u33Xab7rrrLl100UWSzlzCfOqppzRw4EANHz5c5513ng4cOKB3331X77//vjIyMjRs2LBq5zty5Ij69eunDh066J577lFAQIAuvvhiTZ48WevWrdOyZcs0aNCgap9bunSpJOl3v/tdvT9nYmKiXn/9dV11
1VWaMGGCQkNDVVRUpKysLH3wwQe6+eaba/1sTEyMZs+eXW385MmTWrBggcrLy6tcuv3ggw90zz336OTJkxoxYoRiY2NVUFCg9PR0bdiwQZs2bVKfPn3qzQygGbgAwA9Icnn6T5Ik16BBg6qMzZ492yXJ1aZNG9c//vGPKsemTp3qkuR66KGHqowPGjTIJcl19dVXu3766adq36ekpKTG8fz8fFdkZKTriiuuqPXneOCBB1wnT56sdvzKK690BQcHuw4fPlxl/Pvvv3cZhuG64YYbav25/zOXYRiua6+91nXq1Klqx889d02/r5o8+OCDLkmuqVOnuseOHDni6tSpk+v888937d69u8r8r7/+2tW+fXtX79696z03gObBpVgAlvLAAw+od+/eVcaSk5Nls9m0du1aVVRUVPvMM888owsuuKDauM1mq3G8S5cuGj16tPbu3asDBw5UO962bVvNnz9fQUHVL4pMnjxZFRUVWrlyZZXxv/zlL3K5XJo0aVJ9P6IMw5DL5VJwcLACAqr/M37++efXe45zzZkzR6tWrdLIkSP15z//2T2+evVqlZSUKCUlRXFxcVU+c9VVV+m3v/2tdu7cqZycnAZ/TwDex6VYAJZS0yVOm82mXr16afPmzdqzZ4969epV5Xjfvn1rPd/nn3+uRYsWaevWrTp06JBOnDhR5XhhYaEuueSSKmMxMTHuy7nnmjBhgp544gktW7ZMf/jDHySduQS6cuVKnXfeebr33nvr/RnDwsI0YsQIZWRkqFevXho1apQGDBighIQEU7tfX3vtNc2ePVvXXXed1q5dW6Usnr2X8J///KeSk5OrfXbfvn2Szly2Prf4AWh+FDsAlnLxxRfXOB4RESHpzIaI2o6d6+2339bo0aMVEhKiW265RV27dlX79u0VEBCgzMxMbd68ucYVwNrOJ0kdO3bU+PHj9fLLL2vTpk266aab9O677+rgwYOaOnWqQkJCPPkx9cYbb+i5557T2rVr3ffLhYSEaPTo0Zo/f36tv4dzbd68WQ899JAuvfRSrV+/vlox/PnnnyWdWVGsS1lZmUffD0DTotgBsJR//etfNY6f3RVrs9mqHavtwchJSUlq27atduzYoR49elQ5NmnSpFp3t9b3oOXJkyfr5Zdf1tKlS3XTTTe5N01MnDixzs/9p9DQUCUnJys5OVn5+fnasmWLVq5cqVdffVW5ubn67LPP6j3H3r17dffddys0NFTvvfdejWXw7O/rn//8p3r27OlxPgC+wT12ACylprJVWlqqXbt2KSQkpFpBq8t3332nuLi4ap85ffq0srKyTGfs2bOnbrzxRr399tvatm2bPv74Yw0cOLBB2f5TdHS0xo0bpw8//FCxsbHKyspyr7TV5qefftLw4cNVVlamt956q9bLqP369ZMkj4oiAN+j2AGwlDVr1mjnzp1VxpKTk1VaWqr7779fwcHBHp8rJiZG+/fvV1FRkXvM5XIpOTm50ZsFJk+erBMnTmjUqFFyuVwePeLkrJ9++klff/11tfFjx46prKxMQUFBatu2ba2fLy8v15133qkffvhBS5cu1dChQ2ud+9///d/q1KmTUlJStH379mrHT58+zft5AT/CpVgAfuXsw4FrsmTJkno3B9x222268cYbde+99yoyMlJZWVnKyspSTEyM5s2b16As06ZN0+9+9zv17t1bo0aNUps2bfT5558rJyfHvXnBrDFjxmjatGkqLCzUBRdcoHvuucfjzxYWFqp37966+uqr1bNnT0VHR8vpdGr9+vU6ePCgHnvsMXXs2LHWz7/wwgv68ssvddlllykvL6/GTRGJiYmKiYnR+eefr7S0NN19993q16+fhg4dqiuvvFKGYSg/P19bt27Vzz//rPLycjO/BgBeRrED4FdWrVpV67GFCxfWW+ymTZumu+++WwsXLtQbb7yhDh06KDExUf/7v/9b607V2kyaNEnBwcFauHChVq1apdDQUA0YMEArVqzQ
W2+91ahi17ZtW40bN04LFy5UYmJig1cSU1JSlJmZqU2bNunw4cMKDw9X9+7dNW/ePN133311fv6XX36RJP3www9KSUmpcc7gwYMVExMjSRo6dKi++uorzZ8/Xx9++KE+++wztW3bVlFRURoyZIhGjRrlcXYATctwuVwuX4cAgNZo8ODB2rJli7799lt169bN13EAWAD32AGAD2zfvl2bN2/WrbfeSqkD4DVcigWAZvTSSy+psLBQK1asUEBAQK2XQgHADC7FAkAziomJUUFBgS677DIlJydr7Nixvo4EwEIodgAAABbBPXYAAAAWQbEDAACwCIodAACARVDsAAAALIJiBwAAYBEUOwAAAIug2AEAAFgExQ4AAMAiKHYAAAAW8X/Quwkz8+SlhgAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "scprep.plot.plot_library_size(T1, percentile=20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "filtered_batches = []\n",
    "for batch in [T1, T2, T3, T4, T5]:\n",
    "    batch = scprep.filter.filter_library_size(batch, percentile=20, keep_cells='above')\n",
    "    batch = scprep.filter.filter_library_size(batch, percentile=75, keep_cells='below')\n",
    "    filtered_batches.append(batch)\n",
    "del T1, T2, T3, T4, T5 # removes objects from memory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4. Merge all datasets and create a vector representing the time point of each sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A1BG (ENSG00000121410)</th>\n",
       "      <th>A1BG-AS1 (ENSG00000268895)</th>\n",
       "      <th>A1CF (ENSG00000148584)</th>\n",
       "      <th>A2M (ENSG00000175899)</th>\n",
       "      <th>A2M-AS1 (ENSG00000245105)</th>\n",
       "      <th>A2ML1 (ENSG00000166535)</th>\n",
       "      <th>A2ML1-AS1 (ENSG00000256661)</th>\n",
       "      <th>A2ML1-AS2 (ENSG00000256904)</th>\n",
       "      <th>A3GALT2 (ENSG00000184389)</th>\n",
       "      <th>A4GALT (ENSG00000128274)</th>\n",
       "      <th>...</th>\n",
       "      <th>ZXDC (ENSG00000070476)</th>\n",
       "      <th>ZYG11A (ENSG00000203995)</th>\n",
       "      <th>ZYG11B (ENSG00000162378)</th>\n",
       "      <th>ZYX (ENSG00000159840)</th>\n",
       "      <th>ZZEF1 (ENSG00000074755)</th>\n",
       "      <th>ZZZ3 (ENSG00000036549)</th>\n",
       "      <th>bP-21264C1.2 (ENSG00000278932)</th>\n",
       "      <th>bP-2171C21.3 (ENSG00000279501)</th>\n",
       "      <th>bP-2189O9.3 (ENSG00000279579)</th>\n",
       "      <th>hsa-mir-1253 (ENSG00000272920)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>AAACATTGAAAGCA-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACCGTGCAGAAA-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACCGTGGAAGGC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACGCACCGGTAT-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACGCACCTATTC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACGCACTCAGAC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAAGAGACTCTATC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAAGATCTCTGCTC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAAGATCTGGTACT-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAAGATCTTGGTTG-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10 rows × 33694 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                            A1BG (ENSG00000121410)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                     0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                     0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                     0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                     0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                     0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                     0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                     0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                     0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                     0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                     0.0   \n",
       "\n",
       "                            A1BG-AS1 (ENSG00000268895)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                         0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                         0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                         0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                         0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                         0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                         0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                         0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                         0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                         0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                         0.0   \n",
       "\n",
       "                            A1CF (ENSG00000148584)  A2M (ENSG00000175899)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                     0.0                    0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                     0.0                    0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                     0.0                    0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                     0.0                    0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                     0.0                    0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                     0.0                    0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                     0.0                    0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                     0.0                    0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                     0.0                    0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                     0.0                    0.0   \n",
       "\n",
       "                            A2M-AS1 (ENSG00000245105)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                        0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                        0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                        0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                        0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                        0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                        0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                        0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                        0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                        0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                        0.0   \n",
       "\n",
       "                            A2ML1 (ENSG00000166535)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                      0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                      0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                      0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                      0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                      0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                      0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                      0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                      0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                      0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                      0.0   \n",
       "\n",
       "                            A2ML1-AS1 (ENSG00000256661)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                          0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                          0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                          0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                          0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                          0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                          0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                          0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                          0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                          0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                          0.0   \n",
       "\n",
       "                            A2ML1-AS2 (ENSG00000256904)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                          0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                          0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                          0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                          0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                          0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                          0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                          0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                          0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                          0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                          0.0   \n",
       "\n",
       "                            A3GALT2 (ENSG00000184389)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                        0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                        0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                        0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                        0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                        0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                        0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                        0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                        0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                        0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                        0.0   \n",
       "\n",
       "                            A4GALT (ENSG00000128274)  ...  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                       0.0  ...   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                       0.0  ...   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                       0.0  ...   \n",
       "AAACGCACCGGTAT-1_Day 00-03                       0.0  ...   \n",
       "AAACGCACCTATTC-1_Day 00-03                       0.0  ...   \n",
       "AAACGCACTCAGAC-1_Day 00-03                       1.0  ...   \n",
       "AAAGAGACTCTATC-1_Day 00-03                       0.0  ...   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                       0.0  ...   \n",
       "AAAGATCTGGTACT-1_Day 00-03                       0.0  ...   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                       0.0  ...   \n",
       "\n",
       "                            ZXDC (ENSG00000070476)  ZYG11A (ENSG00000203995)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                     0.0                       0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                     0.0                       0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                     0.0                       0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                     0.0                       0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                     0.0                       0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                     0.0                       0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                     0.0                       0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                     0.0                       0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                     0.0                       0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                     0.0                       0.0   \n",
       "\n",
       "                            ZYG11B (ENSG00000162378)  ZYX (ENSG00000159840)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                       0.0                    0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                       0.0                    0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                       0.0                    0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                       0.0                    0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                       0.0                    1.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                       0.0                    0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                       2.0                    0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                       0.0                    0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                       0.0                    0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                       0.0                    0.0   \n",
       "\n",
       "                            ZZEF1 (ENSG00000074755)  ZZZ3 (ENSG00000036549)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                      0.0                     0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                      0.0                     0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                      0.0                     0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                      0.0                     0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                      0.0                     0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                      0.0                     0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                      0.0                     0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                      0.0                     0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                      0.0                     0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                      0.0                     0.0   \n",
       "\n",
       "                            bP-21264C1.2 (ENSG00000278932)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                             0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                             0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                             0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                             0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                             0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                             0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                             0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                             0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                             0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                             0.0   \n",
       "\n",
       "                            bP-2171C21.3 (ENSG00000279501)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                             0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                             0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                             0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                             0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                             0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                             0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                             0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                             0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                             0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                             0.0   \n",
       "\n",
       "                            bP-2189O9.3 (ENSG00000279579)  \\\n",
       "AAACATTGAAAGCA-1_Day 00-03                            0.0   \n",
       "AAACCGTGCAGAAA-1_Day 00-03                            0.0   \n",
       "AAACCGTGGAAGGC-1_Day 00-03                            0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                            0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                            0.0   \n",
       "AAACGCACTCAGAC-1_Day 00-03                            0.0   \n",
       "AAAGAGACTCTATC-1_Day 00-03                            0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                            0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                            0.0   \n",
       "AAAGATCTTGGTTG-1_Day 00-03                            0.0   \n",
       "\n",
       "                            hsa-mir-1253 (ENSG00000272920)  \n",
       "AAACATTGAAAGCA-1_Day 00-03                             0.0  \n",
       "AAACCGTGCAGAAA-1_Day 00-03                             0.0  \n",
       "AAACCGTGGAAGGC-1_Day 00-03                             0.0  \n",
       "AAACGCACCGGTAT-1_Day 00-03                             0.0  \n",
       "AAACGCACCTATTC-1_Day 00-03                             0.0  \n",
       "AAACGCACTCAGAC-1_Day 00-03                             0.0  \n",
       "AAAGAGACTCTATC-1_Day 00-03                             0.0  \n",
       "AAAGATCTCTGCTC-1_Day 00-03                             0.0  \n",
       "AAAGATCTGGTACT-1_Day 00-03                             0.0  \n",
       "AAAGATCTTGGTTG-1_Day 00-03                             0.0  \n",
       "\n",
       "[10 rows x 33694 columns]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "EBT_counts, sample_labels = scprep.utils.combine_batches(\n",
    "    filtered_batches, \n",
    "    [\"Day 00-03\", \"Day 06-09\", \"Day 12-15\", \"Day 18-21\", \"Day 24-27\"],\n",
    "    append_to_cell_names=True\n",
    ")\n",
    "del filtered_batches # removes objects from memory\n",
    "EBT_counts.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "<a id='preprocessing'></a>\n",
    "## 2. Preprocessing: Filtering, Normalizing, and Transforming\n",
    "\n",
    "### Filtering\n",
    "\n",
    "We filter the data by: \n",
    "1. Filtering by library size (if we did not do this prior to combining batches)\n",
    "2. Removing genes that are expressed in relatively few cells.\n",
    "3. Removing dead cells\n",
    "\n",
    "We filter dead cells after library size normalization, since library size is not necessarily related to cell state."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Filtering I: Library size filtering\n",
    "\n",
    "We already did this before combining the batches, because library size correlated strongly with sample. If we wanted something simpler, we could instead have run the following here:\n",
    "\n",
    "`EBT_counts, sample_labels = scprep.filter.filter_library_size(EBT_counts, sample_labels, cutoff=2000)`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "#### Filtering II: Remove rare genes\n",
    "\n",
    "We eliminate genes that are expressed in 10 cells or fewer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "EBT_counts = scprep.filter.filter_rare_genes(EBT_counts, min_cells=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Normalization\n",
    "\n",
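    "To correct for differences in library size, we divide each cell's counts by that cell's library size and then rescale all cells by a common factor (for example, the median library size; the factor is set by the `rescale` argument).\n",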
    "\n",
    "In Python, this is performed with `scprep.normalize.library_size_normalize()`."
   ]
  },
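  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of the normalization described above, write $x_{ij}$ for the raw count of gene $j$ in cell $i$ and $\\ell_i = \\sum_k x_{ik}$ for that cell's library size. These symbols, and the rescaling constant $s$ (for example, the median library size), are notation introduced here for illustration; they are not names used by `scprep`.\n",
    "\n",
    "$$\\tilde{x}_{ij} = s \\cdot \\frac{x_{ij}}{\\ell_i}$$\n",
    "\n",
    "After this step, the normalized counts of every cell with a nonzero library size sum to $s$, so cells can be compared regardless of how deeply each one was sequenced."
   ]
  },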
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "EBT_counts = scprep.normalize.library_size_normalize(EBT_counts)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "jupyter": {
     "outputs_hidden": true
    }
   },
   "source": [
    "#### Filtering III: Dead cell removal\n",
    "\n",
    "Dead cells are likely to have a higher mitochondrial RNA expression level than live cells. Therefore, we remove suspected dead cells by eliminating cells that have the highest mitochondrial RNA expression levels on average.  \n",
    "\n",
    "First, let's look at the distribution of mitochondrial gene expression."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<Axes: xlabel='Gene expression', ylabel='Number of cells'>"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAABHpUlEQVR4nO3deXxOd/7//+eVPaIJQUlS21hq3yaC2GqpplR1iqmt6HSYsdRW2oaPtmgtRVcfLdMZ+mnVTKWdKm21KKp2/dKq8olpUYkoGkkE2d+/P/xyPi5XwpUrIXE87rdbbnK9z/t6X6/zdqae8z7nOsdhjDECAADALc+rtAsAAABAySDYAQAA2ATBDgAAwCYIdgAAADZBsAMAALAJgh0AAIBNEOwAAABsgmAHAABgEz6lXYAd5eXl6eTJk7rjjjvkcDhKuxwAAHALM8bo/PnzCg8Pl5fXtdfkCHY3wMmTJ1W9evXSLgMAANjIiRMndNddd12zD8HuBrjjjjskXf4LCA4OLuVqypgLF6Tw8Mu/nzwpBQWVbj0AAJRxaWlpql69upUvroVgdwPkn34NDg4m2F3N2/v/fg8OJtgBAOAmdy7v4ssTAAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJnxKuwDcWmo986lL27G5vUqhEgAAcDVW7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAnuY4dCFXTPOgAAUHaxYgcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsIkyF+z27NmjsWPHqnHjxgoKClKNGjX0xz/+UfHx8U79hg8fLofD4fLToEEDlzHz8vL00ksvqXbt2goICFCzZs20cuXKAj//0KFDiomJUfny5RUaGqpHH31UZ86cuSH7CgAAUJLK3H3s5s2bp23btql///5q1qyZTp06pUWLFqlVq1bauXOnmjRpYvX19/fX22+/7fT+kJAQlzGnTZumuXPnasSIEWrdurVWr16tQYMGyeFwaMCAAVa/hIQEderUSSEhIZo9e7bS09O1YMECHThwQLt375afn9+N23EAAIBichhjTGkXcaXt27crMjLSKUQdOXJETZs2Vb9+/fTee+9JurxiFxcXp/T09GuOl5iYqNq1a2vkyJFatGiRJMkYo86dO+vo0aM6duyYvL29JUmjR4/W8uXLdfjwYdWoUUOStGHDBt17771asmSJRo4c6dY+pKWlKSQkRKmpqQoODi7yHJQV7t6g+NjcXu4PeuGCVL785d/T06WgIA8qAwDg9lGUXFHmTsVGR0e7rIzVq1dPjRs31qFDh1z65+bmKi0trdDxVq9erezsbI0ePdpqczgcGjVqlBISErRjxw6r/cMPP9QDDzxghTpJ6t69u+rXr68PPvigOLsFAABww5W5YFcQY4x+/fVXVa5c2an94sWLCg4OVkhIiEJDQzVmzBiXFbx9+/YpKChIDRs2dGqPioqytkuXV/ZOnz6tyMhIl8+Pioqy+gEAAJRVZe4au4KsWLFCiYmJmjlzptUWFhamp556Sq1atVJeXp7WrVunxYsX67vvvtPmzZvl43N515KSklS1alU5HA6nMcPCwiRJJ0+etPpd2X513+TkZGVmZsrf399le2ZmpjIzM63X11pBBAAAuFHKfLA7fPiwxowZo3bt2mnYsGFW+5w5c5z6DRgwQPXr19e0adMUFxdnfSni0qVLBYaxgIAAa/uVf16vb0Hb58yZoxkzZniyewAAACWmTJ+KPXXqlHr16qWQkBDFxcVZX3IozMSJE+Xl5aUNGzZYbYGBgU6rafkyMjKs7Vf+6U7fq8XGxio1NdX6OXHihBt7BwAAULLK7Ipdamqq
7r//fqWkpGjr1q0KDw+/7nsCAwNVqVIlJScnW21hYWHatGmTjDFOp2PzT73mj5t/Cja//UpJSUkKDQ0tcLVOurzKV9g2AACAm6VMrthlZGSod+/eio+P19q1a9WoUSO33nf+/HmdPXtWVapUsdpatGihixcvunyjdteuXdZ2SYqIiFCVKlW0d+9el3F3795t9QMAACirylywy83N1SOPPKIdO3Zo1apVateunUufjIwMnT9/3qV91qxZMsYoJibGauvTp498fX21ePFiq80Yo7feeksRERGKjo622vv27au1a9c6nUrduHGj4uPj1b9//5LaRQAAgBuizJ2KffLJJ/XJJ5+od+/eSk5Otm5InG/IkCE6deqUWrZsqYEDB1qPEPviiy/02WefKSYmRn369LH633XXXZowYYLmz5+v7OxstW7dWh9//LG2bt2qFStWOF23N3XqVK1atUpdunTR+PHjlZ6ervnz56tp06Z67LHHbs4EAAAAeKjMPXninnvu0ZYtWwrdboxRSkqKnnjiCe3cuVMnT55Ubm6u6tatq8GDB2vy5Mny9fV1ek9eXp7mzZunJUuWKCkpSfXq1VNsbKwGDx7sMv7Bgwc1adIkffPNN/Lz81OvXr20cOFCVa1a1e194MkT18CTJwAAKJKi5IoyF+zsgGB3DQQ7AACK5JZ+pBgAAAA8Q7ADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmfEq7ANz6aj3zqdPrY3N7lVIlAADc3lixAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACb4FuxkOT6zVYAAHDrYcUOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsIkyF+z27NmjsWPHqnHjxgoKClKNGjX0xz/+UfHx8S59Dx06pJiYGJUvX16hoaF69NFHdebMGZd+eXl5eumll1S7dm0FBASoWbNmWrlyZYGf7+6YAAAAZY1PaRdwtXnz5mnbtm3q37+/mjVrplOnTmnRokVq1aqVdu7cqSZNmkiSEhIS1KlTJ4WEhGj27NlKT0/XggULdODAAe3evVt+fn7WmNOmTdPcuXM1YsQItW7dWqtXr9agQYPkcDg0YMAAq19RxgQAAChrHMYYU9pFXGn79u2KjIx0ClFHjhxR06ZN1a9fP7333nuSpNGjR2v58uU6fPiwatSoIUnasGGD7r33Xi1ZskQjR46UJCUmJqp27doaOXKkFi1aJEkyxqhz5846evSojh07Jm9v7yKNeT1paWkKCQlRamqqgoODS2ZibrBaz3xaYmMdm9ur8I0XLkjly1/+PT1dCgoqsc8FAMCOipIrytyp2OjoaJeVsXr16qlx48Y6dOiQ1fbhhx/qgQcesAKYJHXv3l3169fXBx98YLWtXr1a2dnZGj16tNXmcDg0atQoJSQkaMeOHUUeEwAAoCwqc8GuIMYY/frrr6pcubKky6twp0+fVmRkpEvfqKgo7du3z3q9b98+BQUFqWHDhi798rcXdUwAAICy6JYIditWrFBiYqIeeeQRSVJSUpIkKSwszKVvWFiYkpOTlZmZafWtWrWqHA6HSz9JOnnyZJHHvFpmZqbS0tKcfgAAAG62Mh/sDh8+rDFjxqhdu3YaNmyYJOnSpUuSJH9/f5f+AQEB
Tn0uXbrkdj93x7zanDlzFBISYv1Ur17d/R0EAAAoIWU62J06dUq9evVSSEiI4uLirC85BAYGSlKBK2gZGRlOfQIDA93u5+6YV4uNjVVqaqr1c+LECfd3EgAAoISUudud5EtNTdX999+vlJQUbd26VeHh4da2/NOl+adPr5SUlKTQ0FBr5S0sLEybNm2SMcbpdGz+e/PHLcqYV/P39y90GwAAwM1SJlfsMjIy1Lt3b8XHx2vt2rVq1KiR0/aIiAhVqVJFe/fudXnv7t271aJFC+t1ixYtdPHiRadv1ErSrl27rO1FHRMAAKAsKnPBLjc3V4888oh27NihVatWqV27dgX269u3r9auXet02nPjxo2Kj49X//79rbY+ffrI19dXixcvttqMMXrrrbcUERGh6OjoIo8JAABQFpW5U7FPPvmkPvnkE/Xu3VvJycnWDYnzDRkyRJI0depUrVq1Sl26dNH48eOVnp6u+fPnq2nTpnrssces/nfddZcmTJig+fPnKzs7W61bt9bHH3+srVu3asWKFdZ1e0UZEwAAoCwqc0+euOeee7Rly5ZCt19Z7sGDBzVp0iR988038vPzU69evbRw4UJVrVrV6T15eXmaN2+elixZoqSkJNWrV0+xsbEaPHiwy/jujnktPHmCJ08AAFBSipIrylywswOCHcEOAICScks/UgwAAACeIdgBAADYhEfB7ty5c/rxxx9dbua7bNky9enTR4MGDdLu3btLpEAAAAC4x6NvxU6dOlXvvfeeTp8+bbW98cYbmjBhgvXlho8//lh79+51uQcdAAAAbgyPVuy2bdumbt26OT1ia8GCBYqIiNDXX3+tDz74QJL08ssvl0yVAAAAuC6PVuwSExPVrVs36/WPP/6oEydOaN68eerQoYMkadWqVfr6669LpkoAAABcl0crdpcuXVJAQID1etu2bXI4HOrevbvVVqdOHSUmJha/QgAAALjFo2AXERGhw4cPW6+/+OILBQcHq3nz5lbbuXPnnE7VAgAA4Mby6FRsly5d9M4772jRokUKCAjQJ598or59+8rL6/9y4k8//aTq1auXWKEAAAC4No9W7GJjY1W+fHmNHz9eI0eOVEBAgJ5//nlre1pamr755htFR0eXVJ0AAAC4Do9W7GrXrq2DBw8qLi5OkvTggw+qRo0a1vb//Oc/+stf/qJBgwaVTJUAAAC4Lo+CnSRVq1ZNY8eOLXBbq1at1KpVK4+LAgAAQNHxSDEAAACbcGvFbubMmR4N7nA4NH36dI/eCwAAgKJxK9hd+cWIoiDYAQAA3DxuBbtNmzbd6DoAAABQTG4Fu86dO9/oOgAAAFBMfHkCAADAJgh2AAAANuHWqVgvLy85HI4iD+5wOJSTk1Pk9wEAAKDo3Ap2nTp18ijYAQAA4OZxK9ht3rz5BpcBAACA4uIaOwAAAJsodrC7cOGC9u3bp61bt5ZEPQAAAPCQx8EuISFBffv2VcWKFRUZGakuXbpY27755hs1atSIU7gAAAA3kUfBLikpSW3atNHq1av1wAMPqF27djLGWNvbtGmj06dP61//+leJFQoAAIBr8yjYzZgxQ6dPn9b69ev10Ucf6d5773Xa7uvrq44dO2rbtm0lUiQAAACuz6Ng99lnn+nBBx90Ov16tRo1aujkyZMeFwYAAICi8SjY/frrr6pXr941+/j6+urChQseFQUAAICi8yjYhYaG6sSJE9fsEx8fr2rVqnlUFAAAAIrOrRsUX619+/b65JNPdOrUqQLD25EjR7Ru3ToNGTKk2AWi5NV65tPSLgEAANwAHq3YTZkyRRkZGercubM+//xzXbx4UdLle9p9/vnn6t27t7y8vPTkk0+WaLEAAAAonEcrdm3atNGSJUs0atQoPfDAA1Z7cHDw5UF9fPSPf/xDjRs3LpkqAQAAcF0eBTtJ+tOf/qSOHTtq8eLF2rlzp3777TeFhISobdu2Gjt2rO6+++6SrBMAAADX4XGwk6R69erp
lVdeKalaAAAAUAzFflYsAAAAygaPgt2qVavUtWvXQm9AnJiYqG7duumjjz4qVnEAAABwn0fB7u2331ZKSorCw8ML3B4REaHU1FS9/fbbxSoOAAAA7vMo2B04cECRkZHX7NO6dWt9//33HhUFAACAovMo2CUnJ+vOO++8Zp9KlSrp7NmzHhUFAACAovMo2FWuXFlHjhy5Zp8jR46oQoUKngwPAAAAD3gU7PIfKXb48OECtx86dEirV69Wx44di1UcAAAA3OdRsJs8ebJycnLUoUMHvf7664qPj9eFCxcUHx+v1157TR07dlRubq4mT55c0vUCAACgEB7doLh169ZavHixxowZo4kTJ2rixIlO2729vfXmm2+qTZs2JVIkAAAArs/jJ0+MGDFCHTp00OLFi7Vr1y6lpKSoQoUKatu2rUaNGqWGDRuWZJ24hdR65lOXtmNze5VCJQAA3F6K9Uixhg0b6o033iipWgAAAFAMPFIMAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATbgW70NBQvfTSS9brmTNn6uuvv75hRQEAAKDo3Ap2KSkpysjIsF4///zz2rx5842qCQAAAB5wK9hVrVpVCQkJN7oWAAAAFINbT55o27at3n33XXl7eyssLEyS3Fqxczgcmj59erEKBAAAgHscxhhzvU7/+c9/1KdPHx06dKhogzscys3N9bi4W1VaWppCQkKUmpqq4ODg0i7HRUHPcr3RrGfFXrgglS9/+ff0dCko6KbXAgDAraQoucKtU7F169bVgQMHdOTIEWulbvjw4dq0adM1f7766qsiF5+enq7nnntOMTExCg0NlcPh0PLly136DR8+XA6Hw+WnQYMGLn3z8vL00ksvqXbt2goICFCzZs20cuXKAj//0KFDiomJUfny5RUaGqpHH31UZ86cKfJ+AAAA3GxunYqVJC8vL9WpU0d16tSRJNWqVUudO3cu8YLOnj2rmTNnqkaNGmrevPk1T/n6+/vr7bffdmoLCQlx6Tdt2jTNnTtXI0aMUOvWrbV69WoNGjRIDodDAwYMsPolJCSoU6dOCgkJ0ezZs5Wenq4FCxbowIED2r17t/z8/EpsPwEAAEqa28HuSnl5eSVdhyUsLExJSUmqVq2a9u7dq9atWxfa18fHR0OGDLnmeImJiVq4cKHGjBmjRYsWSZL+/Oc/q3PnzpoyZYr69+8vb29vSdLs2bN14cIFffvtt6pRo4YkKSoqSvfee6+WL1+ukSNHltBeAgAAlLxi36A4ISFBa9as0bvvvqtPPvmk2N+e9ff3V7Vq1dzun5ubq7S0tEK3r169WtnZ2Ro9erTV5nA4NGrUKCUkJGjHjh1W+4cffqgHHnjACnWS1L17d9WvX18ffPBBEfcEAADg5vI42B0/flwxMTGqWbOmHnroIQ0fPlx/+MMfVLNmTcXExOjYsWMlWGbBLl68qODgYIWEhCg0NFRjxoxRenq6U599+/YpKChIDRs2dGqPioqytkuXV/ZOnz6tyMhIl8+Jioqy+gEAAJRVHp2KPXXqlDp06KDExETVqlVLnTp1sk6hbt26VV9++aU6dOigvXv3Fmn1rSjCwsL01FNPqVWrVsrLy9O6deu0ePFifffdd9q8ebN8fC7vWlJSkqpWrSqHw+Hyfkk6efKk1e/K9qv7JicnKzMzU/7+/i7bMzMzlZmZab2+1goiAADAjeJRsJs1a5YSExM1b948TZo0ybpGTbp8avSVV17RU089pRdeeMG6rq2kzZkzx+n1gAEDVL9+fU2bNk1xcXHWlyIuXbpUYBgLCAiwtl/55/X6FrR9zpw5mjFjRjH2BgAAoPg8OhX76aefqkePHpoyZYpTqJMkb29vTZ48WT169NDatWtLpEh3TZw4UV5eXtqwYYPVFhgY6LSali//EWmBgYFOf7rT92qxsbFKTU21fk6cOFG8HQEAAPCAx6diBw8efM0+v//972/682QDAwNVqVIl
JScnW21hYWHatGmTjDFOp2PzT72Gh4db/a5sv1JSUpJCQ0MLXK2TLq/yFbattJXGzYgBAEDp8GjFLiQkRMePH79mn19++aXAe8rdSOfPn9fZs2dVpUoVq61Fixa6ePGiy1Mzdu3aZW2XpIiICFWpUkV79+51GXf37t1WPwAAgLLKo2DXoUMHxcXFafv27QVu37Vrl1atWqUOHToUq7jCZGRk6Pz58y7ts2bNkjFGMTExVlufPn3k6+urxYsXW23GGL311luKiIhQdHS01d63b1+tXbvW6VTqxo0bFR8fr/79+9+QfQEAACgpHp2KnTZtmj799FN17txZAwYMUJcuXRQWFqZTp05p8+bNWrlypby8vDR16lSPilq0aJFSUlKsb6yuWbPGuj/eE088oXPnzqlly5YaOHCg9QixL774Qp999pliYmLUp08fa6y77rpLEyZM0Pz585Wdna3WrVvr448/1tatW7VixQqnawSnTp2qVatWqUuXLho/frzS09M1f/58NW3aVI899phH+wIAAHCzOIwxxpM3rl27VsOGDdO5c+ecrl0zxig0NFT/+Mc/9OCDD3pUVK1atQo91Xv06FFVqFBBTzzxhHbu3KmTJ08qNzdXdevW1eDBgzV58mT5+vo6vScvL0/z5s3TkiVLlJSUpHr16ik2NrbA6wQPHjyoSZMm6ZtvvpGfn5969eqlhQsXqmrVqm7XX5SH9d5oZeUau2Nze13+5cIFqXz5y7+np0tBQaVXFAAAt4Ci5AqPg50kXbhwQatXr9b/+3//T6mpqQoJCVHLli310EMPKeg2/gebYOeKYAcAgGeKkis8OhWbLygoSIMGDdKgQYOKMwwAAABKQLGfFQsAAICygWAHAABgEwQ7AAAAmyDYAQAA2ATBDgAAwCYIdgAAADbhUbDr2rWrpk+fXtK1AAAAoBg8CnY7d+5Ubm5uSdcCAACAYvAo2NWrV08nTpwo6VoAAABQDB4Fuz//+c/69NNP9csvv5R0PQAAAPCQR48U6927t9avX6/27dvr6aefVuvWrVWtWjU5HA6XvjVq1Ch2kQAAALg+j4Ld7373OzkcDhljNH78+EL7ORwO5eTkeFwcAAAA3OdRsBs6dGiBq3MAAAAoPR4Fu+XLl5dwGQAAACgublAMAABgEx6t2F3p8OHDOnTokNLT0/Xoo4+WRE0AAADwgMcrdvv371dkZKQaN26sfv36afjw4da2LVu2qFy5clqzZk1J1AgAAAA3eBTs4uPjdc899+h///d/NX78eN1///1O2zt16qTQ0FDFxcWVSJEAAAC4Po+C3YwZM5SVlaVdu3bp5ZdfVuvWrZ22OxwOtWvXTnv27CmRIgEAAHB9HgW7jRs36uGHH1ajRo0K7VO9enWdPHnS48IAAABQNB4Fu3Pnzumuu+66Zh9jjLKysjwqCgAAAEXnUbCrWrWq/vOf/1yzz8GDB1W9enWPigIAAEDReRTsunbtqjVr1uh///d/C9y+Z88ebdy4Uffdd1+xigMAAID7PAp2sbGx8vHxUadOnfTmm29a19IdPHhQb775pnr37q077rhDkydPLtFiAQAAUDiPblB8991368MPP9TAgQM1duxYSZevqWvWrJmMMapQoYI++ugj1ahRo0SLBQAAQOE8fvJETEyMjh49qnfeeUc7d+7Ub7/9ppCQELVt21aPPfaYQkNDS7JOAAAAXEexHilWoUIFjR8/XuPHjy+pegAAAOAhjx8pBgAAgLKlWMFuxYoV6tatm0JDQ+Xj46PQ0FB169ZNK1asKKn6AAAA4CaPTsVmZ2erX79+Wrt2rYwx8vb2VpUqVXT27Flt2rRJmzdv1gcffKC4uDj5+vqWdM0AAAAogEcrdnPmzNGaNWvUpk0bbdq0SRkZGUpKSlJGRoa++uorRUVFae3atZo3b15J1wsAAIBCOIwxpqhvqlu3rry8vPTDDz/Iz8/PZXtmZqaaNGkiY8x1n1BhR2lpaQoJCVFqaqqC
g4NLtZZaz3xaqp+f79jcXpd/uXBBKl/+8u/p6VJQUOkVBQDALaAoucKjFbuEhAT16dOnwFAnSf7+/urTp48SExM9GR4AAAAe8CjYhYeHKzs7+5p9srOzFR4e7lFRAAAAKDqPgt2gQYMUFxentLS0ArenpKQoLi5OgwcPLlZxAAAAcJ9Hwe7ZZ59VZGSkoqKi9P777yshIUHZ2dlKSEjQihUr1LZtW0VFRWn69OklXS8AAAAK4dbtTry8vORwOFzajTF69NFHC2w/cuSIAgMDlZOTU/wqAQAAcF1uBbtOnToVGOwAAABQdrgV7DZv3nyDywAAAEBx8axYAAAAmyDYAQAA2IRHz4rNt2bNGu3fv9/6VuzVHA6H/v73vxfnIwAAAOAmj4Ld8ePH1bt3bx08eFDXeiIZwQ4AAODm8SjYjRs3Tj/88IP+9Kc/aejQoYqIiJCPT7EW/wAAAFBMHqWxr776Svfdd5/efvvtkq4HAAAAHvLoyxO+vr5q2rRpSdcCAACAYvAo2LVv314//PBDSdcCAACAYvAo2M2cOVNff/21/vnPf5Z0PQAAAPCQR9fYtWzZUhs3blSvXr20ZMkStWrVSiEhIS79HA6Hpk+fXuwiAQAAcH0eBbvU1FRNnTpVycnJ2rJli7Zs2VJgP4IdAADAzeNRsJs4caI2bdqk7t2769FHH1V4eDi3OwEAAChlHqWxtWvXKjo6Wl9++WVJ1wMAAAAPefTliUuXLik6OrqkawEAAEAxeBTsWrZsqZ9//rmkawEAAEAxeBTspk+frjVr1uibb74p6XoAAADgIY+usUtKStIDDzygrl27atCgQfr9739f4O1OJGno0KHFKhD2UOuZTyVJgVkZOlTKtQAAYFcOY4wp6pu8vLzkcDh05VsdDodTH2OMHA6HcnNzi1/lLSYtLU0hISFKTU1VcHBwqdaSH6jKisCsDB16pd/lF+npUlBQ6RYEAEAZV5Rc4dGK3bJlyzwqzB3p6emaP3++du3apd27d+vcuXNatmyZhg8f7tL30KFDmjhxor755hv5+fmpV69eevnll1WlShWnfnl5eVqwYIHefPNNJSUlqX79+oqNjdXAgQM9HhMAAKCs8SjYDRs2rKTrsJw9e1YzZ85UjRo11Lx5c23evLnAfgkJCerUqZNCQkI0e/Zspaena8GCBTpw4IB2794tPz8/q++0adM0d+5cjRgxQq1bt9bq1as1aNAgORwODRgwwKMxAQAAypoyd1fhsLAwJSUlqVq1atq7d69at25dYL/Zs2frwoUL+vbbb1WjRg1JUlRUlO69914tX75cI0eOlCQlJiZq4cKFGjNmjBYtWiRJ+vOf/6zOnTtrypQp6t+/v7y9vYs0JgAAQFnk0bdibyR/f39Vq1btuv0+/PBDPfDAA1YAk6Tu3burfv36+uCDD6y21atXKzs7W6NHj7baHA6HRo0apYSEBO3YsaPIYwIAAJRFHq3Y/e53v3Orn8Ph0E8//eTJR1xTYmKiTp8+rcjISJdtUVFR+uyzz6zX+/btU1BQkBo2bOjSL397hw4dijQmAABAWeRRsMvLy3P5FqwkpaSkKDU1VZIUHh4uX1/f4lVXiKSkJEmXT9teLSwsTMnJycrMzJS/v7+SkpJUtWpVl3rz33vy5Mkij3m1zMxMZWZmWq/T0tI83DMAAADPeRTsjh07Vui2//znPxo3bpwuXLigL774wtO6runSpUuSVGDICggIsPr4+/tbf16rX1HHvNqcOXM0Y8YMT3YFAACgxJT4NXZ169bVRx99pMTExBsWdgIDAyXJaZUsX0ZGhlOfwMBAt/u5O+bVYmNjlZqaav2cOHGiSPsDAABQEm7IlycCAgJ07733auXKlTdieOt0af7p0yslJSUpNDTUWlkLCwvTqVOndPV9mPPfGx4eXuQxr+bv76/g4GCnHwAAgJvthn0r1sfHR6dOnbohY0dERKhKlSrau3evy7bdu3erRYsW1usWLVro4sWL
OnTI+UFWu3btsrYXdUwAAICy6IYEu7Nnz+rf//63qlevfiOGlyT17dtXa9eudTrtuXHjRsXHx6t///5WW58+feTr66vFixdbbcYYvfXWW4qIiFB0dHSRxwQAACiLPPryxMyZMwtsz8nJ0YkTJ7R69WqlpqZqzpw5HhW1aNEipaSkWN9YXbNmjRISEiRJTzzxhEJCQjR16lStWrVKXbp00fjx461HkTVt2lSPPfaYNdZdd92lCRMmaP78+crOzlbr1q318ccfa+vWrVqxYoV1c2JJbo8JAABQFjnM1RefucHL69oLfcHBwRo/frzHX56oVauWjh8/XuC2o0ePqlatWpKkgwcPatKkSU7PdV24cKGqVq3q9J68vDzNmzdPS5YsUVJSkurVq6fY2FgNHjzYZXx3x7yWojys90ar9cynpfr5VwvMytChV/pdfpGeLgUFlW5BAACUcUXJFR4Fuy1bthTY7uXlpYoVK6pBgwby8SlzTyu7aQh2hSPYAQBQNEXJFR6lr86dO3tUGAAAAG6cMvesWAAAAHjG7RW7vLw8jz7getfjAQAAoGS4Hew8ee6rw+FQTk5Okd8HAACAonM72FWvXl0Oh8Otvunp6frtt988LgoAAABF53awO3bs2HX7ZGdn64033tCLL74oSdZtSQAAAHDjldg9SVatWqXY2FgdPXpUISEheumllzRu3LiSGh5uKmu3NwEAADdPsYPd9u3bNXnyZO3atUs+Pj4aN26cnn32WVWsWLEk6gMAAICbPA52P/30k55++mn9+9//ljFG/fr105w5c1SnTp2SrA8AAABuKnKwS05O1owZM7RkyRJlZWWpXbt2Wrhwodq2bXsj6gMAAICb3A52WVlZevXVVzV37lylpKSoTp06mjt3rvr27Xsj6wMAAICb3A52d999t3755ReFhobq1Vdf1ZgxY+Tt7X0jawMAAEARuB3sjh8/LofDIWOMFixYoAULFlz3PQ6HQ8ePHy9WgQAAAHBPka6xM8YoOTlZycnJN6oeAAAAeOiGPysWAAAAN4dXaRcAAACAkkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANuFT2gXg9tVw+jpd8gtwajs2t1cpVQMAwK2PFTsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATt2yw27x5sxwOR4E/O3fudOq7fft2dejQQeXKlVO1atU0btw4paenu4yZmZmpp59+WuHh4QoMDFSbNm20fv36m7VLAAAAxeJT2gUU17hx49S6dWuntrp161q/79+/X926dVPDhg318ssvKyEhQQsWLNCRI0f0+eefO71v+PDhiouL04QJE1SvXj0tX75cPXv21KZNm9ShQ4ebsj8AAACeuuWDXceOHdWvX79Ct0+dOlUVK1bU5s2bFRwcLEmqVauWRowYoS+//FI9evSQJO3evVv//Oc/NX/+fE2ePFmSNHToUDVp0kRPPfWUtm/ffuN3BgAAoBhu2VOxVzp//rxycnJc2tPS0rR+/XoNGTLECnXS5cBWvnx5ffDBB1ZbXFycvL29NXLkSKstICBAjz/+uHbs2KETJ07c2J0AAAAopls+2D322GMKDg5WQECAunTpor1791rbDhw4oJycHEVGRjq9x8/PTy1atNC+ffustn379ql+/fpOAVCSoqKiJF0+pQsAAFCW3bKnYv38/NS3b1/17NlTlStX1o8//qgFCxaoY8eO2r59u1q2bKmk
pCRJUlhYmMv7w8LCtHXrVut1UlJSof0k6eTJk4XWkpmZqczMTOt1Wlqax/sFAADgqVs22EVHRys6Otp6/eCDD6pfv35q1qyZYmNjtW7dOl26dEmS5O/v7/L+gIAAa7skXbp0qdB++dsLM2fOHM2YMcPjfQEAACgJt/yp2CvVrVtXffr00aZNm5Sbm6vAwEBJclpNy5eRkWFtl6TAwMBC++VvL0xsbKxSU1OtH67HAwAApeGWXbErTPXq1ZWVlaULFy5Yp1HzT8leKSkpSeHh4dbrsLAwJSYmFthPklPfq/n7+xe42gcAAHAz2S7Y/fzzzwoICFD58uXVpEkT+fj4aO/evfrjH/9o9cnKytL+/fud2lq0aKFNmzYpLS3N6QsUu3btsraXNbWe+bS0SwAAAGXILXsq9syZMy5t3333nT755BP16NFDXl5eCgkJUffu3fXee+/p/PnzVr93331X6enp6t+/v9XWr18/5ebmaunSpVZbZmamli1bpjZt2qh69eo3docAAACK6ZZdsXvkkUcUGBio6Oho3Xnnnfrxxx+1dOlSlStXTnPnzrX6vfjii4qOjlbnzp01cuRIJSQkaOHCherRo4diYmKsfm3atFH//v0VGxur06dPq27dunrnnXd07Ngx/f3vfy+NXQQAACiSW3bF7qGHHtLZs2f18ssva/To0frXv/6lhx9+WHv37lXDhg2tfq1atdKGDRsUGBioiRMnaunSpXr88ccVFxfnMub//M//aMKECXr33Xc1btw4ZWdna+3aterUqdPN3DUAAACPOIwxprSLsJu0tDSFhIQoNTXV5YbHJelWvMYuMCtDh165/Ai4hhPjdMkvwGn7sbm9SqMsAADKrKLkilt2xQ4AAADOCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA24VPaBQBXKugxaTxmDAAA97BiBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANiET2kXAFxPrWc+dWk7NrdXKVQCAEDZxoodAACATRDsAAAAbIJgBwAAYBMEOwAAAJsg2AEAANgEwQ4AAMAmCHYAAAA2QbADAACwCYIdAACATRDsAAAAbIJgBwAAYBMEOwAAAJvwKe0CAE/UeuZTl7Zjc3uVQiUAAJQdrNgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGAT3McOtnH1ve24rx0A4HbDih0AAIBNEOwAAABsgmAHAABgE1xjB9viebIAgNsNK3YAAAA2wYodbius4gEA7IwVu6tkZmbq6aefVnh4uAIDA9WmTRutX7++tMsCAAC4LlbsrjJ8+HDFxcVpwoQJqlevnpYvX66ePXtq06ZN6tChQ2mXhxuAVTwAgF0Q7K6we/du/fOf/9T8+fM1efJkSdLQoUPVpEkTPfXUU9q+fXspVwgAAFA4gt0V4uLi5O3trZEjR1ptAQEBevzxxzV16lSdOHFC1atXL8UKcbO4u4rHah8AoCwh2F1h3759ql+/voKDg53ao6KiJEn79+8n2N3GCgpx7vQrTtAjOAIAioJgd4WkpCSFhYW5tOe3nTx5ssD3ZWZmKjMz03qdmpoqSUpLS7sBVf6fvMyLN3T8GyE3K0P5s5KbeVF5Jq9U67kZakxcVWbH+2HGfS5tTZ77wq1+AICbIz9PGGOu25dgd4VLly7J39/fpT0gIMDaXpA5c+ZoxowZLu2s7hUsJP+XxUNLswxICnm1ZPsBAG6c8+fPKyQk5Jp9CHZXCAwMdFp5y5eRkWFtL0hsbKwmTZpkvc7Ly1NycrIqVaokh8NxQ2pNS0tT9erVdeLECZdTx7g2
5s5zzJ3nmDvPMXeeY+48V5bmzhij8+fPKzw8/Lp9CXZXCAsLU2Jiokt7UlKSJBU6of7+/i4rfRUqVCjx+goSHBxc6gfcrYq58xxz5znmznPMneeYO8+Vlbm73kpdPm5QfIUWLVooPj7e5dq4Xbt2WdsBAADKKoLdFfr166fc3FwtXbrUasvMzNSyZcvUpk0brpkDAABlGqdir9CmTRv1799fsbGxOn36tOrWrat33nlHx44d09///vfSLs+Jv7+/nnvuuQK/7IFrY+48x9x5jrnzHHPnOebOc7fq3DmMO9+dvY1kZGRo+vTpeu+993Tu3Dk1a9ZMs2bN0n33cbsHAABQthHsAAAAbIJr7AAAAGyCYAcAAGATBLtbTGZmpp5++mmFh4crMDBQbdq00fr160u7rFKxefNmORyOAn927tzp1Hf79u3q0KGDypUrp2rVqmncuHFKT093GdOO85uenq7nnntOMTExCg0NlcPh0PLlywvse+jQIcXExKh8+fIKDQ3Vo48+qjNnzrj0y8vL00svvaTatWsrICBAzZo108qVK4s1Zlnk7twNHz68wOOwQYMGLn1vl7nbs2ePxo4dq8aNGysoKEg1atTQH//4R8XHx7v05bhz5u7ccdy5OnjwoPr376/f/e53KleunCpXrqxOnTppzZo1Ln1te9wZ3FIGDBhgfHx8zOTJk82SJUtMu3btjI+Pj9m6dWtpl3bTbdq0yUgy48aNM++++67Tz5kzZ6x++/btMwEBAaZly5bmzTffNNOmTTP+/v4mJibGZUw7zu/Ro0eNJFOjRg1zzz33GElm2bJlLv1OnDhhKleubOrUqWNee+018+KLL5qKFSua5s2bm8zMTKe+zzzzjJFkRowYYZYuXWp69eplJJmVK1d6PGZZ5O7cDRs2zPj7+7sch5988olL39tl7vr27WuqVatmnnjiCfO3v/3NzJo1y1StWtUEBQWZAwcOWP047ly5O3ccd64+/fRTc99995nnn3/eLF261Lz66qumY8eORpJZsmSJ1c/Oxx3B7haya9cuI8nMnz/fart06ZKpU6eOadeuXSlWVjryg92qVauu2e/+++83YWFhJjU11Wr729/+ZiSZL774wmqz6/xmZGSYpKQkY4wxe/bsKTScjBo1ygQGBprjx49bbevXr3f5D2JCQoLx9fU1Y8aMsdry8vJMx44dzV133WVycnKKPGZZ5e7cDRs2zAQFBV13vNtp7rZt2+byj1l8fLzx9/c3gwcPtto47ly5O3ccd+7JyckxzZs3N3fffbfVZufjjmB3C5kyZYrx9vZ2CijGGDN79mwjyfzyyy+lVFnpuDLYpaWlmezsbJc+qampxsfHx0yZMsWpPTMz05QvX948/vjjVtvtML/XCid33nmn6d+/v0t7/fr1Tbdu3azX//3f/20kmYMHDzr1e//9940kp9VNd8e8FbgT7HJyclyOnyvdrnN3pVatWplWrVpZrznu3Hf13HHcue+BBx4wVatWtV7b+bjjGrtbyL59+1S/fn2XZ9ZFRUVJkvbv318KVZW+xx57TMHBwQoICFCXLl20d+9ea9uBAweUk5OjyMhIp/f4+fmpRYsW2rdvn9V2O89vYmKiTp8+7TJP0uX9v3qegoKC1LBhQ5d++duLOqYdXLx4UcHBwQoJCVFoaKjGjBnjch3n7T53xhj9+uuvqly5siSOu6K4eu7ycdwV7MKFCzp79qx++uknvfLKK/r888/VrVs3SfY/7njyxC0kKSlJYWFhLu35bSdPnrzZJZUqPz8/9e3bVz179lTlypX1448/asGCBerYsaO2b9+uli1bKikpSZIKnbetW7dar2/n+b3ePCUnJyszM1P+/v5KSkpS1apV5XA4XPpJ/zdPRRnzVhcWFqannnpKrVq1Ul5entatW6fFixfru+++0+bNm+Xjc/k/tbf73K1YsUKJiYmaOXOmJI67orh67iSOu2t58skntWTJEkmSl5eXHn74YS1a
tEiS/Y87gt0t5NKlSwUeFAEBAdb220l0dLSio6Ot1w8++KD69eunZs2aKTY2VuvWrbPmpLB5u3LObuf5vd485ffx9/d3e56KMuatbs6cOU6vBwwYoPr162vatGmKi4vTgAEDJLl/jNlx7g4fPqwxY8aoXbt2GjZsmCSOO3cVNHcSx921TJgwQf369dPJkyf1wQcfKDc3V1lZWZLsf9xxKvYWEhgYqMzMTJf2jIwMa/vtrm7duurTp482bdqk3Nxca04Km7cr5+x2nt/rzdOVfdydp6KMaUcTJ06Ul5eXNmzYYLXdrnN36tQp9erVSyEhIYqLi5O3t7ckjjt3FDZ3heG4u6xBgwbq3r27hg4dqrVr1yo9PV29e/eWMcb2xx3B7hYSFhZmLfdeKb8tPDz8ZpdUJlWvXl1ZWVm6cOGCtSxe2LxdOWe38/xeb55CQ0Ot/6cZFhamU6dOyVz1NMKr56koY9pRYGCgKlWqpOTkZKvtdpy71NRU3X///UpJSdG6detc/jcncdwV5lpzVxiOu4L169dPe/bsUXx8vO2PO4LdLaRFixaKj49XWlqaU/uuXbus7ZB+/vlnBQQEqHz58mrSpIl8fHycvlAhSVlZWdq/f7/TnN3O8xsREaEqVaq4zJMk7d6922WeLl68qEOHDjn1u3qeijKmHZ0/f15nz55VlSpVrLbbbe4yMjLUu3dvxcfHa+3atWrUqJHTdo67wl1v7grDcVew/FOlqamp9j/ubtr3b1FsO3fudLnPWkZGhqlbt65p06ZNKVZWOk6fPu3Stn//fuPr62sefPBBqy0mJsaEhYWZtLQ0q+3tt982ksznn39utd0O83utW3b89a9/NYGBgU63ddmwYYORZN58802r7cSJE4Xe1ykiIsLpvk7ujnkrKGzuLl265HRs5ZsyZYqRZD766COr7Xaau5ycHPPggw8aHx8f8+mnnxbaj+POlTtzx3FXsF9//dWlLSsry7Rq1coEBgaa8+fPG2PsfdwR7G4x/fv3t+7LtmTJEhMdHW18fHzMli1bSru0m65Lly6mZ8+e5oUXXjBLly41EyZMMOXKlTMhISHmxx9/tPp9++23xt/f3+nJEwEBAaZHjx4uY9p1ft944w0za9YsM2rUKCPJPPzww2bWrFlm1qxZJiUlxRhjzC+//GIqVapk6tSpY15//XUze/ZsU7FiRdO0aVOTkZHhNF7+Px4jR440f/vb36w7sa9YscKpX1HGLKuuN3dHjx41FSpUMKNGjTKvvfaaee2110zPnj2NJBMTE2Nyc3Odxrtd5m78+PFGkundu7fLkxHeffddqx/HnSt35o7jrmAPPfSQ6dq1q3n++eetp3Y0aNDASDILFy60+tn5uCPY3WIuXbpkJk+ebKpVq2b8/f1N69atzbp160q7rFLx2muvmaioKBMaGmp8fHxMWFiYGTJkiDly5IhL361bt5ro6GgTEBBgqlSpYsaMGVPg/9u16/zWrFnTSCrw5+jRo1a/H374wfTo0cOUK1fOVKhQwQwePNicOnXKZbzc3Fwze/ZsU7NmTePn52caN25s3nvvvQI/290xy6rrzd25c+fMkCFDTN26dU25cuWMv7+/ady4sZk9e7bJyspyGe92mbvOnTsXOm9XnyziuHPmztxx3BVs5cqVpnv37qZq1arGx8fHVKxY0XTv3t2sXr3apa9djzuHMVddEQgAAIBbEl+eAAAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQCgxB07dkwOh0PDhw8v7VKA2wrBDkCxxMfHa9KkSWrVqpVCQ0Pl6+ur0NBQtWnTRpMnT9a3335b2iUCwG2DZ8UC8IgxRjNnztTMmTOVl5enVq1aKSoqSqGhoTp//ry+//577dixQ1lZWVq0aJHGjBlT2iXjJsrOztZPP/2kkJAQhYWFlXY5wG3Dp7QLAHBrmjlzpp5//nlVr15d
K1euVPv27V36nD59Wq+++qpSU1NLoUKUJl9fXzVo0KC0ywBuO5yKBVBkP//8s1544QX5+fnp888/LzDUSdKdd96p2bNn66mnnnLZdvHiRc2ZM0ctWrRQUFCQypcvr3bt2mnlypUufTdv3iyHw6Hnn39e+/fvV69evVShQgWVK1dOnTt31vbt2wv8/JycHC1evFht27ZVcHCwypUrp5YtW2rRokXKy8sr0j4nJycrNjZWDRs2VGBgoEJCQtStWzd9+eWXTv0++ugjORwOtW3bVtnZ2U7bfvjhB5UrV07h4eE6ffq01V6rVi3VqlVLqampGjt2rCIiIhQQEKBGjRrp9ddf19UnVq68fi0+Pl6PPPKI7rzzTnl5eWnz5s1Wvy+++EI9e/ZU5cqV5e/vrzp16mjKlClKSUlx2b/vv/9eAwcOVK1ateTv768qVaqoVatWmjBhgtN+nD9/XrNmzVKTJk0UHBysO+64Q3Xq1NEjjzzidNr9WtfYJSUlacyYMapVq5b8/PxUpUoVPfzwwwWetl++fLkcDoeWL1+uTZs26Z577tEdd9yh4OBg9erVS4cOHSrw7wu4bRkAKKL/+q//MpLMoEGDPHr/uXPnTMuWLY0k06pVKzN27FgzevRoU6dOHSPJTJs2zan/pk2bjCTTq1cvExgYaLp27WqefPJJ079/f+Pl5WUCAgLM4cOHnd6TlZVl7rvvPiPJ3H333eYvf/mLGT9+vGnWrJmRZIYMGeJ2vceOHTO1atUykkzHjh3NhAkTzIgRI0xYWJhxOBxm6dKlTv3HjBljJJkpU6ZYbRcuXDANGzY0Xl5e5quvvnLqX7NmTRMWFmYiIyNN3bp1zaRJk8zYsWNNWFiYkWRGjx7t1P/o0aNGkunQoYOpUKGCiYqKMhMmTDB/+ctfzLfffmuMMeb55583kkxoaKgZOnSomTx5sunRo4eRZBo1amRSU1Ot8b777jsTEBBgAgMDzSOPPGKeeeYZM3r0aNOjRw/j6+trzp8/b4wxJi8vz0RHRxtJpl27dmbixIlmypQpZuDAgaZatWrmjTfecKlx2LBhTrX//PPPJjw83EgyXbt2Nc8884wZPHiw8fPzM35+fmbNmjVO/ZctW2Ykmb59+xofHx/Tu3dvM3nyZNOzZ08jyVSpUsWcOXPG7b9LwO4IdgCKrEuXLkaSefvttz16/7Bhw4wkM2/ePKf2S5cumfvuu884HA6zb98+qz0/2Ekyy5Ytc3rPW2+9ZSSZUaNGObU/99xzRpIZO3asycnJsdpzcnLMn/70JyPJfPzxx27V27lzZ+NwOMzKlSud2s+dO2eaN29uAgICzKlTp6z2jIwM07JlS+NwOMznn39ujDFm+PDhRpJ59tlnXcavWbOmkWTat29vMjIyrPbffvvN/O53vzOSzJYtW6z2/NAkycTGxrqM99VXX1nh69y5c07b8oPShAkTrLZJkyYVOh/JyckmNzfXGGPM999/bySZhx56yKVfbm6uSU5Odqnx6mCXHy5feOEFp/Zt27YZb29vExoaagXJK+v19vY2GzZscHrPM888U+BxBNzOCHYAiqxhw4ZGkhVarnT06FHz3HPPOf288sor1vazZ88ab29vExkZWeDY+/fvd1ntyg927du3d+mflZVlfHx8zO9//3urLTc314SGhppq1aqZ7Oxsl/ecO3fOOBwO079//+vua349/fr1K3D7xx9/bCSZ//7v/3Zqj4+PN+XLlzdVqlQx8+fPN5JMp06dnEJmvvxg9/XXX7tsyw82w4cPt9ryQ1PVqlWdgmC+hx56yEgyP/zwQ4E1t2jRwlSpUsV6nR/svvjii4In4f+XH+wGDhx4zX5X1nhlsDtx4oSRZGrUqGGysrJc3jNkyBAjybzzzjtWW/7+Dx482KX/zz//bK3mAbiML08AKFHHjh3TjBkznNpq1qypCRMmSJL27Nmj3Nxc65q5q+Vfz1XQtVORkZEubb6+vqpatarOnTtntcXHxys5
OVn16tXTCy+8UGCdgYGBbl2ftWPHDklSampqgfWeOXOmwHrr1aunt956S0OGDNGUKVNUuXJlvf/++/L29i7wc3x8fBQdHe3Sfs8990iS9u3b57KtefPm8vf3L7BmX19frVq1SqtWrXLZnpWVpTNnzui3335TpUqV9Mgjj+i1117TQw89pH79+ql79+5q37696tSp4/S+Ro0aqUWLFlq5cqWOHz+uPn36qEOHDoqMjJSfn1+B+3Wl/H3o2LGjfH19XbZ37dpV7733nvbt26ehQ4c6bSvo77569eqS5PR3D9zuCHYAiqxatWo6dOiQTp486bLtnnvusS72z8nJcfkH/LfffpN0OeDt2bOn0M9IT093aatQoUKBfX18fJSbm+vyGUeOHHEJmdf7jKvlj7V+/XqtX7++SGP16NFDwcHBSktLU//+/RUREVHo+ytXrlxg6KtWrZokFfjN4vxtBdWck5NzzX3Pr7lSpUqKiorS1q1b9eKLLyouLk7vvvuuJOnuu+/Wc889p4EDB0qSvL299dVXX2nmzJmKi4vT008/LUm64447NGzYMM2ZM0fly5cv9PPy96Gw25/ktxf05Y6C/u59fC7/E3bl3z1wu+NbsQCKLP9bsBs3bizye0NCQiRJEydOlLl8OUiBP5s2bfK4vvzP+MMf/nDNzzh69KjbY7322mvXHGvZsmVO7zPGaOjQoUpLS1PlypW1dOlSff3114V+ztmzZwsMKKdOnXKq40oOh6PQmitWrHjNeo0xqlmzpvWedu3aae3atTp37py2bdum6dOn69dff9WgQYO0YcMGq1/FihX1yiuv6MSJEzpy5IjefvttNWjQQIsWLdKoUaOuMZP/tw/5+3S1pKSkQvcVgHsIdgCKbPjw4fLx8VFcXFyRbzcRFRUlLy8vbd269QZVJzVo0EAVKlTQzp07XW45UlRt27aVpCLXO3/+fK1bt06DBw/WV199JV9fXw0aNMhaAbxaTk5Ogbdtyb99ScuWLYtU87lz53Tw4MEi1SxJ/v7+io6O1syZM/X6669LklavXl1g37p16+rxxx/Xli1bVL58+UL75cvfh2+++UY5OTku2/PDfKtWrYpcN4DLCHYAiqxOnTr6r//6L2VlZen+++8v9D5yBZ1Su/POOzV48GDt3btXs2bNKnCV6qeffnJrNa0wPj4+euKJJ5SUlKRx48bp0qVLLn2SkpL0448/XnesyMhIdezYUR999JH+8Y9/FNjnwIEDTvel27lzp6ZNm6a6devqzTffVNOmTfXKK68oMTFRw4YNc7kvXb7Y2FhlZmZar5OTk61rBB977LHr1ppv4sSJkqQRI0YUeLr8woUL2rlzp/V6+/btBc7Rr7/+KkkqV66cJOno0aP6+eefXfqdO3dOmZmZCgwMvGZdd911l+69914dO3ZMr776qtO2Xbt26f3331fFihX1hz/84do7CKBQXGMHwCPPPvusjDGaNWuW2rdvr9///vfWI8VSUlJ07Ngx6xRep06dnN67aNEiHTlyRM8++6zeffdddejQQVWrVtXJkyd16NAh7dmzRytXrlTt2rU9rm/69On67rvv9NZbb2nNmjXq2rWrIiIidPr0aR05ckTbtm3Tiy++qEaNGl13rPfff19du3bV448/rtdff11t2rRRhQoVlJCQoO+//14//PCDduzYoTvvvFMpKSkaOHCgvLy89M9//lN33HGHJOmvf/2rNm7cqLi4OL388st68sknnT4jLCxMmZmZatKkiR588EFlZ2crLi5OSUlJGj16tMscXku3bt00d+5cxcbGql69eurZs6dq166t9PR0HT9+XFu2bFGHDh20bt06SdJLL72kr776Sh07dlTt2rVVvnx5HTx4UJ9//rkqVqyokSNHSpK+++47Pfzww2rdurUaNmyo8PBwnTlzRqtXr1Z2drZ1zd21vPXWW2rfvr2mTJmiL7/8UpGRkTpx4oRWrVolLy8vLVu2zJozAB644d+7BWBr
hw8fNhMmTDDNmzc3ISEhxsfHx1SsWNFERkaaCRMmWDfMvVpmZqZ54403TLt27UxwcLDx8/Mz1atXN127djWvvPKKOXv2rNU3/3Ynzz33XIFj1axZ09SsWdOlPS8vz/zP//yP6dq1q6lYsaLx9fU14eHhpn379ubFF180v/zyi9v7mZaWZl588UXTqlUrExQUZAICAkytWrVMz549zZIlS0x6eroxxpiHH37YSDIvv/yyyxgpKSmmdu3axtfX1+zatcul/pSUFDN69GgTHh5u/Pz8TIMGDcxrr71m8vLynMYp7B5xV9u6davp37+/CQsLM76+vqZy5cqmefPmZuLEiWbPnj1Wvy+++MIMHz7cNGzY0AQHB5ty5cqZ+vXrmyeeeMIcO3bM6nfixAkTGxtroqOjTdWqVY2fn5+JiIgwMTEx5rPPPnO7xoSEBPPXv/7V1KhRw/j6+ppKlSqZPn36mN27d7v0zb/dydX3L8wnyXTu3Pma8wDcThzGFHJOAABwU9SqVUvS5VvFAEBxcI0dAACATRDsAAAAbIJgBwAAYBNcYwcAAGATrNgBAADYBMEOAADAJgh2AAAANkGwAwAAsAmCHQAAgE0Q7AAAAGyCYAcAAGATBDsAAACbINgBAADYxP8HYA/ZiZsFDREAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "mito_genes = scprep.select.get_gene_set(EBT_counts, starts_with=\"MT-\") # Get all mitochondrial genes. There are 14, FYI.\n",
    "scprep.plot.plot_gene_set_expression(EBT_counts, genes=mito_genes, percentile=90)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we see that above the 90th percentile there is a steep increase in mitochondrial RNA expression. We remove these cells from further analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "EBT_counts, sample_labels = scprep.filter.filter_gene_set_expression(\n",
    "    EBT_counts, sample_labels, genes=mito_genes, \n",
    "    percentile=90, keep_cells='below')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Transformation\n",
    "\n",
    "In scRNA-seq analysis, the data are often $\\log$-transformed, which typically requires adding a small pseudocount to avoid taking $\\log(0)$. We avoid this issue entirely by taking the square root instead. The square root function has a shape similar to that of the $\\log$ function, with the added benefit of being well defined at 0."
   ]
  },
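  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the comparison concrete, here is a small worked example. The pseudocount of 1 and the three sample values are assumptions chosen for illustration (the natural logarithm is used); they are not settings taken from this analysis.\n",
    "\n",
    "$$\\sqrt{0} = 0, \\quad \\sqrt{1} = 1, \\quad \\sqrt{100} = 10$$\n",
    "\n",
    "$$\\log(0 + 1) = 0, \\quad \\log(1 + 1) \\approx 0.69, \\quad \\log(100 + 1) \\approx 4.62$$\n",
    "\n",
    "Both transforms compress large values and map 0 to 0, but only the logarithm needs the added constant to do so."
   ]
  },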
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "EBT_counts = scprep.transform.sqrt(EBT_counts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A1BG (ENSG00000121410)</th>\n",
       "      <th>A1BG-AS1 (ENSG00000268895)</th>\n",
       "      <th>A2M (ENSG00000175899)</th>\n",
       "      <th>A2M-AS1 (ENSG00000245105)</th>\n",
       "      <th>A2ML1 (ENSG00000166535)</th>\n",
       "      <th>A4GALT (ENSG00000128274)</th>\n",
       "      <th>AAAS (ENSG00000094914)</th>\n",
       "      <th>AACS (ENSG00000081760)</th>\n",
       "      <th>AADAT (ENSG00000109576)</th>\n",
       "      <th>AAED1 (ENSG00000158122)</th>\n",
       "      <th>...</th>\n",
       "      <th>ZWILCH (ENSG00000174442)</th>\n",
       "      <th>ZWINT (ENSG00000122952)</th>\n",
       "      <th>ZXDA (ENSG00000198205)</th>\n",
       "      <th>ZXDB (ENSG00000198455)</th>\n",
       "      <th>ZXDC (ENSG00000070476)</th>\n",
       "      <th>ZYG11A (ENSG00000203995)</th>\n",
       "      <th>ZYG11B (ENSG00000162378)</th>\n",
       "      <th>ZYX (ENSG00000159840)</th>\n",
       "      <th>ZZEF1 (ENSG00000074755)</th>\n",
       "      <th>ZZZ3 (ENSG00000036549)</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>AAACCGTGCAGAAA-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACGCACCGGTAT-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.022861</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAACGCACCTATTC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.112210</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.11221</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAAGATCTCTGCTC-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.352958</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>AAAGATCTGGTACT-1_Day 00-03</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00000</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 17845 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                            A1BG (ENSG00000121410)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                     0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                     0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                     0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                     0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                     0.0   \n",
       "\n",
       "                            A1BG-AS1 (ENSG00000268895)  A2M (ENSG00000175899)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                         0.0                    0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                         0.0                    0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                         0.0                    0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                         0.0                    0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                         0.0                    0.0   \n",
       "\n",
       "                            A2M-AS1 (ENSG00000245105)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                        0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                        0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                        0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                        0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                        0.0   \n",
       "\n",
       "                            A2ML1 (ENSG00000166535)  A4GALT (ENSG00000128274)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                      0.0                       0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                      0.0                       0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                      0.0                       0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                      0.0                       0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                      0.0                       0.0   \n",
       "\n",
       "                            AAAS (ENSG00000094914)  AACS (ENSG00000081760)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                     0.0                     0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                     0.0                     0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                     0.0                     0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                     0.0                     0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                     0.0                     0.0   \n",
       "\n",
       "                            AADAT (ENSG00000109576)  AAED1 (ENSG00000158122)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                      0.0                      0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                      0.0                      0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                      0.0                      0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                      0.0                      0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                      0.0                      0.0   \n",
       "\n",
       "                            ...  ZWILCH (ENSG00000174442)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03  ...                  0.000000   \n",
       "AAACGCACCGGTAT-1_Day 00-03  ...                  1.022861   \n",
       "AAACGCACCTATTC-1_Day 00-03  ...                  0.000000   \n",
       "AAAGATCTCTGCTC-1_Day 00-03  ...                  0.000000   \n",
       "AAAGATCTGGTACT-1_Day 00-03  ...                  0.000000   \n",
       "\n",
       "                            ZWINT (ENSG00000122952)  ZXDA (ENSG00000198205)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                 0.000000                     0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                 0.000000                     0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                 1.112210                     0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                 1.352958                     0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                 0.000000                     0.0   \n",
       "\n",
       "                            ZXDB (ENSG00000198455)  ZXDC (ENSG00000070476)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                     0.0                     0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                     0.0                     0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                     0.0                     0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                     0.0                     0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                     0.0                     0.0   \n",
       "\n",
       "                            ZYG11A (ENSG00000203995)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                       0.0   \n",
       "AAACGCACCGGTAT-1_Day 00-03                       0.0   \n",
       "AAACGCACCTATTC-1_Day 00-03                       0.0   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                       0.0   \n",
       "AAAGATCTGGTACT-1_Day 00-03                       0.0   \n",
       "\n",
       "                            ZYG11B (ENSG00000162378)  ZYX (ENSG00000159840)  \\\n",
       "AAACCGTGCAGAAA-1_Day 00-03                       0.0                0.00000   \n",
       "AAACGCACCGGTAT-1_Day 00-03                       0.0                0.00000   \n",
       "AAACGCACCTATTC-1_Day 00-03                       0.0                1.11221   \n",
       "AAAGATCTCTGCTC-1_Day 00-03                       0.0                0.00000   \n",
       "AAAGATCTGGTACT-1_Day 00-03                       0.0                0.00000   \n",
       "\n",
       "                            ZZEF1 (ENSG00000074755)  ZZZ3 (ENSG00000036549)  \n",
       "AAACCGTGCAGAAA-1_Day 00-03                      0.0                     0.0  \n",
       "AAACGCACCGGTAT-1_Day 00-03                      0.0                     0.0  \n",
       "AAACGCACCTATTC-1_Day 00-03                      0.0                     0.0  \n",
       "AAAGATCTCTGCTC-1_Day 00-03                      0.0                     0.0  \n",
       "AAAGATCTGGTACT-1_Day 00-03                      0.0                     0.0  \n",
       "\n",
       "[5 rows x 17845 columns]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "EBT_counts.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(16821, 17845)"
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "EBT_counts.shape"
   ]
  },
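  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.decomposition import PCA\n",
    "\n",
    "# Reduce the cells x genes expression matrix to 5 principal components.\n",
    "X = EBT_counts.to_numpy()\n",
    "X_pca = PCA(n_components=5).fit_transform(X)\n",
    "# Rescale each component to unit standard deviation.\n",
    "X_pca /= np.std(X_pca, axis=0, keepdims=True)"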
   ]
  },
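  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [],
   "source": [
    "import anndata as ad\n",
    "\n",
    "# Index labels look like '<barcode>_<timepoint>', e.g. 'AAACCGTGCAGAAA-1_Day 00-03'.\n",
    "# reset_index() returns a new DataFrame, so EBT_counts itself is left unchanged.\n",
    "df = EBT_counts.reset_index()\n",
    "# Split each label on '_' into the cell barcode and the timepoint.\n",
    "x = list(zip(*df[\"index\"].str.split(\"_\").tolist()))\n",
    "cell_id, timepoint = list(x[0]), list(x[1])\n",
    "df = df.drop(columns=\"index\")"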
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [],
   "source": [
    "adata = ad.AnnData(X=X_pca)\n",
    "adata.obsm['X_pca'] = X_pca\n",
    "adata.obs['cell_id'] = cell_id\n",
    "adata.obs['timepoint'] = timepoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save the AnnData object (PCA coordinates plus per-cell metadata) to disk.\n",
    "adata.write_h5ad('EB_tong.h5ad')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:dmae]",
   "language": "python",
   "name": "conda-env-dmae-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
