{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Inversion Demonstration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook will demonstrate a few variations on the inversion operation, in which data returned from automunge is inverted to recover its original form prior to transofmations. This type of operation may be useful for instance to recover the original form of labels after an inference operation, or to otherwise recover original form of train or test data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Automunge is available now for pip install:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install Automunge"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or to upgrade (we currently roll out upgrades pretty frequently):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install Automunge --upgrade"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once installed, run this in a local session to initialize:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from Automunge import *\n",
    "am = AutoMunge()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll demonstrate feature importance on the Titanic set, a common benchmark."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "#titanic set\n",
    "df_train = pd.read_csv('train.csv')\n",
    "df_test = pd.read_csv('test.csv')\n",
    "\n",
    "#labels and ID columns\n",
    "labels_column = 'Survived'\n",
    "trainID_column = 'PassengerId'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>A/5 21171</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>PC 17599</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>STON/O2. 3101282</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>113803</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>C123</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Allen, Mr. William Henry</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>373450</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived  Pclass  \\\n",
       "0            1         0       3   \n",
       "1            2         1       1   \n",
       "2            3         1       3   \n",
       "3            4         1       1   \n",
       "4            5         0       3   \n",
       "\n",
       "                                                Name     Sex   Age  SibSp  \\\n",
       "0                            Braund, Mr. Owen Harris    male  22.0      1   \n",
       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
       "2                             Heikkinen, Miss. Laina  female  26.0      0   \n",
       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
       "4                           Allen, Mr. William Henry    male  35.0      0   \n",
       "\n",
       "   Parch            Ticket     Fare Cabin Embarked  \n",
       "0      0         A/5 21171   7.2500   NaN        S  \n",
       "1      0          PC 17599  71.2833   C85        C  \n",
       "2      0  STON/O2. 3101282   7.9250   NaN        S  \n",
       "3      0            113803  53.1000  C123        S  \n",
       "4      0            373450   8.0500   NaN        S  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_train.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to demonstrate inversion, first we'll apply a forward pass of transformations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "train, train_ID, labels, \\\n",
    "val, val_ID, val_labels, \\\n",
    "test, test_ID, test_labels, \\\n",
    "postprocess_dict \\\n",
    "= am.automunge(df_train,\n",
    "               labels_column = labels_column,\n",
    "               trainID_column = trainID_column,\n",
    "               shuffletrain = False,\n",
    "               printstatus = False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Sex_bnry</th>\n",
       "      <th>Age_nmbr</th>\n",
       "      <th>SibSp_nmbr</th>\n",
       "      <th>Parch_nmbr</th>\n",
       "      <th>Fare_nmbr</th>\n",
       "      <th>Pclass_NArw</th>\n",
       "      <th>Pclass_1010_0</th>\n",
       "      <th>Pclass_1010_1</th>\n",
       "      <th>Name_NArw</th>\n",
       "      <th>Name_hash_0</th>\n",
       "      <th>...</th>\n",
       "      <th>Cabin_1010_1</th>\n",
       "      <th>Cabin_1010_2</th>\n",
       "      <th>Cabin_1010_3</th>\n",
       "      <th>Cabin_1010_4</th>\n",
       "      <th>Cabin_1010_5</th>\n",
       "      <th>Cabin_1010_6</th>\n",
       "      <th>Cabin_1010_7</th>\n",
       "      <th>Embarked_NArw</th>\n",
       "      <th>Embarked_1010_0</th>\n",
       "      <th>Embarked_1010_1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>-0.592148</td>\n",
       "      <td>0.432550</td>\n",
       "      <td>-0.473408</td>\n",
       "      <td>-0.502163</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>179</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>0.638430</td>\n",
       "      <td>0.432550</td>\n",
       "      <td>-0.473408</td>\n",
       "      <td>0.786404</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>432</td>\n",
       "      <td>...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>-0.284503</td>\n",
       "      <td>-0.474279</td>\n",
       "      <td>-0.473408</td>\n",
       "      <td>-0.488580</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>964</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>0.407697</td>\n",
       "      <td>0.432550</td>\n",
       "      <td>-0.473408</td>\n",
       "      <td>0.420494</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>711</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>0.407697</td>\n",
       "      <td>-0.474279</td>\n",
       "      <td>-0.473408</td>\n",
       "      <td>-0.486064</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>832</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 44 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   Sex_bnry  Age_nmbr  SibSp_nmbr  Parch_nmbr  Fare_nmbr  Pclass_NArw  \\\n",
       "0         1 -0.592148    0.432550   -0.473408  -0.502163            0   \n",
       "1         0  0.638430    0.432550   -0.473408   0.786404            0   \n",
       "2         0 -0.284503   -0.474279   -0.473408  -0.488580            0   \n",
       "3         0  0.407697    0.432550   -0.473408   0.420494            0   \n",
       "4         1  0.407697   -0.474279   -0.473408  -0.486064            0   \n",
       "\n",
       "   Pclass_1010_0  Pclass_1010_1  Name_NArw  Name_hash_0  ...  Cabin_1010_1  \\\n",
       "0              1              0          0          179  ...             0   \n",
       "1              0              0          0          432  ...             1   \n",
       "2              1              0          0          964  ...             0   \n",
       "3              0              0          0          711  ...             0   \n",
       "4              1              0          0          832  ...             0   \n",
       "\n",
       "   Cabin_1010_2  Cabin_1010_3  Cabin_1010_4  Cabin_1010_5  Cabin_1010_6  \\\n",
       "0             0             0             0             0             0   \n",
       "1             0             1             0             0             0   \n",
       "2             0             0             0             0             0   \n",
       "3             1             1             0             1             1   \n",
       "4             0             0             0             0             0   \n",
       "\n",
       "   Cabin_1010_7  Embarked_NArw  Embarked_1010_0  Embarked_1010_1  \n",
       "0             0              0                1                0  \n",
       "1             1              0                0                0  \n",
       "2             0              0                1                0  \n",
       "3             1              0                1                0  \n",
       "4             0              0                1                0  \n",
       "\n",
       "[5 rows x 44 columns]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    0\n",
       "1    1\n",
       "2    1\n",
       "3    1\n",
       "4    0\n",
       "Name: Survived_ordl, dtype: uint8"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "labels.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that hashing is performed for unbounded categoric sets, such as Name and Cabin. We'll see below that inversion is not supported for those specific transforms."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now when it comes time to invert, we can pass the postprocess_dict with the target set to postmunge and apply the inversion parameter.\n",
    "\n",
    "The inversion parameter can be passed as:\n",
    "- False for no inversion\n",
    "- 'test' to invert a train or test set\n",
    "- 'labels' to invert a labels set\n",
    "- 'denselabels' to redundantly invert all configurations of a labels set (this one is kind of esoteric)\n",
    "- or a list of column headers to only invert a subset of columns\n",
    "\n",
    "Note that as kind of a quark postmunge only returns three sets with inversion."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_______________\n",
      "Begin Postmunge processing\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Sex\n",
      "Inversion path selected based on returned column  Sex_bnry\n",
      "With full recovery.\n",
      "Recovered source column:  Sex\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Age\n",
      "Inversion path selected based on returned column  Age_nmbr\n",
      "With full recovery.\n",
      "Recovered source column:  Age\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  SibSp\n",
      "Inversion path selected based on returned column  SibSp_nmbr\n",
      "With full recovery.\n",
      "Recovered source column:  SibSp\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Parch\n",
      "Inversion path selected based on returned column  Parch_nmbr\n",
      "With full recovery.\n",
      "Recovered source column:  Parch\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Fare\n",
      "Inversion path selected based on returned column  Fare_nmbr\n",
      "With full recovery.\n",
      "Recovered source column:  Fare\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Pclass\n",
      "Inversion path selected based on returned column  Pclass_1010_0\n",
      "With full recovery.\n",
      "Recovered source column:  Pclass\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Name\n",
      "No inversion path available for source column:  Name\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Ticket\n",
      "No inversion path available for source column:  Ticket\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Cabin\n",
      "Inversion path selected based on returned column  Cabin_1010_0\n",
      "With full recovery.\n",
      "Recovered source column:  Cabin\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Embarked\n",
      "Inversion path selected based on returned column  Embarked_1010_0\n",
      "With full recovery.\n",
      "Recovered source column:  Embarked\n",
      "\n",
      "Inversion succeeded in recovering original form for columns:\n",
      "['Sex', 'Age', 'SibSp', 'Parch', 'Fare', 'Pclass', 'Cabin', 'Embarked']\n",
      "\n"
     ]
    }
   ],
   "source": [
    "df_invert, recovered_list, inversion_info_dict = \\\n",
    "am.postmunge(postprocess_dict, train, \n",
    "             inversion='test', \\\n",
    "             printstatus=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1.000000e+00</td>\n",
       "      <td>0.0</td>\n",
       "      <td>7.250002</td>\n",
       "      <td>3</td>\n",
       "      <td>A10</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1.000000e+00</td>\n",
       "      <td>0.0</td>\n",
       "      <td>71.283295</td>\n",
       "      <td>1</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>5.960464e-08</td>\n",
       "      <td>0.0</td>\n",
       "      <td>7.925001</td>\n",
       "      <td>3</td>\n",
       "      <td>A10</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1.000000e+00</td>\n",
       "      <td>0.0</td>\n",
       "      <td>53.099998</td>\n",
       "      <td>1</td>\n",
       "      <td>C123</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>5.960464e-08</td>\n",
       "      <td>0.0</td>\n",
       "      <td>8.050001</td>\n",
       "      <td>3</td>\n",
       "      <td>A10</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Sex   Age         SibSp  Parch       Fare Pclass Cabin Embarked\n",
       "0    male  22.0  1.000000e+00    0.0   7.250002      3   A10        S\n",
       "1  female  38.0  1.000000e+00    0.0  71.283295      1   C85        C\n",
       "2  female  26.0  5.960464e-08    0.0   7.925001      3   A10        S\n",
       "3  female  35.0  1.000000e+00    0.0  53.099998      1  C123        S\n",
       "4    male  35.0  5.960464e-08    0.0   8.050001      3   A10        S"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_invert.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the columns without successful inversion (such as 'Name' and 'Cabin') are not included in the returned set."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, we can invert a labels set. This might be useful for the sets returned from an inference operation. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_______________\n",
      "Begin Postmunge processing\n",
      "\n",
      "Performing inversion recovery of original columns for label set.\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Survived\n",
      "Inversion path selected based on returned column  Survived_ordl\n",
      "With full recovery.\n",
      "Recovered source column:  Survived\n",
      "\n",
      "Inversion succeeded in recovering original form for columns:\n",
      "['Survived']\n",
      "\n"
     ]
    }
   ],
   "source": [
    "df_invert, recovered_list, inversion_info_dict = \\\n",
    "am.postmunge(postprocess_dict, labels, \n",
    "             inversion='labels', \\\n",
    "             printstatus=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Survived</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Survived\n",
       "0       0.0\n",
       "1       1.0\n",
       "2       1.0\n",
       "3       1.0\n",
       "4       0.0"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_invert.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Will demonsrtate two more quick things. \n",
    "\n",
    "- When we only want to invert a subset of the columns we can pass those targets as a list.\n",
    "- Note that the targets for inversion can also be passed as numpy arrays instead of dataframes. So column headers are not required, but the correct order of transformed columns must be intact."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "_______________\n",
      "Begin Postmunge processing\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Age\n",
      "Inversion path selected based on returned column  Age_nmbr\n",
      "With full recovery.\n",
      "Recovered source column:  Age\n",
      "\n",
      "Evaluating inversion paths for columns derived from:  Fare\n",
      "Inversion path selected based on returned column  Fare_nmbr\n",
      "With full recovery.\n",
      "Recovered source column:  Fare\n",
      "\n",
      "Inversion succeeded in recovering original form for columns:\n",
      "['Age', 'Fare']\n",
      "\n"
     ]
    }
   ],
   "source": [
    "np_train = train.to_numpy()\n",
    "\n",
    "df_invert, recovered_list, inversion_info_dict = \\\n",
    "am.postmunge(postprocess_dict, np_train, \n",
    "             inversion= ['Age', 'Fare'], \\\n",
    "             printstatus=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>22.0</td>\n",
       "      <td>7.250002</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>38.0</td>\n",
       "      <td>71.283295</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>26.0</td>\n",
       "      <td>7.925001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>35.0</td>\n",
       "      <td>53.099998</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>35.0</td>\n",
       "      <td>8.050001</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Age       Fare\n",
       "0  22.0   7.250002\n",
       "1  38.0  71.283295\n",
       "2  26.0   7.925001\n",
       "3  35.0  53.099998\n",
       "4  35.0   8.050001"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_invert[['Age', 'Fare']].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Age</th>\n",
       "      <th>Fare</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>22.0</td>\n",
       "      <td>7.2500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>38.0</td>\n",
       "      <td>71.2833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>26.0</td>\n",
       "      <td>7.9250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>35.0</td>\n",
       "      <td>53.1000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>35.0</td>\n",
       "      <td>8.0500</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Age     Fare\n",
       "0  22.0   7.2500\n",
       "1  38.0  71.2833\n",
       "2  26.0   7.9250\n",
       "3  35.0  53.1000\n",
       "4  35.0   8.0500"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#For comparison this is the original data\n",
    "\n",
    "df_train[['Age', 'Fare']].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
