{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rFIC6q_M8PvT"
      },
      "source": [
        "## Reaction Bench: Lesson 1\n",
        "\n",
        "### Part 1:\n",
        "\n",
        "In this lesson, I will be taking you through how our reaction bench environment works and how an RL agent might interact with the environment.\n",
        "\n",
        "The reaction bench environment is meant to as it sounds simulate a reaction, in most reaction benches the agent will have a number of reagents and the ability to play with the environmental conditions of the reaction and through doing this the agent is trying to maximize the yield of a certain desired material. For the reaction bench we use a reaction file which specifies the mechanics of a certain reaction or multiple reactions. For instance the Wurtz reaction is made up of 6 different reactions and as such is a very complicated reaction which the agent has to try and learn the mechanisms of the reaction environment it is in. For this lesson we will be using a simplified version of the wurtz reaction to introduce you to how actions affect the environment."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "X_RXYShU8PvW"
      },
      "source": [
        "Below is just some simple code that loads our desired environment"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DVyZUMdB8R5g"
      },
      "outputs": [],
      "source": [
        "!pip install \"git+https://github.com/chemgymrl/chemgymrl.git@main\""
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "rxFOGAnx8PvX"
      },
      "outputs": [],
      "source": [
        "import gymnasium as gym\n",
        "import chemistrylab\n",
        "import matplotlib,time\n",
        "import numpy as np\n",
        "from matplotlib import pyplot as plt\n",
        "from chemistrylab.util import Visualization\n",
        "from IPython.display import display,clear_output\n",
        "\n",
        "Visualization.use_mpl_dark(size=2)\n",
        "\n",
        "print(\"\\n\".join([ a for a in gym.registry.keys() if \"React\" in a or \"Extract\" in a or \"Distill\" in a]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xErvTzYm8PvY"
      },
      "source": [
        "First let's load up the environment, I highly recommend you look at the source code for the reaction bench and\n",
        "reaction, it should help provide insight into how this all works. Further the lesson on creating a custom reaction\n",
        "environment will also help give insight into the reaction mechanics."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WAWSPrvU8PvY"
      },
      "source": [
        "If you run the cell below you will see a graph appear that looks something like this:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "IYE3sz2e8PvZ",
        "scrolled": false
      },
      "outputs": [],
      "source": [
        "env = gym.make('GenWurtzReact-v2')\n",
        "env.reset()\n",
        "rgb = env.render()\n",
        "plt.imshow(rgb)\n",
        "plt.axis(\"off\")\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QP_B0MMd8PvZ"
      },
      "source": [
        "Understanding the graph above is important to understanding how the agent will have to understand the environment.\n",
        "On the left we can see the absorbance spectra of the materials in our reaction vessel, and on the right we have\n",
        "a relative scale of a number of important metrics. From left to right we have time passed, temperature, volume (solvent)\n",
        ", presure, and the quantity of reagents that we have available to use. All of this data is what the RL agent has inorder\n",
        "for it to try and optimize the reaction pathway.\n",
        "\n",
        "The reaction we are using is as follows:\n",
        "\n",
        "2 1-chlorohexane + 2 Na --> dodecane + 2 NaCl\n",
        "\n",
        "This reaction is performed in an aqueous state with ethoxyethane as the solvent.\n",
        "\n",
        "With all that out of the way let's focus our attention to the action space. For this reaction environemnt our action\n",
        "space is represented by a 5 element vector.\n",
        "\n",
        "|              | Temperature | 1-chlorohexane | 2-chlorohexane | 3-chlorohexane | Na  |\n",
        "|--------------|-------------|----------------|----------------|----------------|-----|\n",
        "| Value range: | 0-1         | 0-1            | 0-1            | 0-1            | 0-1 |\n",
        "\n",
        "As you might have noticed now, the reaction bench environment deals with a continuous action space. So what exactly do\n",
        "these continuous values represent? For the environmental conditions, in this case Volume and Temperature 0 represents a\n",
        "decrease in temperature  or volume by dT or dV (specified in the reaction bench), 1/2 represents no change, and\n",
        "1 represents an increase by dT or dV. For the chemicals, 0 represents adding no amount of that chemical to the reaction\n",
        "vessel, and 1 represents adding all of the originally available chemical (there is a negative reward if you try to add\n",
        "more chemical than is available).\n",
        "\n",
        "Below you will find a code cell that will allow you to interact with the gym environment, I highly encourage you to play around with different actions and to not the rewards as well."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "BpqZ5OoO8PvZ"
      },
      "outputs": [],
      "source": [
        "d = False\n",
        "state = env.reset()\n",
        "total_reward = 0\n",
        "\n",
        "action = np.ones(env.action_space.shape[0])\n",
        "print(f'Target: {env.target_material}')\n",
        "for i, a in enumerate(env.actions):\n",
        "    v,event = env.shelf[a[0][0][0]],a[0][0][1]\n",
        "    action[i] = float(input(f'{v}: {event.name} -> {event.other_vessel}| '))\n",
        "\n",
        "\n",
        "while not d:\n",
        "    action = np.clip(action,0,1)\n",
        "    o, r, d, *_ = env.step(action)\n",
        "    total_reward += r\n",
        "    time.sleep(0.1)\n",
        "    clear_output(wait=True)\n",
        "    print(f'reward: {r}')\n",
        "    print(f'total_reward: {total_reward}')\n",
        "    rgb = env.render()\n",
        "    plt.imshow(rgb)\n",
        "    plt.axis(\"off\")\n",
        "    plt.show()\n",
        ""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nVOzDVNn8Pva"
      },
      "source": [
        "### Part 2:\n",
        "\n",
        "\n",
        "Here I will provide instructions on how to maximize the return of this reaction environment.\n",
        "\n",
        "This is fairly simple for this task and have thus provided some script which demonstrates our strategy, and I encourage\n",
        "you to try your own strategy and see how it performs. In this case cour strategy is to maximize the temperature and add in all our reagents. For example, to make dodecane we want to add 1-chlorohexane and Na. This gives us an\n",
        "action vector of:\n",
        "\n",
        "| Temperature | 1-chlorohexane | 2-chlorohexane | 3-chlorohexane | Na  |\n",
        "|-------------|----------------|----------------|----------------|-----|\n",
        "| 1         | 1            | 0            | 0            | 1 |\n",
        "\n",
        "\n",
        "To see this in action run the following code cell:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "CCGRVWAn8Pva"
      },
      "outputs": [],
      "source": [
        "def predict(observation):\n",
        "    t = np.argmax(observation[-7:])\n",
        "    #targs = {0: \"dodecane\", 1: \"5-methylundecane\", 2: \"4-ethyldecane\", 3: \"5,6-dimethyldecane\", 4: \"4-ethyl-5-methylnonane\", 5: \"4,5-diethyloctane\", 6: \"NaCl\"}\n",
        "    actions=np.array([\n",
        "    [1,1,0,0,1],#dodecane\n",
        "    [1,1,1,0,1],#5-methylundecane\n",
        "    [1,1,0,1,1],#4-ethyldecane\n",
        "    [1,0,1,0,1],#5,6-dimethyldecane\n",
        "    [1,0,1,1,1],#4-ethyl-5-methylnonane\n",
        "    [1,0,0,1,1],#4,5-diethyloctane\n",
        "    [1,1,1,1,1],#NaCl\n",
        "    ],dtype=np.float32)\n",
        "    return actions[t]\n",
        "\n",
        "d=False\n",
        "o,*_=env.reset()\n",
        "total_reward=0\n",
        "while not d:\n",
        "    action = predict(o)\n",
        "    o, r, d, *_ = env.step(action)\n",
        "    total_reward += r\n",
        "    time.sleep(0.1)\n",
        "    clear_output(wait=True)\n",
        "    print(f\"Target: {env.target_material}\")\n",
        "    print(f\"Action: {action}\")\n",
        "    print(f'reward: {r}')\n",
        "    print(f'total_reward: {total_reward}')\n",
        "    rgb = env.render()\n",
        "    plt.imshow(rgb)\n",
        "    plt.axis(\"off\")\n",
        "    plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PsHcdCqa8Pva"
      },
      "source": [
        "Now we're done! I hope you have a better sense of how the reaction environment works and the process through which\n",
        "an RL agent must go through to learn the environment."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.13 |Anaconda, Inc.| (default, Feb 23 2021, 21:15:04) \n[GCC 7.3.0]"
    },
    "vscode": {
      "interpreter": {
        "hash": "928df2993789dc54629220469d2aa2c5bde6e75786cdddb015342ca5eb5b2bb1"
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}