## Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration
#### Part of Supplementary Material for ICLR 2021 Submission \#1552

This codebase is for ICLR 2021 Submission \#1552, **Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration**.
It includes the implementation for the exploration method with Drop-Bottleneck and the DMLab navigation tasks.

### Requirements
* Ubuntu 16.04 machine
* sudo privileges (for installing dependencies)
* CUDA-compatible GPUs
* [Anaconda](https://docs.anaconda.com/anaconda/install/)

### Environment setup
1. Install the dependencies for the DMLab pip package by following the [instructions](https://github.com/deepmind/lab/tree/master/python/pip_package).
2. From the main directory, create the conda environment and activate it:
    ```
    conda env create -f environment.yml
    conda activate db-expl
    ```
    Make sure that the environment variable `CONDA_PREFIX` is properly set.
3. Clone [DMLab](https://github.com/deepmind/lab), apply the required patches, then build and install it:
    ```
    git clone https://github.com/deepmind/lab
    cd lab
    git checkout 7b851dcbf6171fa184bf8a25bf2c87fe6d3f5380
    git apply ../third_party/dmlab/dmlab_min_goal_distance.patch
    git apply ../third_party/dmlab/dmlab_conda.patch
    bash ./build.sh
    ```
    `build.sh` will try to install some required packages by running sudo commands.
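
One way to confirm that `CONDA_PREFIX` is set before building DMLab is a short Python check. This is a minimal sketch and not part of this codebase: the helper name `conda_prefix_problem` is hypothetical, and it only verifies that the variable is set and points to an existing directory.

```python
import os


def conda_prefix_problem(env=None):
    """Return a description of the problem with CONDA_PREFIX, or None if it looks fine.

    Hypothetical helper for illustration; it is not provided by this codebase.
    """
    env = os.environ if env is None else env
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return "CONDA_PREFIX is not set; run `conda activate db-expl` first."
    if not os.path.isdir(prefix):
        return f"CONDA_PREFIX points to a missing directory: {prefix}"
    return None


problem = conda_prefix_problem()
if problem is None:
    print("CONDA_PREFIX looks fine:", os.environ["CONDA_PREFIX"])
else:
    print(problem)
```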


### Training

To reproduce Table 1 of the main paper, run the following training commands:

| Command                                                                                    | Reward condition  | Noise setting  | Average Reward Sum  |
|:------------------------------------------------------------------------------------------ |:-----------------:|:--------------:|:-------------------:|
| `python scripts/launcher_script_rlb.py --scenario sparse --noise_type image_action`        |       Sparse      |  Image Action  |         30.4        |
| `python scripts/launcher_script_rlb.py --scenario sparse --noise_type noise`               |       Sparse      |      Noise     |         32.7        |
| `python scripts/launcher_script_rlb.py --scenario sparse --noise_type noise_action`        |       Sparse      |  Noise Action  |         30.6        |
| `python scripts/launcher_script_rlb.py --scenario verysparse --noise_type image_action`    |    Very Sparse    |  Image Action  |         28.8        |
| `python scripts/launcher_script_rlb.py --scenario verysparse --noise_type noise`           |    Very Sparse    |      Noise     |         29.1        |
| `python scripts/launcher_script_rlb.py --scenario verysparse --noise_type noise_action`    |    Very Sparse    |  Noise Action  |         26.9        |

Note that the results in Table 1 are test reward sums averaged over 30 runs for each setting. The actual images used for the "Image Action" tasks are not included in this codebase because we do not hold the right to distribute them.
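
The six commands differ only in their `--scenario` and `--noise_type` values, so they can be generated in a loop. The following is a minimal sketch, not a script shipped with this codebase: the helper names `build_commands` and `run_all` are hypothetical, and it assumes the commands are launched from the main directory and run one after another.

```python
import itertools
import subprocess
import sys

# Flag values taken from the training commands listed in this README.
SCENARIOS = ("sparse", "verysparse")
NOISE_TYPES = ("image_action", "noise", "noise_action")


def build_commands(python=sys.executable):
    """Return the six training commands as argument lists."""
    return [
        [python, "scripts/launcher_script_rlb.py",
         "--scenario", scenario, "--noise_type", noise_type]
        for scenario, noise_type in itertools.product(SCENARIOS, NOISE_TYPES)
    ]


def run_all(dry_run=True):
    """Print each command; also execute it when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing run.
            subprocess.run(cmd, check=True)
```

Calling `run_all()` only prints the commands, while `run_all(dry_run=False)` executes them sequentially. The `image_action` runs presumably require the images that are not distributed with this codebase.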

### Evaluation
* Each training command creates an experiment directory under the `exp` directory.
* Each experiment directory contains a `reward_test.csv` file, which lists the episode reward sums in the test environments at each evaluation step.
* The final test reward sum is the value at step 20M (20044800) in each `reward_test.csv` file, so the results can be gathered with a shell command such as
    ```
    grep "20044800," exp/*/reward_test.csv
    ```
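
To average the final values across experiment directories, the same lookup can be done in Python. This is a minimal sketch under a stated assumption: the layout of `reward_test.csv` is not documented here, so the code assumes that the first column is the step and the second column is the reward sum. The function names are hypothetical; adjust the column index if the actual files differ.

```python
import csv
import glob
import statistics

FINAL_STEP = "20044800"  # step value used in the grep command above


def final_reward(lines, step=FINAL_STEP):
    """Return the reward on the row whose first column equals `step`, else None.

    ASSUMPTION: column 0 is the step and column 1 is the reward sum.
    """
    for row in csv.reader(lines):
        if len(row) >= 2 and row[0].strip() == step:
            return float(row[1])
    return None


def mean_final_reward(pattern="exp/*/reward_test.csv", step=FINAL_STEP):
    """Average the final rewards over all CSV files matching `pattern`."""
    rewards = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            reward = final_reward(f, step)
        if reward is not None:
            rewards.append(reward)
    # Return None when no file contains the final step yet.
    return statistics.mean(rewards) if rewards else None
```

Since `exp/*` matches every experiment directory, the default pattern mixes all settings together; passing a narrower glob pattern restricts the average to the runs of one setting.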

### Acknowledgments
This source code is based on the official implementation of [Episodic Curiosity Through Reachability](https://github.com/google-research/episodic-curiosity).
