# rf-autoencoders
Random Forest Autoencoders for Guided Representation Learning

### This project was developed on a Linux machine. Compatibility with other operating systems is not guaranteed.

### [Link to the datasets](https://udemontreal-my.sharepoint.com/:f:/g/personal/adrien_aumon_umontreal_ca/EuAs4d04PLxEuP3c5qBevKABNmcNw0_9HkQTeDo5wKRaEQ?e=uZkIR9)

## Installation

Clone the repository and create an environment for the project.

On a cluster, first load Anaconda:
```sh
module purge
module load anaconda/3
```

Then create and activate the conda environment:
```sh
conda create -n rfae python=3.9.20
conda activate rfae
```


Alternatively, use `venv`:

```sh
module load python/3.9.20
python3 -m venv .venv
source .venv/bin/activate
```

Install the requirements, then install this package in editable mode:
```sh
pip install -r requirements.txt
pip install -e .
```

To use notebooks, you can register this environment as a Jupyter kernel:

```sh
conda install -c anaconda ipykernel
python -m ipykernel install --user --name=rfae
```

Make sure to add your data and log directories to the `.env` file, following the example in `.env_example`. Add your WandB credentials there as well to track training losses.
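
As a minimal sketch of that step, the shell function below copies the template to `.env` only when no `.env` exists yet, so existing settings are never overwritten. The name `init_env` is hypothetical and not part of this repository; the entries to fill in afterwards are the ones listed in `.env_example`.

```sh
# Hypothetical helper (not part of this repo): create .env from the template
# without overwriting an existing file.
init_env() {
    template="${1:-.env_example}"   # template to copy from
    target="${2:-.env}"             # file you will edit afterwards
    if [ -f "$target" ]; then
        echo "$target already exists, leaving it untouched"
    elif [ -f "$template" ]; then
        cp "$template" "$target"
        echo "Created $target from $template; edit it to set your paths and credentials"
    else
        echo "Template $template not found" >&2
        return 1
    fi
}
```

Run `init_env` from the repository root, then edit `.env` by hand.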

## Run experiments

If you have already created the conda environment, load it with:
```sh
module purge
module load anaconda/3
conda activate rfae
```

To run a test with default parameters:
```sh
python runner/run_model_scores.py
```

**To run the sweeps for the main results in our NeurIPS submission** (10 repetitions on 20 datasets for 16 models, including an ablation of the RF-AE loss-balancing hyperparameter):
```sh
python runner/run_model_scores.py experiment=neurips_sweep_datasets
```
**To run the noisy tree experiments from the Appendix:**
```sh
python runner/run_noisy_tree.py
```

