# InvertNAS
This repository contains the official code for our paper:
"InvertNAS: An Invertible Architecture Performance Predictor for Neural Architecture Search."
We have released the AlignFlow and GVAE weights as pretrained models.
Our Invertible Neural Network (INN) layer is adapted from [jaekookang/invertible_neural_networks](https://github.com/jaekookang/invertible_neural_networks).  

## 1. Software environment
### Required Packages
- python==3.8
- tensorflow==2.10.0
- spektral==1.2.0
- wget
- nats_bench
- matplotlib
### Conda Environment (Example)

To set up the required environment using Conda, you can run the following commands:
```
conda create -n tf2 python==3.8
conda activate tf2
pip install tensorflow==2.10
pip install spektral==1.2.0
pip install wget
pip install nats_bench
pip install matplotlib
```
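
After installation, a quick sanity check can confirm that the pinned versions are actually present. The snippet below is a minimal sketch, not part of this repository: the helper `check_packages` is hypothetical, and it assumes the distribution names match the names passed to `pip install` above.

```python
from importlib import metadata

# Pins taken from the "Required Packages" list; None means "any version".
REQUIRED = {
    "tensorflow": "2.10",
    "spektral": "1.2.0",
    "wget": None,
    "nats_bench": None,
    "matplotlib": None,
}


def check_packages(required, get_version=metadata.version):
    """Return a list of problems; an empty list means everything matched."""
    problems = []
    for name, pin in required.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if pin is not None:
            # Compare only as many version components as the pin specifies.
            wanted = pin.split(".")
            if installed.split(".")[: len(wanted)] != wanted:
                problems.append(f"{name}: found {installed}, expected {pin}")
    return problems


if __name__ == "__main__":
    for line in check_packages(REQUIRED) or ["all required packages found"]:
        print(line)
```

Run it inside the activated `tf2` environment; any mismatch is listed one per line.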
## 2. Data preprocessing
- The training script will download the preprocessed data automatically. This step can be skipped.
- We maintain a local copy of the preprocessed data on our institutional storage system.
### NAS-Bench-101
- Data preprocessing for NAS-Bench-101 requires a tf1.x environment. We use Conda as an example.
- Set up the tf1.x environment (Conda shown as an example)
  ```
  conda create -n tf1.15 python==3.7
  conda activate tf1.15
  pip install tensorflow==1.15
  ```
- Install NAS-Bench-101 https://github.com/google-research/nasbench
  ```bash
  git clone https://github.com/google-research/nasbench
  cd nasbench
  pip install -e .
  ```
- Query the nb101 data (run this step in the tf1.x environment)
  ```bash
  cd datasets
  python query_nb101.py
  ```
- Transform to a spektral graph dataset (run this step in the tf2.x environment)
  ```bash
  export PYTHONPATH=$PWD
  python datasets/nb101_dataset.py
  ```
- The preprocessed data will be saved in `NasBench101Dataset`
### NAS-Bench-201
- Install NAS-Bench-201 (`pip install nats_bench`)
- Follow [these instructions](https://github.com/D-X-Y/NATS-Bench#preparation-and-download) to download the benchmark file and save it under `$TORCH_HOME`, which is usually `~/.torch/`. The benchmark file we used is `NATS-tss-v1_0-3ffb9-simple`.
- Query the nb201 data
  ```bash
  cd datasets
  python query_nb201.py
  ```
- Transform to a spektral graph dataset
  ```bash
  cd datasets
  python nb201_dataset.py 
  ```
- The preprocessed data will be saved in `NasBench201Dataset`
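
A missing or misplaced benchmark file is a common reason the query step fails. The following is a minimal sketch for checking the expected location before running `query_nb201.py`; the helper `benchmark_path` is hypothetical, and the fallback to `~/.torch` simply mirrors the default described above.

```python
import os
from pathlib import Path

BENCHMARK_NAME = "NATS-tss-v1_0-3ffb9-simple"  # name given in this README


def benchmark_path(env=None):
    """Return the expected benchmark location: $TORCH_HOME, else ~/.torch."""
    env = os.environ if env is None else env
    root = env.get("TORCH_HOME") or str(Path.home() / ".torch")
    return Path(root).expanduser() / BENCHMARK_NAME


if __name__ == "__main__":
    path = benchmark_path()
    print(f"{path}: {'found' if path.exists() else 'missing'}")
```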
### NAS-Bench-301
- Install the related packages
```
pip install torch torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric
pip install nasbench301
pip install pathvalidate
pip install ConfigSpace==0.4.21
```
- To adjust the amount of training data, modify the line `sample_archs = [config_space.sample_configuration() for i in range(20000)]` in `create_random_data.py`
- Sample training data
```
python datasets/nasbench301/create_random_data.py
```
## 3. Run experiment
We provide three experiments in our study:
- InvertNAS, which represents our main results.
- InvertNAS with Random Initial Selection, a variant used for ablation studies to evaluate the impact of initial sample selection.
- InvertNAS(single) with Random Initial Selection, a simplified version of our method used for further analysis, which represents the foundational design of our method.


### InvertNAS
This experiment uses InvertNAS with AlignFlow to select the initial samples.
The pretrained GVAE weights are located in the following directories:  
- `logs/phase1_nb101_CE_64/` for **NB101**  
- `logs/phase1_nb201_CE_64/` for **NB201**  
- `logs/phase1_nb301_CE_64_directed/` for **NB301**

You can skip the GVAE training step if you use these pretrained weights.
The pretrained AlignFlow weights are located in `flow2flow_weight/`.

#### - Pre-train (How to train GVAE from scratch)
- Modify the line `train_phase = ` to `train_phase = [1, 0]` in `trainGAE_two_phase.py` and `trainGAE_two_phase_301.py`
- For NAS-Bench-101: `python trainGAE_two_phase.py --dataset nb101`
- For NAS-Bench-201: `python trainGAE_two_phase.py`
- For NAS-Bench-301: `python trainGAE_two_phase_301.py`
- Then move the pre-trained model to the path to match the following code in `trainGAE_two_phase.py`
```python
if dataset_name == 'nb101':
    pretrained_weight = 'logs/phase1_nb101_CE_64/modelGAE_weights_phase1'
else:
    pretrained_weight = 'logs/phase1_nb201_CE_64/modelGAE_weights_phase1'
```
- For NAS-Bench-301: move the pre-trained model to `logs/phase1_nb301_CE_64_directed/modelGAE_weights_phase1`
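
Moving the weights by hand is easy to get wrong because a TensorFlow checkpoint usually consists of several files sharing one prefix. The sketch below copies all of them into the directories listed above. It is an illustration only: `install_weights` is a hypothetical helper, and it assumes the freshly trained files already use the `modelGAE_weights_phase1` prefix.

```python
import shutil
from pathlib import Path

# Target checkpoint prefixes as listed in this README.
TARGETS = {
    "nb101": "logs/phase1_nb101_CE_64/modelGAE_weights_phase1",
    "nb201": "logs/phase1_nb201_CE_64/modelGAE_weights_phase1",
    "nb301": "logs/phase1_nb301_CE_64_directed/modelGAE_weights_phase1",
}


def install_weights(src_dir, dataset, root="."):
    """Copy every file whose name starts with the checkpoint prefix."""
    target = Path(root) / TARGETS[dataset]
    target.parent.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(Path(src_dir).glob(target.name + "*")):
        if f.is_file():
            shutil.copy2(f, target.parent / f.name)
            copied.append(f.name)
    return copied
```

For example, `install_weights("path/to/your/run", "nb101")` would populate `logs/phase1_nb101_CE_64/`, where `path/to/your/run` is a placeholder for wherever the training run wrote its output.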
#### - Prepare zero-cost dataset 
- Clone and install `NASLib` repo ([NASLib](https://github.com/automl/NASLib))
- Put our modified code (located in `NASLib/`) into the corresponding directories of the official `NASLib` repository.
- Generate them by executing `NASLib/tutorial/gen_nb101_by_para.py` for nb101, `NASLib/tutorial/gen_nb201_by_para.py` for nb201, and `NASLib/tutorial/gen_nb301_by_para.py` for nb301, or directly use the ones we provide in `/init_arch`
#### - Run experiment 
- Run `python trainGAE_{dataset}_{mode}.py`
- `{dataset}` could be `nb101` / `nb201` / `nb301`
- `{mode}` could be `align_zc` for InvertNAS, `align_zc_single` for InvertNAS(single) and `zc` for InvertNAS without AlignFlow
- For `nb201`, add `--dataset {dataset_name}` to swap among `cifar10-valid` / `cifar100` / `ImageNet16-120`
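
The naming scheme above can be expressed as a small helper that assembles the command and rejects typos early. This is a sketch under assumptions: `build_command` is hypothetical, and it does not verify that a script exists for every dataset and mode combination.

```python
DATASETS = ("nb101", "nb201", "nb301")
MODES = ("align_zc", "align_zc_single", "zc")
NB201_TASKS = ("cifar10-valid", "cifar100", "ImageNet16-120")


def build_command(dataset, mode, task=None):
    """Return the argv list for one run, rejecting unknown values."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; choose from {DATASETS}")
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; choose from {MODES}")
    cmd = ["python", f"trainGAE_{dataset}_{mode}.py"]
    if task is not None:
        # The README only describes --dataset for the nb201 scripts.
        if dataset != "nb201":
            raise ValueError("--dataset is only described for nb201")
        if task not in NB201_TASKS:
            raise ValueError(f"unknown task {task!r}; choose from {NB201_TASKS}")
        cmd += ["--dataset", task]
    return cmd


print(" ".join(build_command("nb201", "align_zc", task="cifar100")))
# prints: python trainGAE_nb201_align_zc.py --dataset cifar100
```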
#### - Run experiment for multiple runs
```bash
#!/bin/bash
for seed in {0..9}; do
  python trainGAE_{dataset}_{mode}.py --seed $seed
done
```
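
The bash loop continues with the next seed even if one run crashes. If you prefer the sweep to stop at the first failure, a Python driver is one option. The sketch below is an assumption-laden illustration: `run_seeds` is a hypothetical helper, and the script name is passed in rather than taken from the repository.

```python
import subprocess
import sys


def run_seeds(script, seeds=range(10), extra_args=(), run=subprocess.run):
    """Run `script` once per seed and stop at the first failing run."""
    commands = []
    for seed in seeds:
        cmd = [sys.executable, script, "--seed", str(seed), *extra_args]
        # check=True makes subprocess.run raise CalledProcessError
        # when the training script exits with a non-zero status.
        run(cmd, check=True)
        commands.append(cmd)
    return commands
```

A call such as `run_seeds("trainGAE_nb201_align_zc.py", extra_args=["--dataset", "cifar100"])` mirrors the loop above for seeds 0 to 9, using the interpreter of the active environment.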

### InvertNAS with Random Initial Selection
This experiment uses InvertNAS with randomly selected initial samples.
The pretrained GVAE weights are located in the following directories:  
- `logs/phase1_nb101_CE_64/` for **NB101**  
- `logs/phase1_nb201_CE_64/` for **NB201**  
- `logs/phase1_nb301_CE_64_directed/` for **NB301**

You can skip the GVAE training step if you use these pretrained weights.

#### - Pre-train (How to train GVAE from scratch)
- Modify the line `train_phase = ` to `train_phase = [1, 0]` in `trainGAE_two_phase.py` and `trainGAE_two_phase_301.py`
- For NAS-Bench-101: `python trainGAE_two_phase.py --dataset nb101`
- For NAS-Bench-201: `python trainGAE_two_phase.py`
- For NAS-Bench-301: `python trainGAE_two_phase_301.py`
- Then move the pre-trained model to the path to match the following code in `trainGAE_two_phase.py`
```python
if dataset_name == 'nb101':
    pretrained_weight = 'logs/phase1_nb101_CE_64/modelGAE_weights_phase1'
else:
    pretrained_weight = 'logs/phase1_nb201_CE_64/modelGAE_weights_phase1'
```
- For NAS-Bench-301: move the pre-trained model to `logs/phase1_nb301_CE_64_directed/modelGAE_weights_phase1`
#### - Run experiment for multiple runs
- Modify line `train_phase = ` to `train_phase = [0, 1]` in `trainGAE_ensemble.py` and `trainGAE_ensemble_301.py`
- For NB101 and NB201, run the following shell script; `--dataset` can be `nb101`, `cifar10-valid`, `cifar100`, or `ImageNet16-120`:
```
#!/bin/bash

for seed in {0..9}
do
    python trainGAE_ensemble.py --seed $seed --dataset nb101
done
```

- For NB301, run the shell script:
```
#!/bin/bash

for seed in {0..9}
do
    python trainGAE_ensemble_301.py --seed $seed 
done
```


### InvertNAS(single) with Random Initial Selection
This experiment uses InvertNAS(single) with randomly selected initial samples.
The pretrained GVAE weights are located in the following directories:  
- `logs/phase1_nb101_CE_64/` for **NB101**  
- `logs/phase1_nb201_CE_64/` for **NB201**  
- `logs/phase1_nb301_CE_64_directed/` for **NB301**

You can skip the GVAE training step if you use these pretrained weights.

#### - Pre-train (How to train GVAE from scratch)
- Modify the line `train_phase = ` to `train_phase = [1, 0]` in `trainGAE_two_phase.py` and `trainGAE_two_phase_301.py`
- For NAS-Bench-101: `python trainGAE_two_phase.py --dataset nb101`
- For NAS-Bench-201: `python trainGAE_two_phase.py`
- For NAS-Bench-301: `python trainGAE_two_phase_301.py`
- Then move the pre-trained model to the path to match the following code in `trainGAE_two_phase.py` 
```python
if dataset_name == 'nb101':
    pretrained_weight = 'logs/phase1_nb101_CE_64/modelGAE_weights_phase1'
else:
    pretrained_weight = 'logs/phase1_nb201_CE_64/modelGAE_weights_phase1'
```
- For NAS-Bench-301: move the pre-trained model to `logs/phase1_nb301_CE_64_directed/modelGAE_weights_phase1`
#### - Run experiment for multiple runs
- Modify line `train_phase = ` to `train_phase = [0, 1]` in `trainGAE_two_phase.py`
- Run the shell script
```
#!/bin/bash

for seed in {0..9}
do
    python trainGAE_two_phase.py --seed $seed --dataset nb101
done
```
- For NB301, run the shell script
```
#!/bin/bash

for seed in {0..9}
do
    python trainGAE_two_phase_301.py --seed $seed 
done
```

## 4. Results and Logs
- The pickle record files will be saved in the folder: `InvertNAS/yyyymmdd-hhmmsstopk_finetuneFalse_rfinetuneFalse_rankTrue_randomSFalse_ensemble_2NN_4*5*256`
- The log files for each experiment are stored in the directory specified by the `get_logdir_and_logger()` function within the corresponding training code.
- The results will also be printed in the console during the experiment execution.
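
To inspect the saved records after a run, something like the following sketch may help. It makes two assumptions not stated in this README: that the record files end in `.pkl`, and that they can be read with the standard `pickle` module. The helper `load_records` is hypothetical, and the structure of each record is left to the training code.

```python
import pickle
from pathlib import Path


def load_records(run_dir, pattern="*.pkl"):
    """Map each matching file (path relative to run_dir) to its content."""
    run_dir = Path(run_dir)
    records = {}
    for path in sorted(run_dir.rglob(pattern)):
        # Only unpickle files you produced yourself; pickle can run code.
        with path.open("rb") as fh:
            records[str(path.relative_to(run_dir))] = pickle.load(fh)
    return records
```

Pass the run folder as `run_dir` and adjust `pattern` if the files use a different extension.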



## Acknowledgement
This code base builds on:
- [NAS-Bench-101](https://github.com/google-research/nasbench)
- [NAS-Bench-201](https://github.com/D-X-Y/NAS-Bench-201)
- [NAS-Bench-301](https://github.com/automl/nasbench301)
- [NATS-Bench (NAS-Bench-201)](https://github.com/D-X-Y/NATS-Bench)
- [Naszilla](https://github.com/naszilla/naszilla)
- [NASLib](https://github.com/automl/NASLib)
- [jaekookang/invertible_neural_networks](https://github.com/jaekookang/invertible_neural_networks)
- [alignflow](https://github.com/ermongroup/alignflow)