AUBER
===

This package provides implementations of AUBER.

## Overview
#### Code structure
``` Unicode
auber_submission
  │ 
  ├── src
  │    │     
  │    ├── finetuned_models
  │    │     └── mrpc_original: BERT finetuned on MRPC (for demo)
  │    │      
  │    ├── lib
  │    │     ├── agent.py: code for agent
  │    │     ├── memory.py: code for memory
  │    │     └── reward.py: code for reward
  │    │  
  │    ├── script
  │    │     ├── MRPC_train.py: script for training a pruned BERT on MRPC (for demo)
  │    │     └── MRPC_eval.py: script for evaluating a pruned BERT on MRPC (for demo)
  │    │ 
  │    ├─── utils
  │    │     ├── default_param.py: default cfgs
  │    │     └── utils.py: utility functions
  │    │ 
  │    ├─── transformers: refer to https://github.com/huggingface/transformers/
  │    │     
  │    ├─── main.py: main file to run AUBER
  │    │    
  │    └─── train.py: code for training the agent
  │
  ├─── data: GLUE data
  │
  └─── script: shell scripts for demo
```

#### Data description
* MRPC: Microsoft Research Paraphrase Corpus
* Note:
    * Other GLUE datasets can be downloaded from https://github.com/nyu-mll/GLUE-baselines
    * Each dataset directory should contain two folders, `train` and `dev`.
    `dev` should contain a copy of the train dataset and a copy of the dev dataset.
    `train` should contain two copies of the train dataset, one named `train.tsv` and the other named `dev.tsv`.
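
The folder layout described above can be prepared with a small helper. This is only a sketch: the function name `build_auber_layout` is hypothetical, and it assumes the downloaded dataset already provides files named `train.tsv` and `dev.tsv` in a single source directory.

```python
import shutil
from pathlib import Path


def build_auber_layout(source_dir, target_dir):
    """Arrange train.tsv and dev.tsv into the train/ and dev/ folders.

    Assumes source_dir holds files named train.tsv and dev.tsv.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    train_file = source / "train.tsv"
    dev_file = source / "dev.tsv"

    dev_folder = target / "dev"
    train_folder = target / "train"
    dev_folder.mkdir(parents=True, exist_ok=True)
    train_folder.mkdir(parents=True, exist_ok=True)

    # dev/: one copy of the train set and one copy of the dev set.
    shutil.copyfile(train_file, dev_folder / "train.tsv")
    shutil.copyfile(dev_file, dev_folder / "dev.tsv")

    # train/: two copies of the train set, one of them named dev.tsv.
    shutil.copyfile(train_file, train_folder / "train.tsv")
    shutil.copyfile(train_file, train_folder / "dev.tsv")
    return target
```

For example, `build_auber_layout("downloads/MRPC", "data/MRPC")` would create `data/MRPC/train` and `data/MRPC/dev`; both paths here are placeholders.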

#### Output
* Trained models will be saved in `src/trained_models/[MODEL_NAME]_[TASK_NAME]_[LAYER_NUMBER]` after training.
* You can test the model only if:
    * There is a finetuned model saved in `src/finetuned_models/`.
    * There are train/evaluate scripts for the pruned model saved in `src/script/`.
        * Train/evaluate scripts should take three arguments: the model path, the GPU to use, and the dataset on which the model will be evaluated.
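
A custom train/evaluate script could read these three arguments along the following lines. This is a minimal sketch, not the interface of the bundled `MRPC_train.py` or `MRPC_eval.py`: the argument names, their order, and the function name `parse_script_args` are assumptions made for illustration.

```python
import argparse


def parse_script_args(argv=None):
    """Parse the three arguments a train/evaluate script receives.

    The names and the positional order are hypothetical.
    """
    parser = argparse.ArgumentParser(
        description="Train or evaluate a pruned model."
    )
    parser.add_argument("model_path", help="path to the pruned model")
    parser.add_argument("gpu", help="GPU to use, e.g. 0")
    parser.add_argument("dataset", help="dataset on which the model is evaluated")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

Calling `parse_script_args(["path/to/model", "0", "MRPC"])` returns a namespace with `model_path`, `gpu`, and `dataset` attributes, all as strings.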

## Install
#### Environment 
* Ubuntu
* CUDA 10.0
* Python 3.8.1
* torch 1.4.0
* torchvision 0.5.0
* scikit-learn (imported as `sklearn`)
* transformers
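
To compare an existing environment against the list above, the installed package versions can be queried from Python. The sketch below uses only the standard library; the helper name `installed_versions` is hypothetical, and the package names are the PyPI distribution names (the `sklearn` module is distributed as `scikit-learn`).

```python
from importlib import metadata

# PyPI distribution names for the Python packages listed above.
REQUIRED = ["torch", "torchvision", "scikit-learn", "transformers"]


def installed_versions(packages=REQUIRED):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Printing `installed_versions()` shows at a glance whether, for instance, `torch` reports 1.4.0 and `torchvision` reports 0.5.0.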

## How to use 
    cd auber_submission/src/
    git clone https://github.com/huggingface/transformers
    cd transformers
    git checkout f5c2a122e34836b87abb6042cf641b040e790e1c
    pip install .
    mv ../run_glue.py ./examples/text-classification
    cd ..

#### DEMO
* Download a BERT model fine-tuned on MRPC from the following link and place it in `src/finetuned_models/`: https://bit.ly/3n9Kwcw
* To train the model on the MRPC dataset, run the demo script:
    ```    
    cd script
    ./demo.sh
    ```
    Intermediate models after pruning each layer will be saved in `src/trained_models/`.
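
After the demo finishes, the intermediate models can be listed in layer order. The following is a rough sketch that assumes each saved directory follows the `[MODEL_NAME]_[TASK_NAME]_[LAYER_NUMBER]` naming pattern and that the layer number is a plain integer; the helper name `list_intermediate_models` is hypothetical.

```python
from pathlib import Path


def list_intermediate_models(trained_dir="src/trained_models"):
    """Return (layer_number, path) pairs sorted by layer number."""
    found = []
    for path in Path(trained_dir).iterdir():
        if not path.is_dir():
            continue
        # Assumption: the last underscore-separated field is the layer number.
        suffix = path.name.rsplit("_", 1)[-1]
        if suffix.isdigit():
            found.append((int(suffix), path))
    return sorted(found, key=lambda item: item[0])
```

Directories whose names do not end in an integer are skipped, so unrelated folders in `src/trained_models/` do not break the listing.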

