# Deep Learning-based Source Code Complexity Prediction 



## For pre-training our model,

1) First, download the data for pre-training from the following link: http://gofile.me/6UFJt/ZvOndXucc

2) Move the downloaded data files into the directory named `data`.

3) Run the following command.

``` bash
python transformer_pretrain.py --model CodeBERT --save step 500 
```



## For fine-tuning our model,

- Run the following command to reproduce the results of baseline models:

```bash 
python train.py \
 --model CodeBERT \
 --train_path train_0_fold.txt \
 --valid_path test_0_fold.txt \
 --epoch 15 \
 --batch 6 \
 --device cuda:0
```

- Run the following command to reproduce the results of our hierarchical Transformer model:

```bash 
python train.py --model comple \
 --submodule CodeBERT \
 --train_path train_0_fold.txt \
 --valid_path test_0_fold.txt \
 --epoch 15 \
 --batch 6 \
 --device cuda:0 \
 --s
```
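
The file names above (`train_0_fold.txt`, `test_0_fold.txt`) suggest one train/test pair per fold. As a minimal sketch, assuming the remaining folds follow the same `train_{k}_fold.txt` / `test_{k}_fold.txt` pattern, the per-fold commands could be generated as shown here. `build_train_command`, `extra_args`, and `NUM_FOLDS` are hypothetical names used for illustration and are not part of the repository; the number of folds is not stated in this README, so set it to match your data.

```python
import shlex


def build_train_command(fold, model="CodeBERT", submodule=None,
                        epoch=15, batch=6, device="cuda:0", extra_args=None):
    """Return the train.py argument list for one fold (hypothetical helper)."""
    cmd = ["python", "train.py", "--model", model]
    if submodule is not None:
        cmd += ["--submodule", submodule]
    cmd += [
        # Assumption: every fold follows the naming used in the commands above.
        "--train_path", f"train_{fold}_fold.txt",
        "--valid_path", f"test_{fold}_fold.txt",
        "--epoch", str(epoch),
        "--batch", str(batch),
        "--device", device,
    ]
    # Pass through any additional flags unchanged (for example "--s").
    cmd += list(extra_args or [])
    return cmd


if __name__ == "__main__":
    NUM_FOLDS = 2  # assumption: replace with the number of folds you have
    for fold in range(NUM_FOLDS):
        # Print the command instead of running it, so it can be inspected first.
        print(shlex.join(build_train_command(fold)))
```

Each printed line can be pasted into a shell, or the argument list can be handed to `subprocess.run(cmd, check=True)` to train the folds one after another.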


## For evaluation of the trained models,

1) First, rename the trained model file as follows:

- Models trained on random split data: `r_{model_name}.pt`
  - e.g. `r_GraphCodeBERT.pt`, `r_comple_CodeBERT.pt`

- Models trained on k-fold data: `{fold_number}_fold_{model_name}.pt`
  - e.g. `0_fold_GraphCodeBERT.pt`, `1_fold_comple_CodeBERT.pt`

2) To skip training and evaluate the models fine-tuned by the authors instead, download them from http://gofile.me/6UFJt/Dg8JA39QW and continue with the steps below.

3) Move the model files to the `experiments_model` directory.

4) Run the following commands for random split evaluation and k-fold (problem) split evaluation:

- Evaluation on random split data:

```bash
python eval_model.py \
 --model GraphCodeBERT \
 --test_path test_r.txt
```

- Evaluation on k-fold split data:

```bash
python eval_k_fold.py \
 --model comple \
 --submodule CodeBERT \
 --pretrain
```
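
The renaming and moving in steps 1 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `checkpoint_name` and `install_checkpoint` are hypothetical helpers, not part of the repository, and the rule that a `comple` model's name is joined to its submodule with an underscore is inferred from the examples in step 1 (`r_comple_CodeBERT.pt`).

```python
import shutil
from pathlib import Path


def checkpoint_name(model, submodule=None, fold=None):
    """Build the file name expected by the evaluation scripts (hypothetical helper).

    fold=None means a model trained on the random split ("r_" prefix);
    otherwise the prefix is "{fold}_fold_".
    """
    # Assumption: hierarchical models are named "<model>_<submodule>".
    name = model if submodule is None else f"{model}_{submodule}"
    prefix = "r" if fold is None else f"{fold}_fold"
    return f"{prefix}_{name}.pt"


def install_checkpoint(src, model, submodule=None, fold=None,
                       dest_dir="experiments_model"):
    """Rename a trained model file and move it into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / checkpoint_name(model, submodule, fold)
    shutil.move(str(src), str(target))
    return target


if __name__ == "__main__":
    # Only prints the expected names; nothing on disk is touched here.
    print(checkpoint_name("GraphCodeBERT"))
    print(checkpoint_name("comple", "CodeBERT", fold=1))
```

Calling `install_checkpoint("my_model.pt", "comple", "CodeBERT", fold=0)` would move a file named `my_model.pt` (a placeholder name) to `experiments_model/0_fold_comple_CodeBERT.pt`.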