# A Deep Dive into Dataset Imbalance and Bias in Face Identification
This repository contains the code to train and evaluate face identification models
on data with different ratios of gender presentation.

## Data

Download the CelebA dataset from https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
and split it into *CelebA/train* and *CelebA/test* folders containing the train and test identities,
respectively. Also prepare a demographics file (*CelebA_demographics.txt* in the commands below) containing a dictionary of male and female identities.
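
As a starting point for the demographics file, here is a minimal sketch that groups identities by CelebA's `Male` attribute. It assumes the standard CelebA annotation layout: `identity_CelebA.txt` with one `image_name identity_id` pair per line, and `list_attr_celeba.txt` with an image count on the first line, attribute names on the second, and `1`/`-1` values on each following row. The helper name `build_demographics` and the `male`/`female` dictionary keys are assumptions made for illustration; check them against what `train_balanced.py` actually parses.

```python
from collections import Counter, defaultdict


def build_demographics(identity_lines, attr_lines):
    """Group identity ids into 'male' / 'female' by majority vote over their images.

    Hypothetical helper: the output layout is an assumption, not the
    format confirmed by this repository.
    """
    attr_lines = [line for line in attr_lines if line.strip()]
    # Assumed layout: line 0 = image count, line 1 = attribute names, rest = rows.
    male_col = attr_lines[1].split().index("Male")
    is_male = {}
    for row in attr_lines[2:]:
        name, *values = row.split()
        is_male[name] = values[male_col] == "1"

    # The attribute is annotated per image, so count votes per identity.
    votes = defaultdict(Counter)
    for row in identity_lines:
        if not row.strip():
            continue
        name, identity = row.split()
        if name in is_male:
            votes[identity][is_male[name]] += 1

    groups = {"male": [], "female": []}
    for identity, count in votes.items():
        # Ties fall to "female"; an arbitrary choice for this sketch.
        label = "male" if count[True] > count[False] else "female"
        groups[label].append(identity)
    for ids in groups.values():
        ids.sort()
    return groups
```

The function accepts any iterables of lines, so it can be called with the two open annotation files directly; the returned dictionary can then be written out with, for example, `json.dump`.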

## Training a model

To train a MobileFaceNet face recognition model with a CosFace head on data with a 4:6 ratio of male to female identities, run:

```
$ python3 train_balanced.py \
    --backbone_name MobileFaceNet \
    --head_name CosFace \
    --p_identities 0.4 0.6 \
    --p_images 0.4 0.6 \
    --data_train_root CelebA/train \
    --data_test_root CelebA/test \
    --demographics CelebA_demographics.txt \
    --groups_to_modify male female
```

To train a MobileFaceNet face recognition model with a CosFace head on data with a 4:6 ratio of male to female images, run:

```
$ python3 train_balanced.py \
    --backbone_name MobileFaceNet \
    --head_name CosFace \
    --p_identities 1.0 1.0 \
    --p_images 0.4 0.6 \
    --data_train_root CelebA/train \
    --data_test_root CelebA/test \
    --demographics CelebA_demographics.txt \
    --groups_to_modify male female
```
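
The two training commands distinguish a ratio of identities from a ratio of images. To make that distinction concrete, the sketch below computes both shares for an arbitrary dataset. It does not reproduce the sampling done by `train_balanced.py`; the function name `group_shares` and its inputs `images_by_identity` and `group_of` are hypothetical and not part of this repository.

```python
from collections import Counter


def group_shares(images_by_identity, group_of):
    """Return each group's share of identities and its share of images.

    images_by_identity: dict mapping identity id -> list of image names
    group_of: dict mapping identity id -> group label (e.g. "male", "female")
    Assumes the dataset is non-empty.
    """
    id_counts, img_counts = Counter(), Counter()
    for identity, images in images_by_identity.items():
        group = group_of[identity]
        id_counts[group] += 1          # one vote per identity
        img_counts[group] += len(images)  # one vote per image
    n_ids = sum(id_counts.values())
    n_imgs = sum(img_counts.values())
    identity_share = {g: c / n_ids for g, c in id_counts.items()}
    image_share = {g: c / n_imgs for g, c in img_counts.items()}
    return identity_share, image_share
```

A dataset can be balanced in one sense and not the other: a group with few identities but many images per identity has a small identity share and a large image share.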


## Testing a model

To test a pretrained model on a gallery of test images with a 4:6 ratio of male to female identities, run:

```
$ python3 fairness_test_Celeba.py \
    --p_identities 0.4 0.6 \
    --p_images 0.4 0.6 \
    --seed 1 2 3 4 5 \
    --backbone_name MobileFaceNet \
    --checkpoint *path_to_trained_checkpoint* \
    --file_name *output_file_name*
```
