# KNN-based regularization for smoothed vector quantization

This is the collection of Python code used for the experiments in the manuscript "Pushing Toward the Simplex Vertices: A Simple Remedy for Code Collapse in Smoothed Vector Quantization".

To replicate the discrete autoencoding experiment, run:
```
train_ae.py ImageNet path/to/dir_where_ImageNet_stored/ config/autoencoding/backbone_16x16x32x1024.yaml config/autoencoding/softmax_knnce_k2.yaml /path/to/dir_where_results_saved/ ...[more options]
```
where `backbone_16x16x32x1024.yaml` and `softmax_knnce_k2.yaml` can be replaced with other config files to train under different settings.


You need to download the ImageNet dataset on your own.

To replicate the wav2vec 2.0 experiment, run:

- (for single codebook) `pretrain_wav2vec2.py LibriSpeech /path/to/dir_where_LibriSpeech_stored/ config/wav2vec2/pretrain_single.yaml config/wav2vec2/softmax_knnce_k2.yaml /path/to/dir_where_results_saved/ ...[more options]`
- (for dual codebook) `pretrain_wav2vec2_dual.py LibriSpeech /path/to/dir_where_LibriSpeech_stored/ config/wav2vec2/pretrain_dual.yaml config/wav2vec2/softmax_knnce_k2.yaml /path/to/dir_where_results_saved/ ...[more options]`

You need to download the LibriSpeech dataset on your own.

The required Python packages are listed in `conda_config.yaml`.
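
If `conda_config.yaml` is a standard conda environment file (an assumption; its contents are not shown here), the environment can likely be created along these lines:

```shell
# Assumption: conda is installed and conda_config.yaml is a standard
# conda environment specification (with a `name:` and `dependencies:` field).
conda env create -f conda_config.yaml

# List the available environments to find the name defined in the file.
conda env list
```

Once the environment name is known, activate it with `conda activate` followed by that name before running the training scripts.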