# Speech2Latex
This folder contains the scripts used in our experiments. The experiments rely on five environments, all stored in `/TO_SUBMIT/envs`. Each environment is listed below, followed in parentheses (where applicable) by the script folders that use it:

1. anon (ASRPostCorrection)
2. antispoofing
3. epictool (ASRVadProcessor, ExtraASR)
4. nemo (ExtraASR)
5. tts (EngTTS)

## Data
Both our datasets and external datasets are stored here. The data is also split into training, validation, and test sets, separately for Russian and English.

## ASRPostCorrection
This folder contains the scripts for ASR post-correction. Experiments were run with the T5, Qwen, Whisper, Wav2Vec, WavLM, and Canary models.

## Salmonn
This folder contains the multimodal model that gave the best results in our work. To run inference, execute the following command:

```bash
python inference-cli.py --cfg-path /path/to/decode_config.yaml --device cuda:{your cuda}
```
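As a minimal sketch of filling in the placeholders in the command above, the snippet below builds the device string from a GPU index. It assumes that `--device` accepts a PyTorch-style device string such as `cuda:0`; the variable names `GPU_INDEX`, `CFG_PATH`, and `DEVICE` are hypothetical and not part of the repository.

```bash
# Hypothetical values: adjust both to your setup.
GPU_INDEX=0                             # assumed index of the GPU to use
CFG_PATH=/path/to/decode_config.yaml    # placeholder path, as in the command above

# Assumption: --device takes a PyTorch-style string such as "cuda:0".
DEVICE="cuda:${GPU_INDEX}"

# Print the resulting command; remove "echo" to actually launch inference.
echo python inference-cli.py --cfg-path "$CFG_PATH" --device "$DEVICE"
```

Printing the command first makes it easy to check the config path and device before starting a long inference run.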
