# XLM-Indic

Use the provided script to run the following steps, in this order:
1. Download the OSCAR corpus.
2. Train the non-transliteration SentencePiece tokenizer.
3. Create the corpus for training the non-transliteration model.
4. Pretrain the non-transliteration model.
5. Create the transliteration corpus for SentencePiece tokenizer training.
6. Train the transliteration SentencePiece tokenizer.
7. Create the corpus for training the transliteration model.
8. Pretrain the transliteration model.
9. Run the fine-tuning scripts for the non-transliteration model.
10. Run the fine-tuning scripts for the transliteration model.
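
The ordering above can be sketched as a small Python structure, which may help when scripting or checking the sequence. This is only an illustrative sketch: the `Step` class, the `PIPELINE` list, and the `steps_for` and `run` helpers are hypothetical names that are not part of this repository. Treating step 1 as shared by both model variants is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int   # position in the README list
    variant: str  # "shared", "non-transliteration", or "transliteration"
    action: str   # short description of what the step does


# Hypothetical encoding of the ten steps; descriptions mirror the list above.
# Assumption: the OSCAR download (step 1) feeds both model variants.
PIPELINE = [
    Step(1, "shared", "download the OSCAR corpus"),
    Step(2, "non-transliteration", "train the SentencePiece tokenizer"),
    Step(3, "non-transliteration", "create the model training corpus"),
    Step(4, "non-transliteration", "pretrain the model"),
    Step(5, "transliteration", "create the tokenizer training corpus"),
    Step(6, "transliteration", "train the SentencePiece tokenizer"),
    Step(7, "transliteration", "create the model training corpus"),
    Step(8, "transliteration", "pretrain the model"),
    Step(9, "non-transliteration", "run the fine-tuning scripts"),
    Step(10, "transliteration", "run the fine-tuning scripts"),
]


def steps_for(variant):
    """Return the steps needed for one model variant, in execution order."""
    selected = [s for s in PIPELINE if s.variant in ("shared", variant)]
    return sorted(selected, key=lambda s: s.number)


def run(steps, runner=print):
    """Pass each step to `runner` in order; replace `runner` with a real launcher."""
    for step in steps:
        runner(f"[{step.number:02d}] {step.variant}: {step.action}")
```

Calling `run(steps_for("transliteration"))` lists only the steps relevant to the transliteration model. Swapping `print` for a function that launches the corresponding script would turn the sketch into a simple driver.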
