1. Run create_raw.py to download the original corpus and apply basic preprocessing for the next step.
2. Run process.py to process the dataset. The processed dataset is stored in "python".
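
The two steps above can be chained in a small driver script. This is only a sketch: it assumes both scripts sit in the current directory and take no command-line arguments, which the steps do not confirm. `build_commands` and `run_pipeline` are hypothetical helper names, not part of this repository.

```python
import subprocess
import sys
from pathlib import Path

# Script names come from the steps above.
# Assumption: they are run with no arguments, in this order.
STEPS = ["create_raw.py", "process.py"]


def build_commands(steps=STEPS, python=sys.executable):
    """Return one command list per script, in the order they must run."""
    return [[python, step] for step in steps]


def run_pipeline(workdir="."):
    """Run each script in order, stopping at the first failure."""
    for cmd in build_commands():
        script = Path(workdir) / cmd[1]
        if not script.exists():
            raise FileNotFoundError(f"{script} not found in {workdir}")
        # check=True raises CalledProcessError if the script exits non-zero,
        # so process.py never runs on a failed or partial download.
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling `run_pipeline()` from the directory that contains both scripts runs them back to back. If create_raw.py fails, the run stops before process.py starts.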

