Evaluation sents contain the sentences used for the self-BLEU tests in the SUNMASK paper. Folders whose names match s_* hold sentences generated with "standard" sampling; all other folders hold sentences generated with typical sampling.

In the names, tokens such as 1pt0 and pt8 give the sampling temperature (1.0 and 0.8, with "pt" standing for the decimal point), and tokens such as T, 2T, 5T, and 10T give the inference length as a multiple of the training length (T = 52 for the EMNLP2017 News experiments).
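
The naming tokens above can be decoded mechanically. The following is a minimal sketch, not part of the original release: the helper names decode_temperature and decode_length are hypothetical, and it assumes each token has already been split out of its folder name (the exact separator and ordering of tokens are not specified here, so no full folder name is parsed).

```python
import re

# T for the EMNLP2017 News experiments, as stated in this README.
TRAIN_LEN = 52


def decode_temperature(token: str) -> float:
    """Read a token such as '1pt0' or 'pt8' as a float, treating 'pt' as the decimal point.

    Hypothetical helper: assumes a missing integer part (as in 'pt8') means 0.
    """
    match = re.fullmatch(r"(\d*)pt(\d+)", token)
    if match is None:
        raise ValueError(f"not a temperature token: {token!r}")
    whole, frac = match.groups()
    return float(f"{whole or '0'}.{frac}")


def decode_length(token: str, train_len: int = TRAIN_LEN) -> int:
    """Read a token such as 'T', '2T' or '10T' as an absolute inference length.

    Hypothetical helper: a bare 'T' is taken to mean one multiple of train_len.
    """
    match = re.fullmatch(r"(\d*)T", token)
    if match is None:
        raise ValueError(f"not a length token: {token!r}")
    multiple = int(match.group(1) or 1)
    return multiple * train_len
```

Under these assumptions, decode_temperature("pt8") gives 0.8 and decode_length("5T") gives 260 for the EMNLP2017 News setting.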

The plot produced from this analysis is shown in Figure 2.
