* random_bert_cluter.txt: state-word relations inferred from a random encoder. These states are much noisier than the states induced from a pretrained BERT.
* pretrained_bert_cluster.txt: state-word relations inferred from a pretrained BERT.

Try searching for [##es] in both files and compare the results.
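
One way to run this comparison is with a short script. The sketch below is only an illustration: it assumes both files are UTF-8 plain text in the current directory and that the token appears literally as `##es` somewhere on a line. The helper name `find_lines` is hypothetical and not part of this repository.

```python
from pathlib import Path


def find_lines(path, token="##es"):
    """Return (line_number, line) pairs for every line in `path` that contains `token`."""
    matches = []
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if token in line:
                matches.append((number, line.rstrip("\n")))
    return matches


if __name__ == "__main__":
    # File names are taken from the list above; the on-disk format is assumed to be plain text.
    for name in ("random_bert_cluter.txt", "pretrained_bert_cluster.txt"):
        if not Path(name).exists():
            print(f"{name}: not found, skipping")
            continue
        hits = find_lines(name)
        print(f"{name}: {len(hits)} matching line(s)")
        # Show only the first few matches so the two files are easy to compare by eye.
        for number, line in hits[:10]:
            print(f"  {number}: {line}")
```

A plain substring match is used rather than a regular expression so that the `#` and bracket characters need no escaping.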
