Aggregate and mixed-order Markov models for statistical language processing

Lawrence K. Saul, Fernando Pereira

EMNLP 1997 (modified: 16 Jul 2019)

Abstract: We consider the use of language models whose size and accuracy are intermediate between those of n-gram models of different orders. Two types of models are studied in particular. Aggregate Markov models are class-based bigram models in which the mapping from words to classes is probabilistic. Mixed-order Markov models combine bigram models whose predictions are conditioned on different words. Both types of models are trained by Expectation-Maximization (EM) algorithms for maximum likelihood estimation. We examine smoothing procedures in which these models are interposed between n-grams of different orders. These procedures are found to significantly reduce the perplexity of unseen word combinations.
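
The aggregate Markov model described in the abstract can be illustrated with a minimal sketch. It assumes the standard latent-class factorization P(w2 | w1) = sum over c of P(w2 | c) P(c | w1) and the usual EM updates for such a mixture; the abstract does not spell out the update equations, so the details here are an assumption rather than the authors' exact procedure. The function names, the integer word ids, and the toy counts are all hypothetical.

```python
import math
import random


def train_aggregate_markov(counts, vocab_size, num_classes, iterations=20, seed=0):
    """Fit a class-based bigram model with probabilistic word-to-class mapping.

    counts maps (w1, w2) pairs of integer word ids to bigram counts.
    Returns (p_c_w1, p_w2_c), where p_c_w1[w1][c] approximates P(c | w1)
    and p_w2_c[c][w2] approximates P(w2 | c).
    """
    rng = random.Random(seed)

    def random_dist(n):
        # Strictly positive random distribution used for initialization.
        raw = [rng.random() + 0.1 for _ in range(n)]
        total = sum(raw)
        return [x / total for x in raw]

    p_c_w1 = [random_dist(num_classes) for _ in range(vocab_size)]
    p_w2_c = [random_dist(vocab_size) for _ in range(num_classes)]

    for _ in range(iterations):
        acc_c_w1 = [[0.0] * num_classes for _ in range(vocab_size)]
        acc_w2_c = [[0.0] * vocab_size for _ in range(num_classes)]

        # E-step: posterior over the hidden class for each observed bigram,
        # weighted by how often the bigram occurred.
        for (w1, w2), n in counts.items():
            joint = [p_c_w1[w1][c] * p_w2_c[c][w2] for c in range(num_classes)]
            z = sum(joint)
            if z == 0.0:
                continue
            for c in range(num_classes):
                weight = n * joint[c] / z
                acc_c_w1[w1][c] += weight
                acc_w2_c[c][w2] += weight

        # M-step: renormalize the expected counts into distributions.
        # Rows with no expected counts keep their previous values.
        for w1 in range(vocab_size):
            total = sum(acc_c_w1[w1])
            if total > 0.0:
                p_c_w1[w1] = [x / total for x in acc_c_w1[w1]]
        for c in range(num_classes):
            total = sum(acc_w2_c[c])
            if total > 0.0:
                p_w2_c[c] = [x / total for x in acc_w2_c[c]]

    return p_c_w1, p_w2_c


def bigram_prob(p_c_w1, p_w2_c, w1, w2):
    """P(w2 | w1) under the aggregate model: sum over classes."""
    return sum(p_c_w1[w1][c] * p_w2_c[c][w2] for c in range(len(p_w2_c)))


def log_likelihood(counts, p_c_w1, p_w2_c):
    """Log-likelihood of the observed bigram counts under the model."""
    return sum(
        n * math.log(bigram_prob(p_c_w1, p_w2_c, w1, w2))
        for (w1, w2), n in counts.items()
    )


# Hypothetical toy data: three words (ids 0..2), two classes.
toy_counts = {(0, 1): 3, (0, 2): 1, (1, 0): 2, (2, 0): 2, (1, 2): 1}
p_c_w1, p_w2_c = train_aggregate_markov(toy_counts, vocab_size=3, num_classes=2)
```

With C classes and a vocabulary of V words, a model of this form has on the order of 2CV parameters instead of the V^2 of a full bigram table, which is one way to read the abstract's claim that its size is intermediate between n-gram models of different orders.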
