Bayesian Checking for Topic ModelsDownload PDFOpen Website

2011 (modified: 10 Nov 2022)EMNLP 2011Readers: Everyone
Abstract: Real document collections do not fit the independence assumptions asserted by most statistical topic models, but how badly do they violate them? We present a Bayesian method for measuring how well a topic model fits a corpus. Our approach is based on posterior predictive checking, a method for diagnosing Bayesian models in user-defined ways. Our method can identify where a topic model fits the data, where it falls short, and in which directions it might be improved.
0 Replies

Loading