Abstract: Highlights
• We develop a feature allocation model for grouped data with binary attributes and demonstrate its use on n-gram data.
• We show how the model can be estimated using a simple, exact Markov chain Monte Carlo method.
• We introduce a post-hoc variable selection step that finds the variables that maximally discriminate among groups.
• The variable selection method leads to better out-of-sample classification accuracy in simulated and real data.