POS induction with distributional and morphological information using a distance-dependent Chinese restaurant process

2014 (modified: 16 Jul 2019) | ACL (2) 2014 | Readers: Everyone
Abstract: We present a new approach to inducing the syntactic categories of words, combining their distributional and morphological properties in a joint nonparametric Bayesian model based on the distance-dependent Chinese Restaurant Process. The prior distribution over word clusterings uses a log-linear model of morphological similarity; the likelihood function is the probability of generating vector word embeddings. The weights of the morphology model are learned jointly while inducing part-of-speech clusters, encouraging them to cohere with the distributional features. The resulting algorithm outperforms competitive alternatives on English POS induction.
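
The prior described in the abstract can be illustrated with a minimal sketch, assuming the standard distance-dependent CRP construction: each word links to one other word (or to itself), and clusters are the connected components of the link graph. The suffix-match features, the weight values, the self-link mass `alpha`, and all function names below are hypothetical choices for illustration, not the paper's feature set or inference procedure. The embedding likelihood and the joint weight learning are omitted.

```python
import math
import random


def suffix_features(a, b):
    # Hypothetical morphological similarity: 1.0 if the two words share
    # a suffix of length k, for k = 1, 2, 3; otherwise 0.0.
    return [
        1.0 if len(a) >= k and len(b) >= k and a[-k:] == b[-k:] else 0.0
        for k in (1, 2, 3)
    ]


def link_probs(words, i, weights, alpha):
    # Prior over which word j the word i links to.
    # Linking to another word gets log-linear mass exp(w . f(i, j));
    # linking to itself gets mass alpha (assumed concentration parameter).
    scores = []
    for j, other in enumerate(words):
        if j == i:
            scores.append(alpha)
        else:
            feats = suffix_features(words[i], other)
            scores.append(math.exp(sum(w * x for w, x in zip(weights, feats))))
    total = sum(scores)
    return [s / total for s in scores]


def clusters_from_links(links):
    # Clusters are the connected components of the (undirected) link graph,
    # computed here with a small union-find.
    parent = list(range(len(links)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(links):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    groups = {}
    for i in range(len(links)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def sample_links(words, weights, alpha, rng):
    # Draw one link per word from the prior; no likelihood term is used.
    n = len(words)
    return [
        rng.choices(range(n), weights=link_probs(words, i, weights, alpha))[0]
        for i in range(n)
    ]


if __name__ == "__main__":
    vocab = ["walking", "talking", "cats", "dogs"]
    links = sample_links(vocab, [1.0, 1.0, 1.0], 1.0, random.Random(0))
    print(clusters_from_links(links))
```

Under these assumed features, "walking" is more likely a priori to link to "talking" than to "cats", which is the sense in which a morphological prior can pull similar word forms into the same cluster before any distributional evidence is considered.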