Unsupervised Information Extraction with Distributional Prior Knowledge

Cane Wing-ki Leung, Jing Jiang, Kian Ming Adam Chai, Hai Leong Chieu, Loo-Nin Teow

2011 (modified: 10 Nov 2022) | EMNLP 2011 | Readers: Everyone

Abstract: We address the task of automatically discovering information extraction templates from a given text collection. Our approach clusters candidate slot fillers to identify meaningful template slots. We propose a generative model that incorporates distributional prior knowledge to help distribute the candidates in a document into appropriate slots. Empirical results suggest that the proposed prior can bring substantial improvements over a K-means baseline and a Gaussian mixture model baseline. Specifically, the proposed prior is shown to be effective when coupled with discriminative features of the candidates.
