Multi-Field Information Extraction and Cross-Document Fusion

2005 (modified: 16 Jul 2019), ACL 2005
Abstract: In this paper, we examine the task of extracting a set of biographic facts about target individuals from a collection of Web pages. We automatically annotate training text with positive and negative examples of fact extractions and train Rote, Naive Bayes, and Conditional Random Field extraction models for fact extraction from individual Web pages. We then propose and evaluate methods for fusing the extracted information across documents to return a consensus answer. A novel cross-field bootstrapping method leverages data interdependencies to yield improved performance.
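To make the idea of cross-document fusion concrete, here is a minimal sketch, assuming the simplest possible strategy: majority voting over the values extracted for one field from several pages. The abstract does not say which fusion methods the paper evaluates, so this is an illustration only; the function name `fuse_by_vote` and the sample values are hypothetical.

```python
from collections import Counter


def fuse_by_vote(candidates):
    """Return the most frequent normalized value among per-document
    extractions for a single field, or None if nothing was extracted.

    Assumption: simple majority voting; not necessarily the paper's method.
    """
    # Normalize lightly so trivially different strings count as the same vote.
    normalized = [c.strip().lower() for c in candidates if c and c.strip()]
    if not normalized:
        return None
    value, _count = Counter(normalized).most_common(1)[0]
    return value


# Hypothetical extractions of one field (e.g. a birth year) from four pages.
extractions = ["1756", "1756 ", "1791", "1756"]
print(fuse_by_vote(extractions))
```

A weighted variant could replace the raw counts with per-extraction confidence scores from the underlying extraction model, which is one natural way to combine evidence from pages of differing reliability.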