Zero-shot Entity Extraction from Web Pages

Panupong Pasupat, Percy Liang

2014 (modified: 16 Jul 2019), ACL (1) 2014, Readers: Everyone

Abstract: In order to extract entities of a fine-grained category from semi-structured data in web pages, existing information extraction systems rely on seed examples or redundancy across multiple web pages. In this paper, we consider a new zero-shot learning task of extracting entities specified by a natural language query (in place of seeds) given only a single web page. Our approach defines a log-linear model over latent extraction predicates, which select lists of entities from the web page. The main challenge is to define features on widely varying candidate entity lists. We tackle this by
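
As a rough illustration of the log-linear model over candidate extraction predicates mentioned in the abstract, the sketch below scores each candidate with a weighted sum of its features and normalizes the scores into a probability distribution. It is only a minimal sketch of a generic log-linear model, not the authors' implementation: the candidate names (`path_a`, `path_b`), the feature names (`list_length`, `query_overlap`), and the weights are all hypothetical placeholders, since the abstract does not specify the features.

```python
import math

def log_linear_distribution(theta, candidate_features):
    """Return p(z) proportional to exp(theta . phi(z)) over candidates z."""
    # Score each candidate as the dot product of weights and its feature values.
    scores = {
        z: sum(theta.get(name, 0.0) * value for name, value in phi.items())
        for z, phi in candidate_features.items()
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    exp_scores = {z: math.exp(s - max_score) for z, s in scores.items()}
    total = sum(exp_scores.values())
    return {z: e / total for z, e in exp_scores.items()}

# Hypothetical candidates: each key stands in for one extraction predicate,
# and each value holds made-up features of the entity list it would select.
candidate_features = {
    "path_a": {"list_length": 1.0, "query_overlap": 1.0},
    "path_b": {"list_length": 0.5, "query_overlap": 0.0},
}
# Hypothetical weights; in a trained model these would be learned from data.
theta = {"list_length": 0.5, "query_overlap": 2.0}

probs = log_linear_distribution(theta, candidate_features)
best = max(probs, key=probs.get)
```

Here `best` is the candidate with the highest probability under the assumed weights; a real system would choose among many more candidates and learn `theta` rather than set it by hand.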
