- Keywords: Textual relation embedding, Relation extraction, Pre-trained embedding, Knowledge base completion
- Abstract: Pre-trained embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks. In this work, we investigate how to learn a general-purpose embedding of textual relations, defined as the shortest dependency path between entities. Textual relation embedding provides a level of knowledge between the word/phrase level and the sentence level. We show that it can facilitate downstream tasks requiring relational understanding of text. To learn such an embedding, we create the largest distant supervision dataset to date by linking the entire English ClueWeb09 corpus to Freebase. Using the global co-occurrence statistics between textual and knowledge base relations as the supervision signal, we learn the embedding of textual relations with the Transformer model. We conduct intrinsic and extrinsic evaluations on two representative downstream tasks requiring relational understanding, and demonstrate that the learned textual relation embedding serves as a good prior for these tasks and boosts their performance. Our code and pre-trained model can be found at https://github.com/anonymous-repo.
- Archival status: Non-Archival
- Subject areas: Information Extraction
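The abstract defines a textual relation as the shortest dependency path between two entity mentions. As a minimal sketch of that idea (not the authors' implementation), the snippet below represents a toy dependency parse as an undirected graph of (dependent, head) edges and finds the shortest path between two entities via breadth-first search; the example sentence and edge list are illustrative assumptions, and a real pipeline would obtain the edges from a dependency parser.

```python
from collections import deque

def shortest_dependency_path(edges, source, target):
    """BFS over an undirected dependency graph to find the shortest
    path of tokens connecting two entity mentions (or None)."""
    adj = {}
    for dep, head in edges:
        adj.setdefault(dep, []).append(head)
        adj.setdefault(head, []).append(dep)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in adj.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical dependency parse of "Obama was born in Honolulu":
# each pair is (dependent, head) as a parser might emit.
edges = [("Obama", "born"), ("was", "born"),
         ("in", "born"), ("Honolulu", "in")]
print(shortest_dependency_path(edges, "Obama", "Honolulu"))
# -> ['Obama', 'born', 'in', 'Honolulu']
```

The path skips auxiliary tokens such as "was", which is why the shortest dependency path is a compact surface form of the relation between the two entities.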