Keywords: knowledge representation, word embeddings, sentence embeddings, common-sense knowledge
TL;DR: This paper presents a paradigm and methodology for using learned sentence representations as emergent, flexible knowledge bases that can be queried using linear algebra.
Abstract: Many applications of linguistic embedding models rely on their value as pre-trained inputs for end-to-end tasks such as dialog modeling, machine translation, or question answering. This position paper presents an alternate paradigm: rather than using learned embeddings as input features, we instead treat them as a common-sense knowledge repository that can be queried via simple mathematical operations within the embedding space. We show how linear offsets can be used to (a) identify an object given its description, (b) discover relations of an object given its label, and (c) map free-form text to a set of action primitives. Our experiments provide a valuable proof of concept that language-informed common-sense reasoning, or "reasoning in the linguistic domain", lies within the grasp of the research community. In order to attain this goal, however, we must reconsider the way neural embedding models are typically trained and evaluated. To that end, we also identify three empirically motivated evaluation metrics for use in the training of future embedding models.
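To make the idea of querying an embedding space with linear offsets concrete, here is a minimal, hypothetical sketch of task (a), identifying an object given its description. It is not the authors' implementation: the labels, toy vectors, and the offset-averaging heuristic are all illustrative assumptions; in practice the vectors would come from a pre-trained sentence-embedding model.

```python
# Hypothetical sketch: query an embedding space with a linear offset.
# Toy 4-d vectors stand in for real sentence embeddings.
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

# Illustrative "knowledge base": object label -> embedding (assumed values).
labels = {
    "apple":  normalize(np.array([0.9, 0.1, 0.0, 0.2])),
    "hammer": normalize(np.array([0.1, 0.8, 0.3, 0.0])),
    "violin": normalize(np.array([0.0, 0.2, 0.9, 0.1])),
}

# Offset estimated from a few (description, label) pairs: the average
# difference between a label embedding and its description embedding.
desc_pairs = [
    (normalize(np.array([0.8, 0.2, 0.1, 0.3])), labels["apple"]),
    (normalize(np.array([0.2, 0.7, 0.4, 0.1])), labels["hammer"]),
]
offset = np.mean([lab - desc for desc, lab in desc_pairs], axis=0)

def identify(description_vec):
    """Apply the linear offset, then return the nearest label by cosine similarity."""
    query = normalize(description_vec + offset)
    return max(labels, key=lambda name: float(np.dot(query, labels[name])))

# Example query: a description embedding closest to "violin" after the offset.
print(identify(normalize(np.array([0.1, 0.25, 0.85, 0.15]))))
```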