Abstract: It is well known that typical word embedding methods exhibit additive compositionality: meanings can be composed by adding up embeddings. Several theories have been proposed to explain this property, but two problems remain: (i) the assumptions of those theories do not hold for word embeddings used in practice, and (ii) ordinary additive compositionality can be seen as an AND operation on word meanings, but it is not well understood how other operations, such as OR and NOT, can be computed from the embeddings. We address both issues with a method that has frequency-weighted centering at its core. In answer to (i), this method bridges the gap between practical word embeddings and the assumptions of the theories of additive compositionality. In answer to (ii), we give a method for computing the OR or NOT of meanings by linear operations on word embeddings. Moreover, we confirm experimentally that our post-processing method improves the accuracy of the AND operation, i.e., ordinary additive compositionality (a 3.5x improvement in top-100 accuracy), and that the OR and NOT operations can be performed correctly. We also confirm that the proposed method is effective for BERT.
Paper Type: long
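
The abstract names frequency-weighted centering as the core of the method but does not spell out the computation. One plausible reading, offered here only as a minimal sketch and not as the authors' implementation, is to subtract the mean embedding computed under the unigram (word-frequency) distribution from every word vector. The function name `frequency_weighted_center`, the toy vectors, and the toy counts below are all hypothetical and chosen for illustration.

```python
def frequency_weighted_center(embeddings, counts):
    """Subtract the frequency-weighted mean vector from each embedding.

    Assumption (not stated in the abstract): the weights are the
    normalized word counts, i.e. an empirical unigram distribution.
    """
    total = float(sum(counts))
    probs = [c / total for c in counts]
    dim = len(embeddings[0])
    # Mean embedding under the word-frequency distribution.
    mean = [
        sum(p * vec[j] for p, vec in zip(probs, embeddings))
        for j in range(dim)
    ]
    # Shift every embedding so that the weighted mean becomes zero.
    centered = [[vec[j] - mean[j] for j in range(dim)] for vec in embeddings]
    return centered, mean


# Hypothetical toy data: three 2-dimensional "word vectors" and their counts.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_counts = [2, 1, 1]
centered, mean = frequency_weighted_center(toy_embeddings, toy_counts)
```

After this step the centered vectors average to zero when weighted by frequency, which differs from plain centering whenever the word counts are not uniform. How the centered vectors are then combined for the AND, OR, and NOT operations is not described in the abstract and is not sketched here.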