Word2Fun: Modelling Words as Functions for Diachronic Word Representation

Benyou Wang; Emanuele Di Buccio; Massimo Melucci

Word2Fun: Modelling Words as Functions for Diachronic Word Representation

Benyou Wang, Emanuele Di Buccio, Massimo Melucci

Published: 09 Nov 2021, Last Modified: 05 May 2023NeurIPS 2021 PosterReaders: Everyone

Keywords: Diachronic Word Representation, word vectors, Temporal Analogy, embedding

Abstract: Word meaning may change over time as a reflection of changes in human society. Therefore, modeling time in word representation is necessary for some diachronic tasks. Most existing diachronic word representation approaches train the embeddings separately for each pre-grouped time-stamped corpus and align these embeddings, e.g., by orthogonal projections, vector initialization, temporal referencing, and compass. However, not only does word meaning change in a short time, word meaning may also be subject to evolution over long timespans, thus resulting in a unified continuous process. A recent approach called `DiffTime' models semantic evolution as functions parameterized by multiple-layer nonlinear neural networks over time. In this paper, we will carry on this line of work by learning explicit functions over time for each word. Our approach, called `Word2Fun', reduces the space complexity from $\mathcal{O}(TVD)$ to $\mathcal{O}(kVD)$ where $k$ is a small constant ($k \ll T $). In particular, a specific instance based on polynomial functions could provably approximate any function modeling word evolution with a given negligible error thanks to the Weierstrass Approximation Theorem. The effectiveness of the proposed approach is evaluated in diverse tasks including time-aware word clustering, temporal analogy, and semantic change detection. Code at: {\url{https://github.com/wabyking/Word2Fun.git}}.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

TL;DR: A new paradigam for time-specific word vectors

Code: https://github.com/wabyking/Word2Fun.git

15 Replies

Loading