Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

09 Jun 2022 (modified: 22 Oct 2023) ICML 2022 Workshop KRLM
Keywords: nearest neighbors, k-nearest neighbors, language models, automata, automaton, retomaton
TL;DR: We construct a weighted automaton from the training data of a given LM, and traverse it at inference time, in parallel with the LM.
Abstract: Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton (retrieval automaton), which approximates the datastore search based on (1) clustering of entries into "states", and (2) state transitions from previous entries. This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The creation of the automaton is unsupervised, and a RetoMaton can be constructed from any text collection: either the original training corpus or text from another domain. Traversing this automaton at inference time, in parallel with LM inference, reduces the LM's perplexity, or alternatively saves up to 83% of the nearest neighbor searches over kNN-LM (Khandelwal et al., 2020) without hurting perplexity. Our code and trained models are available at https://github.com/neulab/retomaton. This is a workshop version of the longer paper that appeared at ICML 2022 (Alon et al., 2022).
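
To make the "traverse the automaton in parallel with the LM, and fall back to kNN search only when needed" idea concrete, below is a minimal Python sketch of that control flow. It is not the authors' implementation: names such as `Entry`, `knn_search`, and `retomaton_step` are hypothetical, and for simplicity each datastore entry acts as its own "state" instead of a cluster of entries.

```python
# Sketch of the high-level RetoMaton-style inference loop described in the
# abstract (simplified; not the authors' code). Each datastore entry stores a
# key vector, the next token observed in the training data, and a pointer to
# the entry at the following position, which induces transitions. At each
# decoding step we follow pointers carried over from the previous step when
# possible, and only fall back to a full nearest-neighbor search otherwise.
from dataclasses import dataclass
from typing import Optional
import numpy as np


@dataclass
class Entry:
    key: np.ndarray            # hidden state that produced this entry
    value: int                 # token that followed it in the training corpus
    successor: Optional[int]   # index of the entry at the next position


def knn_search(query: np.ndarray, entries: list, k: int) -> list:
    """Full (expensive) nearest-neighbor search over the whole datastore."""
    dists = [np.linalg.norm(query - e.key) for e in entries]
    return [int(i) for i in np.argsort(dists)[:k]]


def retomaton_step(query: np.ndarray, entries: list, pointers: list, k: int = 4):
    """One decoding step: reuse transitions from the previous step's retrieved
    entries if any exist; otherwise run a full kNN search."""
    candidates = [entries[i].successor for i in pointers
                  if entries[i].successor is not None]
    if not candidates:                      # no transitions to follow -> search
        candidates = knn_search(query, entries, k)

    # Turn the retrieved entries into a distribution over next tokens, to be
    # interpolated with the base LM's distribution (as in kNN-LM).
    dists = np.array([np.linalg.norm(query - entries[i].key) for i in candidates])
    weights = np.exp(-dists)
    weights /= weights.sum()
    probs = {}
    for i, w in zip(candidates, weights):
        probs[entries[i].value] = probs.get(entries[i].value, 0.0) + float(w)
    return candidates, probs
```

In this sketch, the returned `candidates` become the `pointers` for the next step, so consecutive steps that stay on the same training-corpus "path" avoid the expensive search entirely; the full method additionally clusters entries into states and applies the search-saving policy described in the paper.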