Error-Correcting Codes For Approximate Neural Sequence Prediction

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission
Abstract: We propose a novel neural sequence prediction method based on error-correcting codes that avoids exact softmax normalization and allows a tradeoff between speed and performance. Error-correcting codes represent predictions and targets as binary codes in which each bit corresponds to a logit. The codebook is arranged by word embedding similarity so that similar tokens receive nearby codes, ensuring that incorrect predictions are at least semantically close to the target. We also address the well-established problem of compounding errors by mixing the latent codes of past predictions and past targets in one of two ways: (1) according to a predefined sampling schedule or (2) via a differentiable sampling procedure that replaces the argmax operation. Low-dimensional codes perform comparably to models that use the full softmax and outperform alternative approximate methods on language modeling and text generation, while generation further benefits from our mixture sampling.
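
To make the idea concrete, here is a minimal PyTorch sketch of an error-correcting-code prediction head of the general kind the abstract describes. This is not the authors' implementation: the class name, shape conventions, binary cross-entropy training loss, and minimum-Hamming-distance decoding are all illustrative assumptions. The key property it demonstrates is that the output layer predicts one logit per code bit rather than one logit per vocabulary token, so no softmax over the full vocabulary is needed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ECCPredictionHead(nn.Module):
    """Hypothetical sketch of an error-correcting-code output layer.

    Each vocabulary token is assigned a fixed binary code of length
    code_bits (with code_bits much smaller than vocab_size), and the
    model predicts one logit per bit instead of one per token.
    """

    def __init__(self, hidden_dim: int, codebook: torch.Tensor):
        super().__init__()
        # codebook: (vocab_size, code_bits) matrix with {0, 1} entries,
        # ideally arranged so that tokens that are similar under word
        # embedding similarity receive nearby codewords.
        self.register_buffer("codebook", codebook.float())
        self.bit_logits = nn.Linear(hidden_dim, codebook.size(1))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # One logit per code bit; sigmoid would give per-bit probabilities.
        return self.bit_logits(hidden)

    def loss(self, hidden: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Binary cross-entropy against the target token's codeword,
        # avoiding exact softmax normalization over the vocabulary.
        logits = self.forward(hidden)
        target_codes = self.codebook[targets]
        return F.binary_cross_entropy_with_logits(logits, target_codes)

    @torch.no_grad()
    def decode(self, hidden: torch.Tensor) -> torch.Tensor:
        # Threshold the bits, then map to the nearest codeword by
        # Hamming distance (L1 distance between binary vectors), so
        # bit errors tend to land on semantically close tokens.
        bits = (self.forward(hidden) > 0).float()
        dists = torch.cdist(bits, self.codebook, p=1)
        return dists.argmin(dim=-1)
```

Since a code of b bits can in principle distinguish up to 2^b tokens, the code length can be made much shorter than the vocabulary size; shrinking it trades accuracy for speed, which is the tradeoff the abstract refers to.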
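The mixture-sampling idea for compounding errors can be sketched the same way. The function below is a hypothetical illustration, assuming the bit-logit representation above: with probability teacher_prob (a predefined schedule) a position keeps the ground-truth code, and otherwise it uses the model's own predicted code; when differentiable is set, a relaxed Bernoulli sample (logistic noise added to the bit logits, then a tempered sigmoid) stands in for the hard argmax/threshold so gradients can flow through the sampled bits. The exact schedule and relaxation used in the paper are not specified here.

```python
import torch


def mix_codes(target_codes: torch.Tensor,
              predicted_bit_logits: torch.Tensor,
              teacher_prob: float,
              differentiable: bool = True,
              temperature: float = 1.0) -> torch.Tensor:
    """Hypothetical sketch: mix past target codes with past predicted
    codes to expose the model to its own errors during training.

    target_codes:         (..., code_bits) ground-truth binary codes
    predicted_bit_logits: (..., code_bits) model's bit logits
    """
    if differentiable:
        # Relaxed Bernoulli sample: adding logistic noise to the logits
        # and applying a tempered sigmoid gives a soft, differentiable
        # stand-in for thresholding the bits at zero.
        u = torch.rand_like(predicted_bit_logits)
        logistic_noise = torch.log(u) - torch.log1p(-u)
        pred_bits = torch.sigmoid(
            (predicted_bit_logits + logistic_noise) / temperature)
    else:
        # Hard threshold (non-differentiable argmax-style decision).
        pred_bits = (predicted_bit_logits > 0).float()

    # Per-position coin flip following the sampling schedule.
    keep_target = (torch.rand(target_codes.shape[:-1],
                              device=target_codes.device)
                   < teacher_prob).unsqueeze(-1).float()
    return keep_target * target_codes + (1.0 - keep_target) * pred_bits
```

In a scheduled-sampling setup, teacher_prob would typically start near 1 and decay over training, so the model gradually conditions on its own (possibly noisy) codes rather than only on ground-truth history.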