Speeding Up Entmax

Anonymous

08 Mar 2022 (modified: 05 May 2023), NAACL 2022 Conference Blind Submission
Paper Link: https://openreview.net/forum?id=-cjifzq0diz
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: Softmax is the de facto standard for normalizing logits in modern neural networks for language processing. However, because it produces a dense probability distribution, every token in the vocabulary has a nonzero chance of being selected at each generation step, leading to a variety of reported problems in text generation. The $\alpha$-entmax of Peters et al. (2019) solves this problem, but is unfortunately slower than softmax. In this paper, we propose an alternative to $\alpha$-entmax that retains its virtuous characteristics but is as fast as optimized softmax, achieving on-par or better performance on machine translation.
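The abstract contrasts softmax's dense output with entmax's sparse output. As a minimal illustration (not the paper's proposed method), the NumPy sketch below shows sparsemax, the $\alpha = 2$ special case of $\alpha$-entmax, side by side with softmax; the function names and example logits are ours, chosen only to make the dense-vs-sparse contrast visible.

```python
# A minimal sketch (not the paper's proposed method): sparsemax, the
# alpha = 2 special case of alpha-entmax (Martins & Astudillo, 2016),
# shown next to softmax to illustrate dense vs. sparse outputs.
import numpy as np

def softmax(z):
    """Dense: every coordinate of the output is strictly positive."""
    e = np.exp(z - z.max())          # shift logits for numerical stability
    return e / e.sum()

def sparsemax(z):
    """Sparse: exact Euclidean projection of the logits onto the simplex."""
    z_sorted = np.sort(z)[::-1]                   # sort logits descending
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum           # coordinates kept nonzero
    k_max = support.sum()                         # support size
    tau = (cumsum[k_max - 1] - 1) / k_max         # threshold
    return np.maximum(z - tau, 0.0)               # low-scoring tokens get exactly 0

logits = np.array([2.0, 1.0, 0.1, -1.0])
print(softmax(logits))    # approx. [0.64 0.23 0.10 0.03] -- all nonzero
print(sparsemax(logits))  # [1. 0. 0. 0.] -- truly sparse
```

General $\alpha$-entmax computes an analogous threshold via sorting- or bisection-based routines (Peters et al., 2019); that extra work relative to a single exp-and-normalize pass is part of the overhead the paper targets.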
Copyright Consent Signature (type Name Or NA If Not Transferrable): Maxat Tezekbayev
Copyright Consent Name And Address: Nazarbayev University, Kabanbay Batyr Ave. 53, Nur-Sultan, Kazakhstan; NAVER LABS Europe, 6-8 chemin de Maupertuis, 38240 Meylan, France