Nucleus Beam Search for Machine Translation Decoding

Published: 01 Jan 2023, Last Modified: 06 Jun 2025 | ICIC (4) 2023 | CC BY-SA 4.0
Abstract: Beam search is the most widely used decoding algorithm for machine translation. Its success, however, may be attributed to its inadvertent implementation of the Uniform Information Density (UID) hypothesis. The UID hypothesis suggests that humans prefer sentences in which information is evenly distributed across the linguistic signal, subject to grammatical constraints. This paper presents Nucleus Beam Search, a novel machine translation decoding algorithm aimed at achieving the UID objective. By combining nucleus filtering with beam search, our approach effectively expands the search space without violating the UID hypothesis, enabling the generation of longer and more comprehensive translations. Experimental results show that Nucleus Beam Search outperforms traditional decoding algorithms in terms of BLEU, METEOR, ROUGE-L and CIDEr scores. Nevertheless, our findings also suggest that information density is not the sole determinant of translation quality, with beam width playing a significant role as well.
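
The abstract does not spell out how nucleus filtering and beam search are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each beam's next-token distribution is first cut down to its nucleus (the smallest set of top tokens whose probability mass reaches `top_p`), and that the surviving expansions are then ranked by cumulative log-probability as in ordinary beam search. The names `nucleus_filter`, `nucleus_beam_step`, `next_probs`, and `toy_model` are hypothetical, and scoring with the unrenormalized probabilities is an assumption.

```python
import math
from typing import Callable, Dict, List, Tuple

# A beam is (tokens so far, cumulative log-probability).
Beam = Tuple[List[str], float]


def nucleus_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of most probable tokens whose mass reaches top_p."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        if p <= 0.0:
            break  # zero-probability tokens can never be expanded
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return kept


def nucleus_beam_step(
    beams: List[Beam],
    next_probs: Callable[[List[str]], Dict[str, float]],
    beam_width: int,
    top_p: float,
) -> List[Beam]:
    """Expand every beam with its nucleus tokens only, then keep the best beams."""
    candidates: List[Beam] = []
    for tokens, score in beams:
        nucleus = nucleus_filter(next_probs(tokens), top_p)
        for token, p in nucleus.items():
            # Assumption: score with the original (unrenormalized) probability.
            candidates.append((tokens + [token], score + math.log(p)))
    candidates.sort(key=lambda beam: beam[1], reverse=True)
    return candidates[:beam_width]


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a translation model's next-token distribution."""
    return {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}


beams: List[Beam] = [([], 0.0)]
for _ in range(2):
    beams = nucleus_beam_step(beams, toy_model, beam_width=2, top_p=0.75)
```

In this sketch `top_p` limits which continuations may enter the candidate pool at all, while `beam_width` limits how many hypotheses survive each step, so the two parameters can be varied independently when studying their effect on output length and quality.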