Efficient Model Configuration for Transformers: Improving Translation Quality with Reduced Parameters and Comparable Inference Speed
Abstract: Transformers have emerged as the leading architecture across a range of natural language processing tasks, including the challenging domain of machine translation. However, standard Transformers suffer from high inference costs. This paper addresses the issue by investigating how various model hyperparameters affect the Transformer architecture, focusing on their impact on both translation quality and inference speed. Based on our findings, we propose an optimized model configuration that surpasses efficient vanilla Transformer baselines by 1 BLEU point while using fewer parameters and matching their inference speed on a CPU.
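To make the kind of hyperparameter exploration described above concrete, the sketch below contrasts a vanilla Transformer with a hypothetical efficiency-oriented variant using PyTorch's `nn.Transformer`. The specific values (encoder/decoder depth, feed-forward width) are illustrative assumptions, not the paper's reported optimized configuration; they only show the architectural knobs such a study varies and how they change the parameter count.

```python
import torch.nn as nn

# Baseline "vanilla" Transformer (Vaswani et al., 2017 base settings).
baseline = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
    dim_feedforward=2048,
    dropout=0.1,
)

# Hypothetical efficiency-oriented variant (assumed values, for illustration):
# a deeper encoder, a shallow decoder, and smaller feed-forward blocks can
# reduce parameters while keeping CPU decoding latency comparable.
efficient_variant = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=8,
    num_decoder_layers=2,
    dim_feedforward=1024,
    dropout=0.1,
)

def count_parameters(model: nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"baseline parameters: {count_parameters(baseline):,}")
print(f"variant parameters:  {count_parameters(efficient_variant):,}")
```

Because autoregressive decoding dominates translation latency on a CPU, shrinking the decoder while keeping (or deepening) the encoder is a common way to cut parameters without slowing inference; the exact trade-off the authors found is reported in the paper itself.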
Paper Type: long
Research Area: Machine Translation
Contribution Types: Model analysis & interpretability, Approaches for low compute settings - efficiency
Languages Studied: German, English
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.