Revisiting the Iterative Non-Autoregressive Transformer

ACL ARR 2024 April Submission 538 Authors

16 Apr 2024 (modified: 23 May 2024) · License: CC BY 4.0
Abstract: Iterative non-autoregressive (NAR) models combine the strengths of autoregressive (AR) and fully NAR models, seeking a balance between generation quality and inference efficiency. These models have recently demonstrated impressive performance on varied generation tasks, surpassing the AR Transformer. However, they also face several challenges that impede further development. In this work, we aim to build more efficient and competitive iterative NAR models through systematic studies and analytical experiments. First, we conduct an oracle experiment and introduce two new metrics to identify problems in current refinement processes, and we revisit various iterative NAR models to find the key factors for achieving this goal. Then, based on an analysis of the limitations of previous inference algorithms, we propose a simple yet effective strategy for efficient refinement without performance degradation. Experiments on five widely used datasets show that our final models significantly outperform all previous NAR models as well as the AR Transformer, even with fewer decoding steps on two of the datasets.
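For readers unfamiliar with iterative NAR decoding, the sketch below illustrates a generic mask-predict style refinement loop in the spirit of Ghazvininejad et al. (2019). It is not the strategy proposed in this paper; `model`, `MASK_ID`, and the linear re-masking schedule are illustrative assumptions.

```python
import torch

MASK_ID = 0  # hypothetical mask-token id; depends on the actual vocabulary


def iterative_nar_decode(model, src, tgt_len, num_iters=4):
    """Generic mask-predict refinement: repeatedly re-predict the
    least-confident target positions (a sketch, not this paper's method)."""
    # Start from a fully masked target sequence.
    tokens = torch.full((1, tgt_len), MASK_ID, dtype=torch.long)
    scores = torch.zeros(1, tgt_len)
    for t in range(num_iters):
        logits = model(src, tokens)        # assumed shape: (1, tgt_len, vocab)
        probs = logits.softmax(dim=-1)
        conf, pred = probs.max(dim=-1)     # per-position confidence and tokens
        tokens, scores = pred, conf
        if t < num_iters - 1:
            # Linearly decay the number of re-masked positions per iteration.
            n_mask = int(tgt_len * (1 - (t + 1) / num_iters))
            if n_mask > 0:
                remask = scores[0].topk(n_mask, largest=False).indices
                tokens[0, remask] = MASK_ID  # re-mask low-confidence slots
    return tokens
```

Each iteration refines only the positions the model is least confident about, which is what lets such models trade decoding steps against quality; fully NAR models correspond to `num_iters=1`.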
Paper Type: Long
Research Area: Generation
Research Area Keywords: Generation, Machine Translation, Language Modeling
Contribution Types: Model analysis & interpretability, Reproduction study, Approaches to low-compute settings / efficiency
Languages Studied: English, German, Romanian
Submission Number: 538