Towards a Unified Training for Levenshtein Transformer

Published: 01 Jan 2023 · Last Modified: 17 Apr 2025 · ICASSP 2023 · CC BY-SA 4.0
Abstract: Levenshtein Transformer (LevT) is a widely used text-editing model that generates a sequence through editing operations (deletion and insertion) in a non-autoregressive manner. However, it is challenging to train the key refinement components of LevT due to the training-inference discrepancy. Through carefully designed experiments, our work reveals that the deletion module is under-trained while the insertion module is over-trained, owing to imbalanced training signals for the two refinement modules. Based on these observations, we further propose a dual learning approach that remedies this imbalance by feeding an initial input to both refinement modules, consistent with the process at inference. Experimental results on three representative NLP tasks demonstrate the effectiveness and universality of the proposed approach.
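As a rough illustration of the refinement process the abstract describes, below is a minimal Python sketch (not the authors' code) of LevT-style iterative refinement: each pass runs a deletion step followed by an insertion step, and at inference both steps operate on the current model-generated hypothesis rather than a noised reference. The functions `should_delete`, `num_placeholders`, and `fill_token` are hypothetical stand-ins for the model's deletion classifier, placeholder predictor, and token predictor.

```python
# Minimal sketch of LevT-style iterative refinement (illustrative only).
# The three "policy" callables are assumptions standing in for the model's
# deletion classifier, placeholder predictor, and token predictor.

from typing import Callable, List

Tokens = List[str]

def refine(hyp: Tokens,
           should_delete: Callable[[Tokens, int], bool],
           num_placeholders: Callable[[Tokens, int], int],
           fill_token: Callable[[Tokens, int], str],
           max_iters: int = 3) -> Tokens:
    """Run a few refinement passes over an initial hypothesis."""
    for _ in range(max_iters):
        # Deletion module: drop tokens flagged by the deletion policy.
        hyp = [tok for i, tok in enumerate(hyp) if not should_delete(hyp, i)]

        # Insertion module, step 1: predict how many placeholders to insert
        # after each surviving token.
        with_slots: Tokens = []
        for i, tok in enumerate(hyp):
            with_slots.append(tok)
            with_slots.extend(["<plh>"] * num_placeholders(hyp, i))

        # Insertion module, step 2: fill each placeholder with a real token.
        hyp = [fill_token(with_slots, i) if tok == "<plh>" else tok
               for i, tok in enumerate(with_slots)]
    return hyp

# Toy usage with trivial policies: delete nothing, insert nothing.
identity = refine(["a", "b", "c"],
                  should_delete=lambda h, i: False,
                  num_placeholders=lambda h, i: 0,
                  fill_token=lambda h, i: h[i])
assert identity == ["a", "b", "c"]
```

In this framing, the discrepancy the paper targets is that standard LevT training feeds the two modules differently constructed inputs (e.g., corrupted references), whereas at inference both consume the same evolving hypothesis; the proposed dual learning approach instead feeds an initial input to both refinement modules during training, mirroring the loop above.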