Low resource neural machine translation model optimization based on semantic confidence weighted alignment

Published: 2024, Last Modified: 14 Jan 2026, Int. J. Mach. Learn. Cybern. 2024, CC BY-SA 4.0
Abstract: The performance of Transformer-based neural machine translation models depends on the quality of the training data: when that data contains a high proportion of noise, model performance deteriorates. This paper addresses this loss of capability on noisy datasets by proposing an optimization method based on semantic confidence-weighted alignment. The method combines alignment metrics with model parameter confidence adjustments to recalibrate loss weights, thereby enhancing the model’s ability to identify and process noisy data. Experimental results demonstrate that the approach significantly improves translation performance on noisy datasets, particularly for low-resource language pairs such as Malay-Chinese, with a notable increase in BLEU scores over traditional methods.
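
The abstract does not state the weighting formula, so the following is only a minimal sketch of what recalibrating loss weights per sentence pair could look like. It assumes that each pair comes with an alignment score and a model confidence, both in [0, 1], and that the weight is their product normalized to a batch mean of 1. The function names, the product rule, and the normalization are illustrative assumptions, not the paper's method.

```python
import math


def sentence_nll(token_probs):
    """Mean negative log-likelihood of the reference tokens for one sentence."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def confidence_weights(alignment_scores, model_confidences):
    """Hypothetical weighting rule (an assumption, not taken from the paper):
    weight = alignment score * model confidence, rescaled so the batch mean is 1.
    """
    raw = [a * c for a, c in zip(alignment_scores, model_confidences)]
    total = sum(raw)
    if total == 0:
        # Degenerate batch: fall back to uniform weights.
        return [1.0] * len(raw)
    return [len(raw) * r / total for r in raw]


def weighted_batch_loss(batch_token_probs, alignment_scores, model_confidences):
    """Batch loss in which likely-noisy pairs contribute less to the total."""
    weights = confidence_weights(alignment_scores, model_confidences)
    losses = [sentence_nll(probs) for probs in batch_token_probs]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy batch: the second pair is poorly aligned (likely noise) and has a high loss.
batch = [[0.9, 0.8], [0.1, 0.2]]
align = [0.9, 0.1]
conf = [1.0, 1.0]

weights = confidence_weights(align, conf)
plain = sum(sentence_nll(p) for p in batch) / len(batch)
weighted = weighted_batch_loss(batch, align, conf)
```

In this toy batch the poorly aligned pair receives the smaller weight, so the weighted loss is lower than the unweighted mean and the noisy pair has less influence on the gradient. In a real training loop the same idea would be applied to per-sentence losses computed by the translation model.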