Parameter-Efficient Transfer Learning for End-to-end Speech Translation

Published: 01 Jan 2024, Last Modified: 03 Oct 2024, Venue: LREC/COLING 2024, License: CC BY-SA 4.0
Abstract: End-to-end speech translation (ST) has recently attracted significant research attention, but its progress is hindered by the limited availability of labeled data. To overcome this challenge, leveraging pre-trained models for knowledge transfer in ST has emerged as a promising direction. In this paper, we propose PETL-ST, which investigates parameter-efficient transfer learning for end-to-end speech translation. Our method uses two lightweight adaptation techniques, prefixes and adapters, to modulate the attention and feed-forward network modules, respectively, while preserving the capabilities of the pre-trained models. We conduct experiments on the MuST-C En-De, En-Es, En-Fr, and En-Ru datasets to evaluate our approach. The results demonstrate that PETL-ST outperforms strong baselines, achieving superior translation quality with high parameter efficiency. Moreover, our method is data-efficient and significantly improves performance in low-resource settings.
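The abstract names the two adaptation techniques but gives no implementation details. As a rough illustration only, the sketch below shows the commonly used formulations of each idea: a bottleneck adapter added residually to a feed-forward output, and trainable prefix key/value vectors prepended to attention. It is not the PETL-ST implementation; the dimensions, the ReLU nonlinearity, the zero-initialized up-projection, and all function and variable names are assumptions made for this example.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def adapter(h, W_down, W_up):
    """Bottleneck adapter (assumed form): h + W_up @ relu(W_down @ h)."""
    z = [max(0.0, v) for v in matvec(W_down, h)]
    delta = matvec(W_up, z)
    return [hi + di for hi, di in zip(h, delta)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def prefix_attention(q, keys, values, prefix_keys, prefix_values):
    """Single-query scaled dot-product attention with prefix vectors
    prepended to the (frozen) keys and values."""
    all_k = prefix_keys + keys
    all_v = prefix_values + values
    scale = math.sqrt(len(q))
    weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in all_k])
    dim = len(all_v[0])
    context = [sum(w * v[j] for w, v in zip(weights, all_v)) for j in range(dim)]
    return context, weights

# Hypothetical toy sizes, chosen only for illustration.
random.seed(0)
d_model, bottleneck, n_prefix, n_tokens = 4, 2, 2, 3

def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

h = rand_vec(d_model)
W_down = [rand_vec(d_model) for _ in range(bottleneck)]
W_up_zero = [[0.0] * bottleneck for _ in range(d_model)]

# With a zero-initialized up-projection the adapter starts as the identity,
# so the pre-trained behavior is unchanged before any training.
out_identity = adapter(h, W_down, W_up_zero)

keys = [rand_vec(d_model) for _ in range(n_tokens)]
values = [rand_vec(d_model) for _ in range(n_tokens)]
prefix_keys = [rand_vec(d_model) for _ in range(n_prefix)]
prefix_values = [rand_vec(d_model) for _ in range(n_prefix)]
context, weights = prefix_attention(h, keys, values, prefix_keys, prefix_values)

# Only these would be trained in this sketch; the base model stays frozen.
adapter_params = 2 * d_model * bottleneck
prefix_params = 2 * n_prefix * d_model
```

In this toy setup only the adapter matrices and the prefix vectors would be trainable, which is the sense in which such methods are parameter-efficient; the actual placement and sizes used in the paper are not stated in the abstract.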