Abstract: Fine-tuning foundation models pre-trained on large-scale data is a promising approach for low-resource scenarios. However, performance depends heavily on the amount of pre-training data available for the target language, and fine-tuning with limited data can lead to overfitting and leave room for further optimization. To address these challenges, we propose task vector-based adaptation to improve the performance of automatic speech recognition (ASR) in low-resource languages. Task vectors offer a flexible method for adjusting models effectively without retraining, making them particularly useful in low-resource scenarios. Recent work has demonstrated their effectiveness for domain adaptation in ASR, but their application has been limited to adaptation within a single language. In our experiments, we explore various combinations of task vectors and their scaling factors for ASR in low-resource languages. Our experimental results reveal that using task vectors from languages in the same family can achieve better performance than naive fine-tuning with limited data, without requiring additional parameters or data expansion.
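To make the task-vector arithmetic concrete, below is a minimal sketch of the standard formulation: a task vector is the parameter-wise difference between a fine-tuned checkpoint and its pre-trained base, and adaptation adds a scaled sum of such vectors back to the base weights. The abstract does not specify the paper's implementation, so the function names, the use of PyTorch state dicts, and the toy scaling factors here are assumptions for illustration only.

```python
import torch


def task_vector(pretrained, finetuned):
    """tau = theta_finetuned - theta_pretrained, computed per parameter tensor."""
    return {name: finetuned[name] - pretrained[name] for name in pretrained}


def apply_task_vectors(pretrained, vectors, scales):
    """theta_new = theta_pretrained + sum_i lambda_i * tau_i."""
    merged = {name: tensor.clone() for name, tensor in pretrained.items()}
    for tau, lam in zip(vectors, scales):
        for name in merged:
            merged[name] += lam * tau[name]
    return merged


# Toy demo with random "checkpoints"; in practice these would be state dicts
# of an ASR foundation model fine-tuned on related (same-family) languages.
base = {"w": torch.randn(4, 4)}
ft_lang_a = {"w": base["w"] + 0.1 * torch.randn(4, 4)}
ft_lang_b = {"w": base["w"] + 0.1 * torch.randn(4, 4)}

taus = [task_vector(base, ft_lang_a), task_vector(base, ft_lang_b)]
adapted = apply_task_vectors(base, taus, scales=[0.5, 0.5])  # lambdas are hypothetical
```

The scaling factors correspond to the lambdas whose combinations the paper reports exploring; because the vectors are simply added to existing weights, the adapted model has the same parameter count as the base model, consistent with the claim of requiring no additional parameters.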
External IDs: dblp:conf/icassp/NagasawaOI25