Phylogeny-Inspired Soft Prompts For Data-to-Text Generation in Low-Resource Languages

William Eduardo Soto Martinez, Yannick Parmentier, Claire Gardent

11 Oct 2023 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: Most work on verbalising Knowledge Graphs (KGs) has focused on high-resource languages such as English, Russian, Czech or Arabic. In this paper, we focus on KG-to-Text generation where the output text is in Breton, Irish or Welsh. To overcome the small size of the parallel training data, we combine the strengths of a multilingual encoder-decoder model with denoising fine-tuning on monolingual data and Soft Prompt fine-tuning on a small quantity of KG/text data. We furthermore structure the soft prompt into multiple sub-prompts designed to capture the similarities and differences between English, Knowledge Graphs and the three target languages. Our experiments show that our approach outperforms strong baselines and that all sub-prompts contribute to performance.
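
The idea of a soft prompt built from several sub-prompts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sub-prompt layout, so the tree below (a shared KG sub-prompt, then Celtic, then the Brittonic or Goidelic branch, then the language itself), the omission of an English sub-prompt, and every name and size (`EMBED_DIM`, `SUB_LEN`, `LINEAGE`, `compose_prompt`, `prepend_prompt`) are assumptions made for illustration. Plain Python lists stand in for trainable embedding tensors.

```python
import random

EMBED_DIM = 8  # hypothetical embedding width
SUB_LEN = 2    # hypothetical number of virtual tokens per sub-prompt

# Assumed path from the root of a simplified family tree to each target language.
LINEAGE = {
    "breton": ["celtic", "brittonic", "breton"],
    "welsh": ["celtic", "brittonic", "welsh"],
    "irish": ["celtic", "goidelic", "irish"],
}


def init_sub_prompt(rng):
    """Create one sub-prompt: SUB_LEN virtual tokens of EMBED_DIM floats each."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
            for _ in range(SUB_LEN)]


def build_sub_prompts(seed=0):
    """Create one sub-prompt per tree node, plus a shared 'kg' sub-prompt."""
    rng = random.Random(seed)
    names = {"kg"}  # assumed sub-prompt shared by all inputs (the graph side)
    for path in LINEAGE.values():
        names.update(path)
    return {name: init_sub_prompt(rng) for name in sorted(names)}


def compose_prompt(sub_prompts, language):
    """Concatenate the 'kg' sub-prompt with those on the language's lineage."""
    rows = list(sub_prompts["kg"])
    for node in LINEAGE[language]:
        rows.extend(sub_prompts[node])
    return rows


def prepend_prompt(prompt, input_embeddings):
    """Prepend the virtual tokens to the embedded input sequence."""
    return prompt + input_embeddings
```

Under this assumed layout, Breton and Welsh share every sub-prompt except the last one, while Irish shares only the KG and Celtic sub-prompts with them, so parameters learned for one language are partly reused for its relatives.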
