Exploring Cross-Lingual Transfer to Counteract Data Scarcity for Causality Detection

2022 (modified: 05 Oct 2022), WWW (Companion Volume) 2022
Abstract: Finding causal relations in text is an important task for many types of textual analysis. It is a challenging task, especially for the many languages with little or no annotated training data available. To overcome this issue, we explore cross-lingual methods. Our main focus is on Swedish, for which we have a limited amount of data and where we explore transfer from English and German. We also present additional results for German with English as a source language. We explore both a zero-shot setting without any target-language training data and a few-shot setting with a small amount of target-language data. An additional challenge is that the annotation schemes of the different data sets differ, and we discuss how this issue can be addressed. Moreover, we explore the impact of different types of sentence representations. We obtain the best results for Swedish with German as a source language, for which we have a rather small but compatible data set. We are able to take advantage of a limited amount of noisy Swedish training data, but only if we balance its classes. In addition, we find that the newer transformer-based representations can make better use of target-language data, but that a representation based on recurrent neural networks is surprisingly competitive in the zero-shot setting.