Improving Cross-Lingual Transfer for Open Information Extraction with Linguistic Feature Projection

Anonymous

17 Feb 2023 (modified: 05 May 2023) · ACL ARR 2023 February Blind Submission · Readers: Everyone
Abstract: Open Information Extraction (OpenIE) structures information from natural language text in the form of (subject, predicate, object) triples. Supervised OpenIE is, in practice, feasible only for English, the sole language for which plenty of labeled data exists. Recent research efforts have tackled multilingual OpenIE by means of zero-shot transfer from English, with massively multilingual language models as the vehicles of transfer. Given that OpenIE is a highly syntactic task, such transfer is bound to fail for languages that are syntactically complex and distant from English. In this work, we verify this for Japanese, for which the state-of-the-art OpenIE transfer approach yields near-zero performance. We then propose three Linguistic Feature Projection strategies that produce training data combining features of both the source language (English) and the target language (Japanese): (i) reordering of words in source-language utterances to match the target-language word order (RO), (ii) code-switching (CS), and (iii) insertion of Japanese case markers into English utterances (CM). Experiments on a newly constructed Japanese OpenIE benchmark show all three strategies to be effective and mutually complementary. Further, we show that RO and CS, being target language-agnostic, also yield gains in transfer to German, a language syntactically closer to the English source.
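To make the CM strategy concrete, the following is a minimal, hypothetical sketch of the idea described in the abstract: given an English sentence and a gold (subject, predicate, object) annotation, Japanese case markers are inserted after the corresponding English spans. The marker inventory (は for the subject/topic, を for the direct object) and the span-based insertion rule are illustrative assumptions, not the paper's exact procedure.

```python
# Hypothetical sketch of case-marker insertion (CM), assuming span-annotated
# triples: append は after the subject span and を after the object span.
# The actual marker inventory and rules used in the paper may differ.

def insert_case_markers(tokens, subj_span, obj_span):
    """Insert Japanese case markers into an English token sequence.

    subj_span / obj_span are (start, end) token indices, end-exclusive.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i == subj_span[1] - 1:
            out.append("は")  # topic/subject marker after the subject span
        elif i == obj_span[1] - 1:
            out.append("を")  # direct-object marker after the object span
    return out

sentence = "Alice wrote a book".split()
# subject = "Alice" (tokens 0..1), object = "a book" (tokens 2..4)
print(" ".join(insert_case_markers(sentence, (0, 1), (2, 4))))
# → Alice は wrote a book を
```

Training an OpenIE model on such mixed utterances exposes it to target-language case morphology while keeping English lexical content, which is the intuition behind the CM strategy.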
Paper Type: long
Research Area: Information Extraction