Cross-lingual Transfer Learning for Intent Detection of Covid-19 Utterances

Aug 12, 2020 (edited Sep 04, 2020) · EMNLP 2020 Workshop NLP-COVID Submission
  • Keywords: NLP, Covid, xlm, bert, cross-lingual, transfer learning, elmo, muse
  • Abstract: In times of a global pandemic, interactive chatbots are an indispensable tool for providing information to people. With this motivation, we study the problem of intent detection of user utterances, which is usually the first language understanding step in such systems. Specifically, we focus on cross-lingual transfer learning for intent detection of user utterances and zero-shot learning for code-switched (CS) utterances. We release a multilingual dataset, M-CID, containing 6871 utterances across English, Spanish, French, German and Spanglish (Spanish + English). We use this dataset to explore several cross-lingual transfer learning techniques, studying: (1) monolingual and multilingual model baselines, (2) cross-lingual transfer from English to Spanish, French and German, and (3) zero-shot code-switching for Spanglish. In our experiments, we observe that XLM-R models significantly outperform cross-lingual word embedding techniques in all of the above settings. We also show that it is possible to obtain strong performance on code-switched data using only monolingual data from the substrate languages.
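The zero-shot cross-lingual setup described in the abstract can be illustrated with a toy sketch: train an intent classifier on English utterances embedded in a shared cross-lingual space (in the spirit of MUSE-style aligned word embeddings), then apply it unchanged to Spanish utterances. All vectors, utterances and intent labels below are hypothetical illustrations, not drawn from M-CID or from actual MUSE embeddings.

```python
# Toy sketch of zero-shot cross-lingual intent detection via a shared
# embedding space, assuming MUSE-style aligned word vectors in which
# translation pairs map to nearby points. All data below is made up.

from collections import defaultdict

# Hypothetical 2-d aligned word vectors (real MUSE vectors are 300-d).
ALIGNED_VECS = {
    # English
    "symptoms": (1.0, 0.1), "fever": (0.9, 0.2), "cough": (0.95, 0.0),
    "mask": (0.1, 1.0), "wear": (0.2, 0.9),
    # Spanish: close to their English translations in the shared space
    "sintomas": (0.98, 0.12), "fiebre": (0.88, 0.22), "tos": (0.93, 0.02),
    "mascarilla": (0.12, 0.98), "usar": (0.22, 0.88),
}

def embed(utterance):
    """Bag-of-embeddings: average the aligned vectors of known words."""
    vecs = [ALIGNED_VECS[w] for w in utterance.lower().split() if w in ALIGNED_VECS]
    if not vecs:
        return (0.0, 0.0)
    n = len(vecs)
    return (sum(v[0] for v in vecs) / n, sum(v[1] for v in vecs) / n)

def train_centroids(labelled_utterances):
    """One centroid per intent, computed from English training data only."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for text, intent in labelled_utterances:
        x, y = embed(text)
        s = sums[intent]
        s[0] += x; s[1] += y; s[2] += 1
    return {intent: (s[0] / s[2], s[1] / s[2]) for intent, s in sums.items()}

def predict(centroids, utterance):
    """Nearest-centroid intent prediction (squared Euclidean distance)."""
    x, y = embed(utterance)
    return min(centroids,
               key=lambda i: (centroids[i][0] - x) ** 2 + (centroids[i][1] - y) ** 2)

# Train on English only ...
centroids = train_centroids([
    ("what are the symptoms", "ask_symptoms"),
    ("i have fever and cough", "ask_symptoms"),
    ("should i wear a mask", "ask_mask"),
])

# ... then evaluate zero-shot on Spanish.
print(predict(centroids, "tengo fiebre y tos"))    # → ask_symptoms
print(predict(centroids, "debo usar mascarilla"))  # → ask_mask
```

The paper's stronger XLM-R baseline replaces the static averaged embeddings with a fine-tuned multilingual encoder, but the transfer mechanism is analogous: a classifier trained only on English generalizes to other languages (and to code-switched text) through a shared representation space.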