DXA-Net: Dual-Task Cross-Lingual Alignment Network for Zero-Shot Cross-Lingual Spoken Language Understanding
Abstract: State-of-the-art zero-shot cross-lingual spoken language understanding (SLU) models use cross-lingual unsupervised contrastive learning to achieve multilingual semantic alignment. While existing methods have achieved promising results, two issues still limit cross-lingual knowledge transfer: (1) dual-task correlative knowledge is not explicitly modeled and transferred to target languages; and (2) the semantic differences among samples are ignored, so contrastive semantic knowledge is not transferred to target languages. In this paper, we propose the dual-task cross-lingual alignment network (DXA-Net), which makes the first attempt to tackle zero-shot cross-lingual SLU under the prompt-tuning paradigm. To address the first issue, we propose a co-guiding prompt, which allows the model to conditionally generate one task’s label based on the other’s. To address the second issue, we propose intent/slot contrastive prompts that teach the model to discriminate whether a pair of samples has the same or similar labels. Additionally, we propose a multilingual semantics contrastive prompt to strengthen multilingual semantic alignment. Experiments on the benchmark show that our model achieves new state-of-the-art performance across nine languages.
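As a rough illustration of the multilingual semantic alignment idea the abstract refers to (not DXA-Net's actual method, which the abstract does not specify in detail), a common way to align sentence embeddings across languages is a symmetric InfoNCE-style contrastive loss over translation pairs. The sketch below is a generic assumption: the function name, shapes, and temperature value are all hypothetical.

```python
# Minimal sketch (NOT the paper's implementation): an InfoNCE-style
# contrastive loss that pulls together embeddings of aligned utterance
# pairs (e.g., an English utterance and its target-language counterpart)
# and pushes apart non-pairs within the batch.
import torch
import torch.nn.functional as F

def multilingual_contrastive_loss(src_emb: torch.Tensor,
                                  tgt_emb: torch.Tensor,
                                  temperature: float = 0.07) -> torch.Tensor:
    """src_emb, tgt_emb: (batch, dim) embeddings of aligned sentence pairs."""
    src = F.normalize(src_emb, dim=-1)   # move to cosine-similarity space
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature   # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0), device=src.device)  # i-th row matches i-th column
    # Symmetric InfoNCE: source->target and target->source directions.
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.T, labels)) / 2

# Toy usage with random embeddings:
src = torch.randn(8, 256)
tgt = torch.randn(8, 256)
print(multilingual_contrastive_loss(src, tgt))
```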
DOI: 10.1109/tpami.2025.3597726