- Keywords: Transformer, Spoken Language Understanding
- TL;DR: We show how a transformer-based architecture can be used to build an end-to-end SLU system
- Abstract: End-to-end spoken language understanding (SLU) systems directly map speech to intent through a single trainable model, whereas conventional SLU systems use Automatic Speech Recognition (ASR) to convert speech to text and then apply Natural Language Understanding (NLU) to extract intent. In this paper, we show how a transformer-based architecture can be used to build end-to-end SLU systems. We conduct experiments on the Fluent Speech Commands (FSC) dataset, where intents are formed as combinations of three slots, namely action, object, and location. We also demonstrate how state-of-the-art results can be obtained using a combination of various data augmentation methods.
- Double Submission: No