Parallel Temporal Encoder For Sign Language Translation

2019 (modified: 25 Mar 2022), ICIP 2019
Abstract: This paper addresses sign video interpretation, a weakly supervised task in which individual sign actions in a video lack exact boundaries or labels. We design a Parallel Temporal Encoder (PTEnc) that learns the temporal relations of a sign video from local and global sequential learning views in parallel, exploiting the complementarity between local and global temporal cues. The fused encoded feature sequence is then fed into a Connectionist Temporal Classification (CTC) based sentence decoder. To enhance the temporal cues in each video, we also introduce a reconstruction loss that operates in an unsupervised way, without additional labels. The CTC loss and the reconstruction loss are optimized jointly in end-to-end training. Experimental results on a benchmark dataset demonstrate the effectiveness of the proposed method.
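As a rough illustration of how a CTC loss and a reconstruction loss can be combined into one training objective, here is a minimal standard-library sketch. It is not the paper's implementation: the abstract does not give the form of the reconstruction loss or how the two terms are weighted, so the mean squared error and the `weight` parameter below are assumptions, and all function names are hypothetical.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """CTC loss for one sequence, using the standard forward algorithm.

    probs:  list of T per-frame probability distributions over the vocabulary.
    target: list of label ids without blanks.
    """
    # Extended label sequence with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # Initialisation: a path may start with a blank or with the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end on the final blank or on the final label.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)


def reconstruction_loss(features, reconstructed):
    """Mean squared error between two feature sequences (assumed form)."""
    total, count = 0.0, 0
    for frame, frame_hat in zip(features, reconstructed):
        for a, b in zip(frame, frame_hat):
            total += (a - b) ** 2
            count += 1
    return total / count


def joint_loss(probs, target, features, reconstructed, weight=1.0):
    """Weighted sum of both terms; `weight` is an assumed hyperparameter."""
    return (ctc_neg_log_likelihood(probs, target)
            + weight * reconstruction_loss(features, reconstructed))
```

The reconstruction term needs only the input features and their reconstruction, which is why it adds no labelling cost, while the CTC term sums over all frame-level alignments and so does not require sign boundaries.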