LoPE: Learnable Sinusoidal Positional Encoding for Improving Document Transformer Model

Anonymous

16 Jan 2022 (modified: 05 May 2023) · ACL ARR 2022 January Blind Submission
Abstract: Positional encoding plays a key role in Transformer-based architectures: it indicates and embeds the sequential order of tokens. Understanding documents whose reading-order information is unreliable is a real challenge for document Transformer models. This paper proposes a new and generic positional encoding method, learnable sinusoidal positional encoding (LoPE), which combines the sinusoidal positional encoding function with a learnable feed-forward network. We apply LoPE to a document Transformer model and pretrain the model on document datasets. We then finetune and evaluate the model on document understanding tasks in the form and receipt domains. Experimental results show that our proposed method not only outperforms other baselines and state-of-the-art methods, but also remains robust and stable when handling noisy data with incorrect reading-order information.
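The abstract only describes LoPE at a high level (sinusoidal encoding composed with a learnable feed-forward network), so the following is a minimal PyTorch sketch of that idea, not the paper's implementation. The class name, hidden size, activation, and the choice to add the transformed encoding to token embeddings are all illustrative assumptions.

```python
import math
import torch
import torch.nn as nn

class LoPESketch(nn.Module):
    """Hypothetical sketch of LoPE: a fixed sinusoidal positional
    table passed through a learnable feed-forward network.
    Architectural details are assumptions, not taken from the paper."""

    def __init__(self, d_model: int, max_len: int = 512, d_hidden: int = 2048):
        super().__init__()
        # Standard fixed sinusoidal table (Vaswani et al., 2017).
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float)
            * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)
        # Learnable feed-forward network applied on top of the
        # sinusoidal code; this is what makes the encoding trainable.
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token embeddings.
        seq_len = x.size(1)
        pos = self.ffn(self.pe[:seq_len])      # (seq_len, d_model)
        return x + pos.unsqueeze(0)            # broadcast over batch
```

Under this reading, the sinusoidal function supplies a smooth, extrapolatable prior over positions, while the feed-forward network lets the model reshape that prior during pretraining, which is one plausible source of the robustness to noisy reading order reported in the abstract.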
Paper Type: long