Novel positional encodings to enable tree-structured transformers

Vighnesh Leonardo Shiv; Chris Quirk

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: With interest in program synthesis and similarly flavored problems rapidly increasing, neural models optimized for tree-domain problems are of great value. In the sequence domain, transformers can learn relationships across arbitrary pairs of positions with less bias than recurrent models. Under the intuition that a similar property would be beneficial in the tree domain, we propose a method to extend transformers to tree-structured inputs and/or outputs. Our approach abstracts the transformer's default sinusoidal positional encodings, allowing us to substitute in a novel custom positional encoding scheme that represents node positions within a tree. We evaluated our model in tree-to-tree program translation and sequence-to-tree semantic parsing settings, achieving superior performance over the vanilla transformer model on several tasks.

TL;DR: We develop novel positional encodings for tree-structured data, enabling transformers to be applied to tree-structured problems.

Keywords: program translation, tree structures, transformer
