Novel positional encodings to enable tree-structured transformers

Vighnesh Leonardo Shiv, Chris Quirk

Sep 27, 2018 ICLR 2019 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: With interest in program synthesis and similarly flavored problems rapidly increasing, neural models optimized for tree-domain problems are of great value. In the sequence domain, transformers can learn relationships across arbitrary pairs of positions with less bias than recurrent models. Under the intuition that a similar property would be beneficial in the tree domain, we propose a method to extend transformers to tree-structured inputs and/or outputs. Our approach abstracts transformer's default sinusoidal positional encodings, allowing us to substitute in a novel custom positional encoding scheme that represents node positions within a tree. We evaluated our model in tree-to-tree program translation and sequence-to-tree semantic parsing settings, achieving superior performance over the vanilla transformer model on several tasks.
  • TL;DR: We develop novel positional encodings for tree-structured data, enabling transformers to be applied to tree structured problems.
  • Keywords: program translation, tree structures, transformer
0 Replies