A Structural Transformer with Relative Positions in Trees for Code-to-Sequence Tasks

Anonymous

04 Jun 2020 (modified: 13 Dec 2020) | OpenReview Anonymous Preprint Blind Submission | Readers: Everyone

Keywords: machine learning, nlp, transformer, tree, structural, seq2seq, code2seq, source code, code, AST, abstract syntax tree, code summarization, machine translation

TL;DR: We incorporate tree structure into transformers by extending self-attention with relative position representations that encode the movements between nodes in the tree, and we outperform the state of the art on several code-to-sequence tasks by up to 6% F1.

Abstract: We propose two approaches for incorporating syntactic information into transformer models that encode trees (e.g. abstract syntax trees) and generate sequences. First, we use self-attention with relative position representations to capture structural relationships between nodes, using a representation that encodes the movements between any pair of nodes in the tree, and we demonstrate how those movements can be computed efficiently on the fly. Second, we propose an auxiliary loss that requires the network to predict the lowest common ancestor of node pairs. We apply both methods to source code summarization tasks, where we outperform the state of the art by up to 6% F1. On natural language machine translation, our models yield competitive results while being substantially faster than others. We also consistently outperform sequence-based transformers, and we demonstrate that our method yields representations that are more closely aligned with the tree's structure.
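To make the two ideas in the abstract concrete, the following is a minimal sketch of how a movement between two tree nodes and their lowest common ancestor could be computed. It is an illustration under assumptions, not the method of the paper: the parent-array tree encoding, the function names, and the (steps up, steps down) form of the movement are all hypothetical, and this naive version walks to the root for every pair rather than using the efficient on-the-fly computation the abstract refers to.

```python
from typing import List, Optional, Tuple

# Assumed encoding: parents[i] is the parent of node i, None for the root.
Parents = List[Optional[int]]


def path_to_root(parents: Parents, node: int) -> List[int]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parents[path[-1]] is not None:
        path.append(parents[path[-1]])
    return path


def lowest_common_ancestor(parents: Parents, a: int, b: int) -> int:
    """Return the deepest node that is an ancestor of both a and b."""
    ancestors_of_a = set(path_to_root(parents, a))
    # Walking up from b, the first node that is also above a is the LCA.
    for node in path_to_root(parents, b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes are not in the same tree")


def relative_movement(parents: Parents, a: int, b: int) -> Tuple[int, int]:
    """Return (steps up from a to the LCA, steps down from the LCA to b).

    This pair is one hypothetical way to describe the movement from a to b.
    """
    lca = lowest_common_ancestor(parents, a, b)
    up = path_to_root(parents, a).index(lca)
    down = path_to_root(parents, b).index(lca)
    return up, down


# Toy tree: node 0 is the root with children 1 and 2;
# node 1 has children 3 and 4; node 2 has child 5.
parents: Parents = [None, 0, 0, 1, 1, 2]

sibling_lca = lowest_common_ancestor(parents, 3, 4)
cousin_move = relative_movement(parents, 3, 5)
```

In this toy tree, nodes 3 and 4 are siblings, so their lowest common ancestor is their shared parent, node 1. Moving from node 3 to node 5 requires going up two steps to the root and down two steps. In a relative-position attention layer, such pairs would typically index a learned embedding table, and the lowest common ancestor could serve as the prediction target for an auxiliary loss.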
