Abstract: Modeling tasks often take inputs from languages, including programming languages and natural language. Many such tasks involve learning functions that are invariant to certain classes of input transformations. In this work we consider one such class: semantics-preserving variable renaming. We first show that transformer networks trained on these tasks do not always mirror the invariance of the underlying function. We then propose Renamer, a transformer architecture that is invariant to semantics-preserving variable renaming. In a case study on learning a surrogate of a large-scale CPU simulator, Renamer reduces error by 24.79% to 52.80% relative to a vanilla transformer. Furthermore, the invariant network does not exhibit the same sensitivity to variable renaming: its error remains constant when evaluated on a variable-renamed version of the test set. Finally, the invariant network is more efficient to train, matching the best error of the vanilla network with 25.15% to 60.00% fewer training epochs.
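The invariance the abstract refers to can be stated as a simple property: a model f should produce the same output on a program and on any semantics-preserving renaming of its variables, i.e. f(rename(x)) = f(x). The sketch below is illustrative only and is not taken from the paper; the `model`, `rename_variables`, and token names are hypothetical placeholders.

```python
# Illustrative sketch (not the paper's implementation) of the renaming-invariance
# property: a model should give the same prediction on a program and on a
# consistently variable-renamed copy of it. All names here are hypothetical.

def rename_variables(tokens, mapping):
    """Apply a consistent, semantics-preserving variable renaming to a token list."""
    return [mapping.get(tok, tok) for tok in tokens]

def is_renaming_invariant(model, tokens, mapping, tol=1e-6):
    """Check whether the model's scalar prediction is unchanged under renaming."""
    original = model(tokens)
    renamed = model(rename_variables(tokens, mapping))
    return abs(original - renamed) <= tol

# Example: two programs that differ only in variable names.
program = ["x", "=", "a", "+", "b", ";", "return", "x"]
mapping = {"x": "tmp", "a": "lhs", "b": "rhs"}
# A renaming-invariant architecture (like the proposed Renamer) should always
# pass this check; the abstract reports that a vanilla transformer may not.
```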
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning