Keywords: Transformer Extrapolation; Combinatorial Optimization
Abstract: Transformers can be trained on shorter inputs and applied directly to longer ones at test time, a property known as length extrapolation. In natural language processing, many length-extrapolation methods have been proposed, most of which rely on position embeddings. In combinatorial optimization problems, however, Transformers do not use position embeddings. We aim to achieve length extrapolation in this setting as well. To this end, we propose an entropy-invariant extrapolation method (EIE), which requires no position embeddings and applies a different scale factor for each input length. Unlike prior work, our approach requires no retraining. Results on multiple combinatorial optimization datasets show that our method outperforms existing ones.
Supplementary Material: zip
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4834
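
The abstract does not say how EIE computes its length-dependent scale factors, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that "entropy invariant" means choosing, for a longer input, an attention scale factor that makes the softmax entropy match the entropy observed at the training length. The function names (`softmax_entropy`, `entropy_matching_scale`), the bisection search, and the random scores are all hypothetical choices made for this example.

```python
import math
import random


def softmax_entropy(scores, scale):
    """Shannon entropy (in nats) of softmax(scale * scores), for scale >= 0."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(scale * (s - m)) for s in scores]
    z = sum(exps)
    # Terms that underflowed to 0 contribute nothing to the entropy.
    return -sum((e / z) * math.log(e / z) for e in exps if e > 0.0)


def entropy_matching_scale(scores, target_entropy, tol=1e-6, max_iter=200):
    """Find a scale whose softmax entropy over `scores` is close to the target.

    Entropy is non-increasing in the scale (for scale >= 0), so a simple
    bisection is enough. This search strategy is an assumption of the sketch.
    """
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the entropy drops to the target or below.
    while softmax_entropy(scores, hi) > target_entropy and hi < 1e6:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if softmax_entropy(scores, mid) > target_entropy:
            lo = mid  # distribution still too flat: increase the scale
        else:
            hi = mid  # distribution too peaked: decrease the scale
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    d, n_train, n_test = 16, 20, 100  # hypothetical sizes

    query = [rng.gauss(0.0, 1.0) for _ in range(d)]

    def random_scores(n):
        # Unscaled dot products between one query and n random keys.
        return [
            sum(q * rng.gauss(0.0, 1.0) for q in query) for _ in range(n)
        ]

    base_scale = 1.0 / math.sqrt(d)  # standard attention scaling
    target = softmax_entropy(random_scores(n_train), base_scale)

    test_scores = random_scores(n_test)
    new_scale = entropy_matching_scale(test_scores, target)
    print("entropy at training length:", target)
    print("entropy at test length, base scale:",
          softmax_entropy(test_scores, base_scale))
    print("entropy at test length, matched scale:",
          softmax_entropy(test_scores, new_scale))
```

Because the scale is adjusted only at inference time, a scheme of this kind would leave the trained weights untouched, which is consistent with the abstract's claim that no retraining is needed.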