Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Published: 28 Feb 2025 · Last Modified: 17 Apr 2025 · WRL@ICLR 2025 Oral · License: CC BY 4.0
Track: full paper
Keywords: position encodings for Transformers, RoPE, Robotics, translation-invariance, Lie algebra
TL;DR: This paper introduces STRING: a general class of separable translation-invariant position encodings for Transformers (extending Rotary Position Encodings), and applies them in a wide range of 2D and 3D scenarios, in particular in Robotics.
Abstract: We introduce $\textbf{STRING}$: Separable Translationally Invariant Position Encodings. STRING extends Rotary Position Encodings, a recently proposed and widely used algorithm in large language models, via a unifying theoretical framework. Importantly, STRING still provides translation invariance, including for token coordinates of arbitrary dimensionality, whilst maintaining a low computational footprint. These properties are especially important in robotics, where efficient 3D token representation is key. We integrate STRING into Vision Transformers with RGB(-D) inputs (color plus optional depth), showing substantial gains, e.g., in open-vocabulary object detection and for robotics controllers. We complement our experiments with a rigorous mathematical analysis, proving the universality of our methods. Videos of STRING-based robotics controllers can be found here: https://sites.google.com/view/string-robotics.
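The translation invariance highlighted in the abstract is the property that RoPE already provides for 1D token positions and that STRING generalizes to coordinates of arbitrary dimensionality. The sketch below is a minimal NumPy illustration of that 1D RoPE property only, not the paper's implementation; the function names, dimensions, and frequency schedule are illustrative assumptions. It shows that rotating queries and keys by position-dependent orthogonal matrices makes the attention logit depend only on the relative offset between positions, so shifting all positions by the same amount leaves the score unchanged.

```python
# Minimal sketch (not the paper's code) of the translation-invariance property
# that STRING inherits from Rotary Position Encodings (RoPE) in 1D:
# the logit <R(p_q) q, R(p_k) k> equals <R(p_q - p_k) q, k>, i.e. it depends
# only on the relative position. Names and constants here are assumptions.
import numpy as np

def rotation(pos, dim=8, base=10000.0):
    """Block-diagonal matrix of 2x2 rotations, as used by 1D RoPE."""
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)  # one frequency per 2D block
    angles = pos * inv_freq                           # rotation angle per block
    R = np.zeros((dim, dim))
    for i, a in enumerate(angles):
        c, s = np.cos(a), np.sin(a)
        R[2 * i:2 * i + 2, 2 * i:2 * i + 2] = [[c, -s], [s, c]]
    return R

def attention_logit(q, k, pos_q, pos_k):
    """Dot product of the rotated query and key."""
    return (rotation(pos_q) @ q) @ (rotation(pos_k) @ k)

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)

# Shifting both positions by the same offset does not change the logit,
# so the attention score is a function of the relative position only.
print(np.isclose(attention_logit(q, k, 3.0, 7.0),
                 attention_logit(q, k, 3.0 + 5.0, 7.0 + 5.0)))  # True
```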
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 76