Keywords: Embedding Models, Steering Vectors, High Dimensional Geometry, LLMs, Unit Hypersphere
TL;DR: Rotor-Invariant Shift Estimation (RISE), a geometric framework that enables cross-lingual and cross-model transfer of semantic transformations, with improved performance over baseline methods on transformer-based embeddings.
Abstract: Understanding how language and embedding models encode semantic relationships is fundamental to model interpretability.
While early word embeddings exhibited intuitive vector arithmetic ("king" - "man" + "woman" = "queen"), modern high-dimensional text representations lack straightforward interpretable geometric properties.
We introduce Rotor-Invariant Shift Estimation (RISE), a geometric approach that represents semantic-syntactic transformations as consistent rotational operations in embedding space, leveraging the manifold structure of modern language representations.
RISE operations transfer across both languages and models without performance degradation, suggesting the existence of analogous cross-lingual geometric structure.
We evaluate RISE against two baseline methods across three embedding models, three datasets, and seven morphologically diverse languages from five major language groups.
Our results demonstrate that RISE consistently maps discourse-level semantic-syntactic transformations with distinct grammatical features (e.g., negation and conditionality) across languages and models.
This work provides the first demonstration that discourse-level semantic-syntactic transformations correspond to consistent geometric operations in multilingual embedding spaces, empirically supporting the linear representation hypothesis at the sentence level.
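The classic word-vector arithmetic mentioned above can be made concrete with a minimal sketch. This is purely illustrative (not the paper's RISE method): it uses hand-crafted 2-D feature vectors [royalty, gender] so the analogy is exact, and resolves the result by cosine similarity as early embedding work did.

```python
import numpy as np

# Toy vocabulary with hand-crafted features [royalty, gender].
# Real embeddings are high-dimensional and learned; these values are
# chosen only so the analogy arithmetic works out exactly.
emb = {
    "king":  np.array([1.0,  1.0]),
    "man":   np.array([0.0,  1.0]),
    "woman": np.array([0.0, -1.0]),
    "queen": np.array([1.0, -1.0]),
}

def analogy(a, b, c):
    """Return the vocabulary word whose vector is closest (by cosine
    similarity) to emb[a] - emb[b] + emb[c]."""
    target = emb[a] - emb[b] + emb[c]

    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    return max(emb, key=lambda w: cos(emb[w], target))

print(analogy("king", "man", "woman"))  # -> queen
```

At the sentence level, the linear representation hypothesis supported here posits that discourse-level transformations (e.g., negation) behave analogously as consistent geometric operations in embedding space.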
Primary Area: interpretability and explainable AI
Submission Number: 8045