Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence

27 Sept 2024 (modified: 05 Feb 2025) | Submitted to ICLR 2025 | Readers: Everyone | License: CC BY 4.0
Keywords: embedding methods, interchangeable tokens, extendable vocabulary, transformer models, linear temporal logic, formal reasoning, token generalization, alpha-equivalence, inductive bias, language models, symbolic representation, neural networks
TL;DR: Our novel approach for learning interchangeable tokens generalizes to larger vocabulary sizes as demonstrated in linear temporal logic.
Abstract: We propose a novel approach for learning interchangeable tokens in language models to obtain an extendable vocabulary that can generalize to new tokens. Our method addresses alpha-equivalence, the principle that renaming bound variables preserves semantics. This property arises in many formal languages such as temporal logics, where all proposition symbols represent the same concept but remain distinct. To handle such tokens, we develop a dual-part embedding approach. The first part is shared across all interchangeable tokens, enforcing that they represent the same core concept. The second part is randomly generated for each token, enabling distinguishability. As a baseline, we consider a simpler approach that uses alpha-renaming for data augmentation. We also present alpha-covariance, a metric for measuring robustness against alpha-conversions. Evaluated with a Transformer encoder-decoder model on solving linear temporal logic formulae and on a copying task with an extendable vocabulary, our method demonstrates promising generalization capabilities as well as a favorable inductive bias for alpha-equivalence.
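A minimal sketch of what such a dual-part embedding could look like in PyTorch is given below. This is an illustrative reconstruction from the abstract, not the authors' code; the module name DualPartEmbedding, the dimension split, and the per-sequence resampling of the random part are all assumptions.

```python
# Illustrative sketch (assumption, not the authors' implementation) of a
# dual-part embedding: interchangeable tokens share one learned "concept"
# vector and differ only in a randomly sampled part.
import torch
import torch.nn as nn


class DualPartEmbedding(nn.Module):
    """Embeds interchangeable tokens as [shared part ; random part].

    Regular tokens use an ordinary learned embedding table. Each
    interchangeable token (e.g. a proposition symbol) receives the same
    learned shared vector plus a freshly sampled random vector whose only
    role is to keep distinct symbols distinguishable within a sequence.
    """

    def __init__(self, num_regular: int, num_interchangeable: int,
                 shared_dim: int, random_dim: int):
        super().__init__()
        self.d_model = shared_dim + random_dim
        self.random_dim = random_dim
        self.num_regular = num_regular
        self.num_interchangeable = num_interchangeable
        # Ordinary learned embeddings for non-interchangeable tokens.
        self.regular = nn.Embedding(num_regular, self.d_model)
        # Single shared vector representing the common concept.
        self.shared = nn.Parameter(torch.randn(shared_dim))

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len); ids >= num_regular are interchangeable.
        batch, seq_len = token_ids.shape
        out = torch.zeros(batch, seq_len, self.d_model,
                          device=token_ids.device)

        regular_mask = token_ids < self.num_regular
        out[regular_mask] = self.regular(token_ids[regular_mask])

        # One random part per interchangeable symbol per batch element, so an
        # alpha-renaming only permutes these random parts.
        rand_parts = torch.randn(batch, self.num_interchangeable,
                                 self.random_dim, device=token_ids.device)
        inter_ids = (token_ids - self.num_regular).clamp(min=0)
        gathered = torch.gather(
            rand_parts, 1,
            inter_ids.unsqueeze(-1).expand(-1, -1, self.random_dim))
        shared = self.shared.expand(batch, seq_len, -1)
        inter_emb = torch.cat([shared, gathered], dim=-1)

        inter_mask = ~regular_mask
        out[inter_mask] = inter_emb[inter_mask]
        return out
```

Because the shared part is identical for every interchangeable token, the vocabulary can in principle be extended at inference time simply by sampling additional random parts, without adding new learned parameters.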
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11737