The Dynamic Interaction Field Transformer: A Universal, Tokenizer-Free Language Architecture

ICLR 2026 Conference Submission 20670 Authors

19 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: language modeling, transformer architectures, tokenizer-free models, continuous representations, self-attention alternatives, hybrid models, byte-level processing
TL;DR: We introduce DIFT, a tokenizer-free language architecture that uses hierarchical byte aggregation and continuous interaction fields to achieve performance competitive with BERT-base on GLUE.
Abstract: Standard language models are limited by their reliance on subword tokenization, which introduces cross-lingual disparities and handles morphologically rich languages and out-of-domain text poorly. Additionally, the opaque, all-to-all self-attention mechanisms these models employ hinder interpretability. In this work, we propose a theoretical framework that models language as a \textbf{hybrid discrete-continuous system}, addressing both challenges. We introduce the \textbf{Dynamic Interaction Field Transformer (DIFT)}, the first tokenizer-free transformer architecture to achieve word-level computational efficiency together with competitive performance on standard benchmarks. DIFT operates directly on raw bytes, using \textbf{hierarchical aggregation} to learn word representations from scratch. To model context, DIFT replaces standard self-attention with an interpretable \textbf{continuous interaction field}, in which concepts influence one another through proximity in a shared semantic space. We demonstrate that DIFT, trained from scratch, achieves performance competitive with BERT-base on GLUE. Its tokenizer-free design also improves robustness and enables superior zero-shot multilingual performance, overcoming the core limitations of subword-based models. DIFT offers a new direction for more robust, interpretable, and theoretically grounded language architectures.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 20670
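
The abstract describes two mechanisms without giving equations: hierarchical aggregation of raw bytes into word-level representations, and a proximity-based interaction field that replaces dot-product self-attention. The PyTorch snippet below is a minimal illustrative sketch of how those two ideas could be wired together; it is not the submission's implementation, and every detail (whitespace word boundaries, mean pooling over byte embeddings, the Gaussian-kernel form of the field, and all names and dimensions) is an assumption made for illustration only.

```python
# Illustrative sketch only -- not the DIFT authors' code. Assumes whitespace
# word boundaries, mean-pooled byte embeddings, and an RBF-kernel field.
import torch
import torch.nn as nn


class ByteToWordAggregator(nn.Module):
    """Embed raw bytes, then mean-pool bytes within whitespace-delimited spans."""

    def __init__(self, d_model: int = 256):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)  # one embedding per byte value

    def forward(self, text: str) -> torch.Tensor:
        words = text.encode("utf-8").split()  # hypothetical word-boundary rule
        word_vecs = []
        for w in words:
            ids = torch.tensor(list(w), dtype=torch.long)
            word_vecs.append(self.byte_embed(ids).mean(dim=0))  # bytes -> one word vector
        return torch.stack(word_vecs)  # (num_words, d_model)


class InteractionField(nn.Module):
    """Mix word vectors by proximity in a learned semantic space (RBF kernel)
    instead of all-to-all dot-product attention."""

    def __init__(self, d_model: int = 256, d_sem: int = 64, length_scale: float = 1.0):
        super().__init__()
        self.to_sem = nn.Linear(d_model, d_sem)  # project into the shared semantic space
        self.to_val = nn.Linear(d_model, d_model)
        self.length_scale = length_scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.to_sem(x)                      # (n, d_sem) positions in semantic space
        dists = torch.cdist(z, z)               # pairwise Euclidean distances
        field = torch.exp(-(dists ** 2) / (2 * self.length_scale ** 2))
        field = field / field.sum(dim=-1, keepdim=True)  # normalize each row's influence
        return x + field @ self.to_val(x)       # residual update driven by the field


if __name__ == "__main__":
    agg = ByteToWordAggregator()
    field = InteractionField()
    words = agg("tokenizer free models read raw bytes")
    out = field(words)
    print(words.shape, out.shape)  # torch.Size([6, 256]) for both
```

One reason a kernel of this shape fits the abstract's framing: each word's influence decays smoothly with distance in the learned semantic space, so the interaction weights can be read directly as proximities, which is the kind of structure that would support the interpretability claim.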