Language Modeling Using Tensor Trains

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023, Readers: Everyone
Keywords: Tensor network, RNNs, Language modeling
Abstract: Tensor networks have previously been shown to have theoretical potential for language modeling, but practical evidence supporting this has been lacking. We propose a novel Tensor Train Language Model (TTLM) based on Tensor-Train decomposition. We prove that TTLM generalizes Second-order Recurrent Neural Networks (RNNs), Recurrent Arithmetic Circuits, and Multiplicative Integration RNNs, in the sense that the architectures of all of these are, essentially, special cases of that of TTLM. To show the usefulness of TTLM, we perform a principled experimental evaluation on language modeling tasks, showing that our proposed variants, TTLM-Large and TTLM-Tiny, can be more effective than a Vanilla RNN, while TTLM-Tiny uses half the model size.
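To make the connection between Tensor-Train decomposition and recurrent architectures concrete, the sketch below shows a second-order-RNN-style update whose 3rd-order weight tensor is stored in tensor-train format rather than explicitly. This is a minimal NumPy illustration under assumed, hypothetical dimensions (`input_dim`, `hidden_dim`, TT ranks `r1`, `r2`); it is not the TTLM architecture or variants from the paper.

```python
import numpy as np

# Hypothetical sizes for illustration only (not taken from the paper).
input_dim, hidden_dim, r1, r2 = 16, 8, 4, 4
rng = np.random.default_rng(0)

# A second-order RNN computes
#   h_t[j] = tanh( sum_{i,k} W[i, j, k] * x_t[i] * h_{t-1}[k] )
# with a 3rd-order weight tensor W. In tensor-train (TT) format, W is
# never materialized; it is represented by three small cores:
#   W[i, j, k] = sum_{a,b} G1[i, a] * G2[a, j, b] * G3[b, k].
G1 = rng.normal(scale=0.1, size=(input_dim, r1))
G2 = rng.normal(scale=0.1, size=(r1, hidden_dim, r2))
G3 = rng.normal(scale=0.1, size=(r2, hidden_dim))

def tt_rnn_step(x, h):
    """One recurrent update computed directly from the TT cores."""
    a = x @ G1                               # contract input with first core: (r1,)
    b = G3 @ h                               # contract previous hidden state with last core: (r2,)
    pre = np.einsum('a,ajb,b->j', a, G2, b)  # contract with middle core: (hidden_dim,)
    return np.tanh(pre)

# Toy usage: feed a short sequence of one-hot "tokens" through the cell.
h = np.zeros(hidden_dim)
for token in [3, 7, 1]:
    x = np.zeros(input_dim)
    x[token] = 1.0
    h = tt_rnn_step(x, h)
print(h.shape)  # (hidden_dim,)
```

The point of the sketch is the parameter count: the full tensor would need input_dim * hidden_dim^2 entries, while the TT cores need only input_dim * r1 + r1 * hidden_dim * r2 + r2 * hidden_dim, which is the kind of compression that motivates TT-based recurrent models.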
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning