Language Modeling Using Tensor Trains

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023, Readers: Everyone
Keywords: Tensor network, RNNs, Language modeling
Abstract: Tensor networks have previously been shown to have theoretical potential for language modeling, but practical evidence supporting this has been lacking. We propose a novel Tensor Train Language Model (TTLM) based on Tensor-Train decomposition. We prove that TTLM generalizes Second-order Recurrent Neural Networks (RNNs), Recurrent Arithmetic Circuits, and Multiplicative Integration RNNs, in the sense that the architectures of all of these are, essentially, special cases of that of TTLM. To show the usefulness of TTLM, we perform a principled experimental evaluation on language modeling tasks, showing that our proposed variants, TTLM-Large and TTLM-Tiny, can be more effective than a Vanilla RNN, while TTLM-Tiny uses half the model size.
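To make the connection between Tensor-Train decomposition and recurrent architectures concrete, the sketch below shows a second-order-RNN-style update whose 3rd-order weight tensor is stored in tensor-train format rather than explicitly. This is a minimal NumPy illustration under assumed, hypothetical dimensions (`input_dim`, `hidden_dim`, TT ranks `r1`, `r2`); it is not the TTLM architecture or variants from the paper.

```python
import numpy as np

# Hypothetical sizes for illustration only (not taken from the paper).
input_dim, hidden_dim, r1, r2 = 16, 8, 4, 4
rng = np.random.default_rng(0)

# A second-order RNN computes
#   h_t[j] = tanh( sum_{i,k} W[i, j, k] * x_t[i] * h_{t-1}[k] )
# with a 3rd-order weight tensor W. In tensor-train (TT) format, W is
# never materialized; it is represented by three small cores:
#   W[i, j, k] = sum_{a,b} G1[i, a] * G2[a, j, b] * G3[b, k].
G1 = rng.normal(scale=0.1, size=(input_dim, r1))
G2 = rng.normal(scale=0.1, size=(r1, hidden_dim, r2))
G3 = rng.normal(scale=0.1, size=(r2, hidden_dim))

def tt_rnn_step(x, h):
    """One recurrent update computed directly from the TT cores."""
    a = x @ G1                               # contract input with first core: (r1,)
    b = G3 @ h                               # contract previous hidden state with last core: (r2,)
    pre = np.einsum('a,ajb,b->j', a, G2, b)  # contract with middle core: (hidden_dim,)
    return np.tanh(pre)

# Toy usage: feed a short sequence of one-hot "tokens" through the cell.
h = np.zeros(hidden_dim)
for token in [3, 7, 1]:
    x = np.zeros(input_dim)
    x[token] = 1.0
    h = tt_rnn_step(x, h)
print(h.shape)  # (hidden_dim,)
```

The point of the sketch is the parameter count: the full tensor would need input_dim * hidden_dim^2 entries, while the TT cores need only input_dim * r1 + r1 * hidden_dim * r2 + r2 * hidden_dim, which is the kind of compression that motivates TT-based recurrent models.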
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning