Abstract: Recently, structured state space sequence (S4) models have generated considerable interest due to their simplicity and favorable performance compared to the transformer architecture on certain sequence modeling tasks. A key property distinguishing these models from traditional gated RNNs is the linear dependence of the model output on the latent space vector at each time step, even when an input-dependent selection mechanism is incorporated. This means that the computation underlying inference and sequence mapping in these models involves linear time evolution of the latent space vector. Inspired by long-standing studies of the time evolution of matrix product states in quantum mechanics, we study the problem of compressing the latent space of sequence models using tensorization methods. We call such tensorized sequence models TS4. We impose various novel structures on the parameters of S4 models within the tensorization setting to propose new classes of structured sequence models.
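The linear latent dynamics described in the abstract can be illustrated with a minimal sketch (not the paper's implementation; the matrices A, B, C and their dimensions here are illustrative assumptions): the latent state evolves as x_t = A x_{t-1} + B u_t and the output y_t = C x_t is linear in the latent vector, with no gating nonlinearity applied to the state.

```python
import numpy as np

# Hypothetical minimal linear state-space recurrence in the S4 style:
#   x_t = A x_{t-1} + B u_t   (linear time evolution of the latent state)
#   y_t = C x_t               (output is linear in the latent vector)
rng = np.random.default_rng(0)
N, T = 8, 16                   # latent dimension, sequence length (illustrative)

A = 0.9 * np.eye(N)            # assumed stable state matrix
B = rng.standard_normal(N)     # input projection
C = rng.standard_normal(N)     # output projection

def run(u):
    """Apply the linear recurrence to an input sequence u."""
    x = np.zeros(N)
    ys = []
    for u_t in u:
        x = A @ x + B * u_t    # linear update: no nonlinearity on x
        ys.append(C @ x)
    return np.array(ys)

u = rng.standard_normal(T)
ys = run(u)
ys_doubled = run(2 * u)
```

Because the whole input-to-output map is linear, scaling the input scales the output by the same factor (`ys_doubled == 2 * ys` up to floating point), which is the property that enables the efficient parallel computation these models rely on.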
Paper Type: Short
Research Area: Machine Learning for NLP
Research Area Keywords: generative models, word embeddings, representation learning
Languages Studied: English
Submission Number: 6441