<div align="center">

# State space models can express n-gram languages
</div>

Official implementation of [State space models can express n-gram languages]() (TMLR 2025).

## Abstract

Recent advancements in recurrent neural networks (RNNs) have reinvigorated interest in their
application to natural language processing tasks, particularly with the development of more
efficient and parallelizable variants known as state space models (SSMs), which have shown
competitive performance against transformer models while maintaining a lower memory
footprint. While RNNs and SSMs (e.g., Mamba) have been empirically more successful than
rule-based systems based on n-gram models, a rigorous theoretical explanation for this success has not yet been developed, as it is unclear how these models encode the combinatorial rules that govern the next-word prediction task. In this paper, we construct state space language models that can solve the next-word prediction task for languages generated from n-gram rules, thereby showing that the former are more expressive. Our proof shows how SSMs can encode n-gram rules using new theoretical results on their memorization capacity, and demonstrates how their context window can be controlled by restricting the spectrum of the state transition matrix. We conduct experiments with a small dataset generated from n-gram rules to show how our framework can be applied to SSMs and RNNs obtained through gradient-based optimization.
<p float="left" align="center">
<img src="ssm_ngram_image.png" width="400" />
<br>
<b>Figure.</b> A framework for encoding n-gram rules with state space models. </p>
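
To give some intuition for the claim that the context window can be controlled through the spectrum of the state transition matrix, here is a minimal sketch of a linear recurrence `h_t = A h_{t-1} + B x_t` whose transition matrix `A` is a block shift matrix (all eigenvalues zero). It is an illustrative toy and not necessarily the construction used in the paper; the helper names `build_shift_ssm` and `run` are hypothetical and do not refer to anything in this repository.

```python
def one_hot(index, size):
    """One-hot vector of length `size` with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def build_shift_ssm(vocab_size, window):
    """Build (A, B) for h_t = A h_{t-1} + B x_t (hypothetical helper).

    The state has `window` blocks of size `vocab_size`. A moves each block
    one slot down and discards the last one; B writes the new one-hot token
    into the first block. A is strictly lower triangular, so every
    eigenvalue is zero and A is nilpotent.
    """
    d = window * vocab_size
    A = [[0.0] * d for _ in range(d)]
    for i in range(vocab_size, d):
        A[i][i - vocab_size] = 1.0
    B = [[0.0] * vocab_size for _ in range(d)]
    for i in range(vocab_size):
        B[i][i] = 1.0
    return A, B


def run(A, B, tokens, vocab_size):
    """Run the recurrence from a zero state and return the final state."""
    h = [0.0] * len(A)
    for tok in tokens:
        Ah = matvec(A, h)
        Bx = matvec(B, one_hot(tok, vocab_size))
        h = [a + b for a, b in zip(Ah, Bx)]
    return h


# For a trigram language (n = 3) the state only needs the last n - 1 = 2 tokens.
VOCAB_SIZE, WINDOW = 3, 2
A, B = build_shift_ssm(VOCAB_SIZE, WINDOW)

# Two sequences with different histories but the same last two tokens.
state_1 = run(A, B, [0, 1, 2, 1, 0], VOCAB_SIZE)
state_2 = run(A, B, [2, 2, 2, 1, 0], VOCAB_SIZE)
print(state_1 == state_2)
```

Because the spectrum of `A` is `{0}`, anything older than `WINDOW` steps is erased exactly, so the two final states coincide. A transition matrix with eigenvalues of nonzero magnitude below one would instead let older inputs fade gradually rather than vanish.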

## Requirements
- This codebase is written for `python3` and `pytorch`.

## Experiments
### Data

The synthetic data is created in `dataloader.py`. To create your own synthetic datasets from this template, modify the `paths` and `classes` variables.
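
As a rough illustration of what "data generated from n-gram rules" can look like, the sketch below samples a sequence from a small set of trigram rules. It is a standalone example under assumed rules and does not reproduce the logic of `dataloader.py`; the names `RULES`, `generate`, and `follows_rules` are hypothetical.

```python
import random

# Hypothetical trigram rules (n = 3): each context of n - 1 tokens maps to
# the tokens that are allowed to follow it. Purely illustrative.
RULES = {
    ("a", "b"): ["c"],
    ("b", "c"): ["a", "b"],
    ("c", "a"): ["b"],
    ("c", "b"): ["c"],
}


def generate(rules, start, length, rng):
    """Extend `start` by sampling a permitted next token for each context."""
    context_size = len(start)
    seq = list(start)
    while len(seq) < length:
        allowed = rules.get(tuple(seq[-context_size:]))
        if not allowed:  # no rule covers this context, so stop early
            break
        seq.append(rng.choice(allowed))
    return seq


def follows_rules(rules, seq, n):
    """Check that every length-n window of `seq` is permitted by `rules`."""
    return all(
        seq[i + n - 1] in rules.get(tuple(seq[i:i + n - 1]), [])
        for i in range(len(seq) - n + 1)
    )


rng = random.Random(0)  # fixed seed so the sample is reproducible
sample = generate(RULES, ("a", "b"), 12, rng)
print(" ".join(sample))
print(follows_rules(RULES, sample, 3))
```

Every context reachable from the start tokens has at least one rule, so generation always reaches the requested length. The next token depends only on the previous two, which is the property a next-word predictor for such a language has to capture.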

### Training


To train the state space model and evaluate its results (the leading `!` is notebook syntax; omit it when running from a terminal):

```
!python main.py
```

To generate visualizations, which are stored in the `Plots` folder:

```
!python visualize.py
```

## License and Contributing
- This README is formatted based on [paperswithcode](https://github.com/paperswithcode/releasing-research-code).
- Feel free to post issues via GitHub, or [contact us](mailto:vinoth.90@gmail.com) if you have any questions.

## Reference
If you find the code useful in your research, please consider citing our paper:

<pre>
@article{SSM-ngrams,
  title = {State Space Models Can Express n-Gram Languages},
  author = {Nandakumar, Vinoth and Qu, Qiang and Mi, Peng and Liu, Tongliang},
  journal = {Transactions on Machine Learning Research},
  year = {2025} }
</pre>
