From HiPPO to H3: Equipping State Space Models for Language

03 Feb 2023 (modified: 02 May 2023) · Submitted to Blogposts @ ICLR 2023 · Readers: Everyone
Keywords: State space models, S4, H3, HiPPO, language, sequence, long-range
Abstract: Techniques like the Structured State Space Sequence (S4) model and its variants have recently made gains in sequence modeling, especially on benchmarks for long sequences. These state space models' (SSMs') capacity for long-range memory is underpinned by a function approximation framework called High-order Polynomial Projection Operators (HiPPO). The Hungry Hungry Hippos (H3) layer, an SSM-based layer that shows promise on language tasks, builds on this same HiPPO framework. This blog post provides background (with code) on the HiPPO framework from which H3 gets its name, as well as on related state space sequence models, before walking through the H3 architecture and implementation, including the authors' new state-passing technique that enables efficient kernel computation even for very long sequences.
Blogpost Url: https://iclr-blogposts.github.io/staging/blog/2022/hippo-to-h3
ICLR Papers: https://openreview.net/forum?id=COZDy0WYGg, https://openreview.net/forum?id=uYLFoz1vlAC
ID Of The Authors Of The ICLR Paper: ~Tri_Dao1, ~Albert_Gu1
Conflict Of Interest: No