Here is the code for the model/layer/kernel associated with the paper "LEARNING TO REMEMBER, LEARN, AND FORGET IN
ATTENTION-BASED MODELS".

While the model is called Palimpsa in the paper, here it is referred to as bma_heads.

The model is built upon the FLA repo.

The files of interest for the review are:

    fla/ops/bma/chunk_bma_heads_dt.py
    fla/layers/bma_heads_dt.py
    fla/model/bma_heads/configuration_bma_heads_dt.py
    fla/model/bma_heads/modeling_bma_heads_dt.py
