[---------------------------------------------------------- Token Generation ---------------------------------------------------------]
                                                               |  MHA   |  MHA (KV)  |  FlashMHA  |  FlashMHA (KV)  |  H3    |  H3 (KV)
24 threads: ---------------------------------------------------------------------------------------------------------------------------
      [b_size: 1, d_model: 768, seqlen: 1024, n_tokens: 1024]  |   7.2  |    5.2     |     6.0    |       4.7       |  31.3  |    11.4 
      [b_size: 8, d_model: 768, seqlen: 1024, n_tokens: 1024]  |  47.6  |    5.6     |    37.6    |       5.1       |  84.4  |    12.2 

Times are in seconds (s).
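
To make the table easier to compare, the ratio between each uncached column and its "(KV)" counterpart can be computed directly from the reported times. The snippet below is a minimal sketch, not part of the original benchmark: the numbers are copied from the table above, the names `timings` and `kv_speedups` are hypothetical, and it assumes that each "(KV)" column is the cached variant of the column to its left.

```python
# Timings in seconds, copied from the table above.
# Each tuple is (time without cache, time with "(KV)" cache).
# Assumption: the "(KV)" columns are the cached variants of the same model.
timings = {
    1: {"MHA": (7.2, 5.2), "FlashMHA": (6.0, 4.7), "H3": (31.3, 11.4)},
    8: {"MHA": (47.6, 5.6), "FlashMHA": (37.6, 5.1), "H3": (84.4, 12.2)},
}


def kv_speedups(timings):
    """Return {b_size: {model: time_without_cache / time_with_cache}}."""
    return {
        b_size: {name: no_kv / kv for name, (no_kv, kv) in row.items()}
        for b_size, row in timings.items()
    }


if __name__ == "__main__":
    # Print one line per (batch size, model) pair.
    for b_size, row in kv_speedups(timings).items():
        for name, speedup in row.items():
            print(f"b_size={b_size:<2} {name:<9} {speedup:.1f}x faster with cache")
```

Ratios above 1 mean the cached variant finished sooner. For every model in the table, the ratio is larger at b_size 8 than at b_size 1.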
