factor 8.0
Unrecognized keys in `rope_scaling` for 'rope_type'='linear': {'type'}
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 14.33it/s]
Token indices sequence length is longer than the specified maximum sequence length for this model (3259 > 2048). Running this sequence through the model will result in indexing errors
Starting ...
Ready.
Quantizing 8bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.486 |
| v_proj.V         | 0.000        | -          | -         | 1.109 |
| q_proj.V         | 0.001        | -          | -         | 1.109 |
| k_proj.U         | 0.000        | -          | -         | 0.151 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.327 |
| o_proj.U         | 0.000        | -          | -         | 0.183 |
| up_proj.V        | 0.001        | -          | -         | 1.354 |
| gate_proj.V      | 0.001        | -          | -         | 1.105 |
| up_proj.U        | 0.000        | -          | -         | 0.166 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.915 |
| down_proj.U      | 0.000        | -          | -         | 0.185 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 22.022       | -          | -         | 1.371 |
| v_proj.V         | 22.261       | -          | -         | 1.111 |
| q_proj.V         | 26.238       | -          | -         | 1.111 |
| k_proj.U         | 0.006        | -          | -         | 0.158 |
| v_proj.U         | 0.005        | -          | -         | 0.010 |
| q_proj.U         | 0.025        | -          | -         | 0.010 |
| o_proj.V         | 0.166        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.191 |
| up_proj.V        | 11.186       | -          | -         | 1.357 |
| gate_proj.V      | 11.288       | -          | -         | 1.111 |
| up_proj.U        | 0.018        | -          | -         | 0.174 |
| gate_proj.U      | 0.018        | -          | -         | 0.010 |
| down_proj.V      | 0.069        | -          | -         | 4.942 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4885.191     | -          | -         | 1.437 |
| v_proj.V         | 2913.025     | -          | -         | 1.123 |
| q_proj.V         | 5444.461     | -          | -         | 1.117 |
| k_proj.U         | 0.019        | -          | -         | 0.477 |
| v_proj.U         | 0.016        | -          | -         | 0.262 |
| q_proj.U         | 0.077        | -          | -         | 0.263 |
| o_proj.V         | 19.629       | -          | -         | 1.406 |
| o_proj.U         | 0.001        | -          | -         | 0.506 |
| up_proj.V        | 2183.230     | -          | -         | 1.430 |
| gate_proj.V      | 2177.325     | -          | -         | 1.120 |
| up_proj.U        | 0.363        | -          | -         | 0.632 |
| gate_proj.U      | 0.365        | -          | -         | 0.396 |
| down_proj.V      | 9.353        | -          | -         | 5.033 |
| down_proj.U      | 0.003        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.000        | -          | -         | 1.376 |
| v_proj.V         | 0.000        | -          | -         | 1.107 |
| q_proj.V         | 0.000        | -          | -         | 1.109 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.330 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.001        | -          | -         | 1.364 |
| gate_proj.V      | 0.001        | -          | -         | 1.110 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.945 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11.595       | -          | -         | 1.375 |
| v_proj.V         | 14.048       | -          | -         | 1.113 |
| q_proj.V         | 12.068       | -          | -         | 1.113 |
| k_proj.U         | 0.011        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.055        | -          | -         | 0.010 |
| o_proj.V         | 0.121        | -          | -         | 1.335 |
| o_proj.U         | 0.001        | -          | -         | 0.193 |
| up_proj.V        | 16.969       | -          | -         | 1.361 |
| gate_proj.V      | 17.019       | -          | -         | 1.114 |
| up_proj.U        | 0.029        | -          | -         | 0.175 |
| gate_proj.U      | 0.030        | -          | -         | 0.010 |
| down_proj.V      | 2.251        | -          | -         | 4.948 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3198.734     | -          | -         | 1.443 |
| v_proj.V         | 2192.922     | -          | -         | 1.122 |
| q_proj.V         | 2861.314     | -          | -         | 1.119 |
| k_proj.U         | 0.020        | -          | -         | 0.474 |
| v_proj.U         | 0.029        | -          | -         | 0.262 |
| q_proj.U         | 0.077        | -          | -         | 0.263 |
| o_proj.V         | 12.177       | -          | -         | 1.406 |
| o_proj.U         | 0.001        | -          | -         | 0.508 |
| up_proj.V        | 4636.287     | -          | -         | 1.434 |
| gate_proj.V      | 4618.969     | -          | -         | 1.118 |
| up_proj.U        | 0.742        | -          | -         | 0.632 |
| gate_proj.U      | 0.747        | -          | -         | 0.399 |
| down_proj.V      | 98.963       | -          | -         | 5.059 |
| down_proj.U      | 0.003        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.378 |
| v_proj.V         | 0.002        | -          | -         | 1.110 |
| q_proj.V         | 0.003        | -          | -         | 1.112 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.336 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.362 |
| gate_proj.V      | 0.001        | -          | -         | 1.107 |
| up_proj.U        | 0.000        | -          | -         | 0.170 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.953 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 51.409       | -          | -         | 1.377 |
| v_proj.V         | 53.998       | -          | -         | 1.113 |
| q_proj.V         | 52.612       | -          | -         | 1.114 |
| k_proj.U         | 0.027        | -          | -         | 0.159 |
| v_proj.U         | 0.017        | -          | -         | 0.010 |
| q_proj.U         | 0.084        | -          | -         | 0.010 |
| o_proj.V         | 0.992        | -          | -         | 1.336 |
| o_proj.U         | 0.004        | -          | -         | 0.193 |
| up_proj.V        | 13.178       | -          | -         | 1.361 |
| gate_proj.V      | 13.418       | -          | -         | 1.110 |
| up_proj.U        | 0.025        | -          | -         | 0.176 |
| gate_proj.U      | 0.024        | -          | -         | 0.010 |
| down_proj.V      | 0.051        | -          | -         | 4.950 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 7102.132     | -          | -         | 1.439 |
| v_proj.V         | 6876.722     | -          | -         | 1.118 |
| q_proj.V         | 6896.459     | -          | -         | 1.125 |
| k_proj.U         | 0.101        | -          | -         | 0.475 |
| v_proj.U         | 0.096        | -          | -         | 0.263 |
| q_proj.U         | 0.378        | -          | -         | 0.263 |
| o_proj.V         | 189.638      | -          | -         | 1.405 |
| o_proj.U         | 0.020        | -          | -         | 0.510 |
| up_proj.V        | 2474.925     | -          | -         | 1.428 |
| gate_proj.V      | 2467.291     | -          | -         | 1.123 |
| up_proj.U        | 0.402        | -          | -         | 0.631 |
| gate_proj.U      | 0.407        | -          | -         | 0.398 |
| down_proj.V      | 8.874        | -          | -         | 5.053 |
| down_proj.U      | 0.002        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.002        | -          | -         | 1.373 |
| v_proj.V         | 0.001        | -          | -         | 1.104 |
| q_proj.V         | 0.001        | -          | -         | 1.108 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.331 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.360 |
| gate_proj.V      | 0.000        | -          | -         | 1.107 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.933 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 25.250       | -          | -         | 1.374 |
| v_proj.V         | 26.696       | -          | -         | 1.110 |
| q_proj.V         | 25.603       | -          | -         | 1.112 |
| k_proj.U         | 0.022        | -          | -         | 0.159 |
| v_proj.U         | 0.011        | -          | -         | 0.010 |
| q_proj.U         | 0.040        | -          | -         | 0.010 |
| o_proj.V         | 1.198        | -          | -         | 1.335 |
| o_proj.U         | 0.005        | -          | -         | 0.193 |
| up_proj.V        | 8.790        | -          | -         | 1.359 |
| gate_proj.V      | 8.768        | -          | -         | 1.110 |
| up_proj.U        | 0.026        | -          | -         | 0.175 |
| gate_proj.U      | 0.024        | -          | -         | 0.010 |
| down_proj.V      | 0.041        | -          | -         | 4.935 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3729.053     | -          | -         | 1.441 |
| v_proj.V         | 3799.447     | -          | -         | 1.119 |
| q_proj.V         | 3733.687     | -          | -         | 1.124 |
| k_proj.U         | 0.057        | -          | -         | 0.474 |
| v_proj.U         | 0.060        | -          | -         | 0.263 |
| q_proj.U         | 0.211        | -          | -         | 0.262 |
| o_proj.V         | 225.402      | -          | -         | 1.407 |
| o_proj.U         | 0.031        | -          | -         | 0.508 |
| up_proj.V        | 1827.398     | -          | -         | 1.433 |
| gate_proj.V      | 1819.894     | -          | -         | 1.118 |
| up_proj.U        | 0.308        | -          | -         | 0.629 |
| gate_proj.U      | 0.310        | -          | -         | 0.398 |
| down_proj.V      | 9.363        | -          | -         | 5.048 |
| down_proj.U      | 0.002        | -          | -         | 0.647 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.373 |
| v_proj.V         | 0.000        | -          | -         | 1.109 |
| q_proj.V         | 0.001        | -          | -         | 1.112 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.361 |
| gate_proj.V      | 0.000        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.926 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.434       | -          | -         | 1.373 |
| v_proj.V         | 14.017       | -          | -         | 1.110 |
| q_proj.V         | 13.358       | -          | -         | 1.112 |
| k_proj.U         | 0.010        | -          | -         | 0.160 |
| v_proj.U         | 0.006        | -          | -         | 0.010 |
| q_proj.U         | 0.024        | -          | -         | 0.010 |
| o_proj.V         | 1.127        | -          | -         | 1.333 |
| o_proj.U         | 0.008        | -          | -         | 0.193 |
| up_proj.V        | 6.754        | -          | -         | 1.361 |
| gate_proj.V      | 6.849        | -          | -         | 1.113 |
| up_proj.U        | 0.025        | -          | -         | 0.175 |
| gate_proj.U      | 0.025        | -          | -         | 0.010 |
| down_proj.V      | 0.053        | -          | -         | 4.943 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2148.458     | -          | -         | 1.442 |
| v_proj.V         | 2164.447     | -          | -         | 1.122 |
| q_proj.V         | 2147.286     | -          | -         | 1.118 |
| k_proj.U         | 0.032        | -          | -         | 0.474 |
| v_proj.U         | 0.031        | -          | -         | 0.262 |
| q_proj.U         | 0.108        | -          | -         | 0.264 |
| o_proj.V         | 220.417      | -          | -         | 1.407 |
| o_proj.U         | 0.027        | -          | -         | 0.508 |
| up_proj.V        | 1604.260     | -          | -         | 1.433 |
| gate_proj.V      | 1595.754     | -          | -         | 1.121 |
| up_proj.U        | 0.260        | -          | -         | 0.632 |
| gate_proj.U      | 0.271        | -          | -         | 0.398 |
| down_proj.V      | 12.210       | -          | -         | 5.052 |
| down_proj.U      | 0.003        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.375 |
| v_proj.V         | 0.000        | -          | -         | 1.111 |
| q_proj.V         | 0.001        | -          | -         | 1.111 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.359 |
| gate_proj.V      | 0.000        | -          | -         | 1.110 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.931 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 14.215       | -          | -         | 1.375 |
| v_proj.V         | 14.770       | -          | -         | 1.111 |
| q_proj.V         | 14.156       | -          | -         | 1.111 |
| k_proj.U         | 0.011        | -          | -         | 0.159 |
| v_proj.U         | 0.008        | -          | -         | 0.010 |
| q_proj.U         | 0.027        | -          | -         | 0.010 |
| o_proj.V         | 1.013        | -          | -         | 1.342 |
| o_proj.U         | 0.009        | -          | -         | 0.192 |
| up_proj.V        | 6.428        | -          | -         | 1.362 |
| gate_proj.V      | 6.394        | -          | -         | 1.111 |
| up_proj.U        | 0.031        | -          | -         | 0.176 |
| gate_proj.U      | 0.029        | -          | -         | 0.010 |
| down_proj.V      | 0.066        | -          | -         | 4.943 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2356.418     | -          | -         | 1.441 |
| v_proj.V         | 2387.234     | -          | -         | 1.118 |
| q_proj.V         | 2354.081     | -          | -         | 1.118 |
| k_proj.U         | 0.039        | -          | -         | 0.474 |
| v_proj.U         | 0.033        | -          | -         | 0.264 |
| q_proj.U         | 0.140        | -          | -         | 0.265 |
| o_proj.V         | 190.506      | -          | -         | 1.406 |
| o_proj.U         | 0.031        | -          | -         | 0.510 |
| up_proj.V        | 1562.197     | -          | -         | 1.425 |
| gate_proj.V      | 1552.081     | -          | -         | 1.120 |
| up_proj.U        | 0.257        | -          | -         | 0.631 |
| gate_proj.U      | 0.265        | -          | -         | 0.398 |
| down_proj.V      | 15.329       | -          | -         | 5.045 |
| down_proj.U      | 0.004        | -          | -         | 0.647 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.369 |
| v_proj.V         | 0.000        | -          | -         | 1.106 |
| q_proj.V         | 0.000        | -          | -         | 1.111 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.333 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.360 |
| gate_proj.V      | 0.000        | -          | -         | 1.109 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.938 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11.615       | -          | -         | 1.374 |
| v_proj.V         | 12.091       | -          | -         | 1.109 |
| q_proj.V         | 11.589       | -          | -         | 1.113 |
| k_proj.U         | 0.015        | -          | -         | 0.159 |
| v_proj.U         | 0.007        | -          | -         | 0.010 |
| q_proj.U         | 0.028        | -          | -         | 0.010 |
| o_proj.V         | 1.171        | -          | -         | 1.334 |
| o_proj.U         | 0.013        | -          | -         | 0.193 |
| up_proj.V        | 5.914        | -          | -         | 1.362 |
| gate_proj.V      | 5.848        | -          | -         | 1.112 |
| up_proj.U        | 0.038        | -          | -         | 0.176 |
| gate_proj.U      | 0.038        | -          | -         | 0.010 |
| down_proj.V      | 0.067        | -          | -         | 4.939 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 1964.518     | -          | -         | 1.441 |
| v_proj.V         | 2006.356     | -          | -         | 1.120 |
| q_proj.V         | 1981.418     | -          | -         | 1.121 |
| k_proj.U         | 0.035        | -          | -         | 0.475 |
| v_proj.U         | 0.030        | -          | -         | 0.264 |
| q_proj.U         | 0.122        | -          | -         | 0.262 |
| o_proj.V         | 227.371      | -          | -         | 1.411 |
| o_proj.U         | 0.038        | -          | -         | 0.508 |
| up_proj.V        | 1503.967     | -          | -         | 1.431 |
| gate_proj.V      | 1491.987     | -          | -         | 1.118 |
| up_proj.U        | 0.242        | -          | -         | 0.629 |
| gate_proj.U      | 0.249        | -          | -         | 0.400 |
| down_proj.V      | 15.433       | -          | -         | 5.052 |
| down_proj.U      | 0.005        | -          | -         | 0.647 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.000        | -          | -         | 1.375 |
| v_proj.V         | 0.000        | -          | -         | 1.110 |
| q_proj.V         | 0.000        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.359 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.945 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11.090       | -          | -         | 1.373 |
| v_proj.V         | 11.549       | -          | -         | 1.113 |
| q_proj.V         | 11.106       | -          | -         | 1.113 |
| k_proj.U         | 0.011        | -          | -         | 0.159 |
| v_proj.U         | 0.006        | -          | -         | 0.010 |
| q_proj.U         | 0.026        | -          | -         | 0.010 |
| o_proj.V         | 1.097        | -          | -         | 1.336 |
| o_proj.U         | 0.019        | -          | -         | 0.193 |
| up_proj.V        | 5.343        | -          | -         | 1.363 |
| gate_proj.V      | 5.335        | -          | -         | 1.114 |
| up_proj.U        | 0.033        | -          | -         | 0.175 |
| gate_proj.U      | 0.033        | -          | -         | 0.010 |
| down_proj.V      | 0.064        | -          | -         | 4.949 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 1874.150     | -          | -         | 1.441 |
| v_proj.V         | 1918.358     | -          | -         | 1.125 |
| q_proj.V         | 1889.610     | -          | -         | 1.119 |
| k_proj.U         | 0.034        | -          | -         | 0.475 |
| v_proj.U         | 0.027        | -          | -         | 0.263 |
| q_proj.U         | 0.112        | -          | -         | 0.264 |
| o_proj.V         | 204.268      | -          | -         | 1.406 |
| o_proj.U         | 0.037        | -          | -         | 0.509 |
| up_proj.V        | 1355.400     | -          | -         | 1.427 |
| gate_proj.V      | 1345.483     | -          | -         | 1.117 |
| up_proj.U        | 0.223        | -          | -         | 0.631 |
| gate_proj.U      | 0.233        | -          | -         | 0.399 |
| down_proj.V      | 14.508       | -          | -         | 5.049 |
| down_proj.U      | 0.005        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.378 |
| v_proj.V         | 0.000        | -          | -         | 1.110 |
| q_proj.V         | 0.000        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.334 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.358 |
| gate_proj.V      | 0.000        | -          | -         | 1.109 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.946 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 12.129       | -          | -         | 1.377 |
| v_proj.V         | 12.452       | -          | -         | 1.115 |
| q_proj.V         | 12.092       | -          | -         | 1.115 |
| k_proj.U         | 0.011        | -          | -         | 0.159 |
| v_proj.U         | 0.008        | -          | -         | 0.010 |
| q_proj.U         | 0.033        | -          | -         | 0.010 |
| o_proj.V         | 1.189        | -          | -         | 1.335 |
| o_proj.U         | 0.018        | -          | -         | 0.192 |
| up_proj.V        | 4.991        | -          | -         | 1.359 |
| gate_proj.V      | 5.029        | -          | -         | 1.111 |
| up_proj.U        | 0.030        | -          | -         | 0.176 |
| gate_proj.U      | 0.029        | -          | -         | 0.010 |
| down_proj.V      | 0.052        | -          | -         | 4.944 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2075.198     | -          | -         | 1.440 |
| v_proj.V         | 2107.222     | -          | -         | 1.115 |
| q_proj.V         | 2084.666     | -          | -         | 1.116 |
| k_proj.U         | 0.037        | -          | -         | 0.475 |
| v_proj.U         | 0.030        | -          | -         | 0.264 |
| q_proj.U         | 0.132        | -          | -         | 0.264 |
| o_proj.V         | 241.626      | -          | -         | 1.408 |
| o_proj.U         | 0.042        | -          | -         | 0.510 |
| up_proj.V        | 1261.548     | -          | -         | 1.429 |
| gate_proj.V      | 1255.937     | -          | -         | 1.121 |
| up_proj.U        | 0.198        | -          | -         | 0.631 |
| gate_proj.U      | 0.208        | -          | -         | 0.399 |
| down_proj.V      | 11.731       | -          | -         | 5.042 |
| down_proj.U      | 0.004        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.374 |
| v_proj.V         | 0.000        | -          | -         | 1.109 |
| q_proj.V         | 0.000        | -          | -         | 1.112 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.361 |
| gate_proj.V      | 0.000        | -          | -         | 1.108 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.933 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11.751       | -          | -         | 1.372 |
| v_proj.V         | 11.990       | -          | -         | 1.110 |
| q_proj.V         | 11.693       | -          | -         | 1.112 |
| k_proj.U         | 0.012        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.028        | -          | -         | 0.010 |
| o_proj.V         | 1.206        | -          | -         | 1.333 |
| o_proj.U         | 0.018        | -          | -         | 0.193 |
| up_proj.V        | 4.666        | -          | -         | 1.358 |
| gate_proj.V      | 4.654        | -          | -         | 1.111 |
| up_proj.U        | 0.030        | -          | -         | 0.175 |
| gate_proj.U      | 0.030        | -          | -         | 0.010 |
| down_proj.V      | 0.052        | -          | -         | 4.931 |
| down_proj.U      | 0.001        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2004.092     | -          | -         | 1.444 |
| v_proj.V         | 2038.089     | -          | -         | 1.121 |
| q_proj.V         | 2012.924     | -          | -         | 1.122 |
| k_proj.U         | 0.036        | -          | -         | 0.475 |
| v_proj.U         | 0.027        | -          | -         | 0.264 |
| q_proj.U         | 0.116        | -          | -         | 0.264 |
| o_proj.V         | 291.785      | -          | -         | 1.409 |
| o_proj.U         | 0.049        | -          | -         | 0.508 |
| up_proj.V        | 1182.637     | -          | -         | 1.433 |
| gate_proj.V      | 1174.229     | -          | -         | 1.122 |
| up_proj.U        | 0.174        | -          | -         | 0.630 |
| gate_proj.U      | 0.183        | -          | -         | 0.398 |
| down_proj.V      | 12.250       | -          | -         | 5.048 |
| down_proj.U      | 0.004        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.375 |
| v_proj.V         | 0.000        | -          | -         | 1.109 |
| q_proj.V         | 0.000        | -          | -         | 1.111 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.334 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.359 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.939 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.441       | -          | -         | 1.378 |
| v_proj.V         | 13.891       | -          | -         | 1.113 |
| q_proj.V         | 13.495       | -          | -         | 1.116 |
| k_proj.U         | 0.012        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.033        | -          | -         | 0.010 |
| o_proj.V         | 1.270        | -          | -         | 1.340 |
| o_proj.U         | 0.017        | -          | -         | 0.193 |
| up_proj.V        | 4.484        | -          | -         | 1.362 |
| gate_proj.V      | 4.472        | -          | -         | 1.117 |
| up_proj.U        | 0.031        | -          | -         | 0.175 |
| gate_proj.U      | 0.029        | -          | -         | 0.010 |
| down_proj.V      | 0.047        | -          | -         | 4.953 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2324.168     | -          | -         | 1.445 |
| v_proj.V         | 2355.489     | -          | -         | 1.123 |
| q_proj.V         | 2333.230     | -          | -         | 1.119 |
| k_proj.U         | 0.044        | -          | -         | 0.475 |
| v_proj.U         | 0.028        | -          | -         | 0.264 |
| q_proj.U         | 0.133        | -          | -         | 0.264 |
| o_proj.V         | 232.978      | -          | -         | 1.407 |
| o_proj.U         | 0.041        | -          | -         | 0.508 |
| up_proj.V        | 1161.507     | -          | -         | 1.430 |
| gate_proj.V      | 1155.435     | -          | -         | 1.119 |
| up_proj.U        | 0.171        | -          | -         | 0.632 |
| gate_proj.U      | 0.182        | -          | -         | 0.400 |
| down_proj.V      | 10.951       | -          | -         | 5.055 |
| down_proj.U      | 0.004        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.375 |
| v_proj.V         | 0.000        | -          | -         | 1.113 |
| q_proj.V         | 0.000        | -          | -         | 1.111 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.360 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.938 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 12.878       | -          | -         | 1.380 |
| v_proj.V         | 13.660       | -          | -         | 1.117 |
| q_proj.V         | 13.198       | -          | -         | 1.115 |
| k_proj.U         | 0.008        | -          | -         | 0.159 |
| v_proj.U         | 0.007        | -          | -         | 0.010 |
| q_proj.U         | 0.026        | -          | -         | 0.010 |
| o_proj.V         | 1.082        | -          | -         | 1.337 |
| o_proj.U         | 0.020        | -          | -         | 0.193 |
| up_proj.V        | 4.540        | -          | -         | 1.363 |
| gate_proj.V      | 4.457        | -          | -         | 1.117 |
| up_proj.U        | 0.029        | -          | -         | 0.176 |
| gate_proj.U      | 0.030        | -          | -         | 0.010 |
| down_proj.V      | 0.049        | -          | -         | 4.948 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2179.930     | -          | -         | 1.437 |
| v_proj.V         | 2227.468     | -          | -         | 1.117 |
| q_proj.V         | 2193.603     | -          | -         | 1.119 |
| k_proj.U         | 0.038        | -          | -         | 0.474 |
| v_proj.U         | 0.026        | -          | -         | 0.264 |
| q_proj.U         | 0.122        | -          | -         | 0.265 |
| o_proj.V         | 215.392      | -          | -         | 1.408 |
| o_proj.U         | 0.047        | -          | -         | 0.510 |
| up_proj.V        | 1163.243     | -          | -         | 1.432 |
| gate_proj.V      | 1150.977     | -          | -         | 1.123 |
| up_proj.U        | 0.161        | -          | -         | 0.631 |
| gate_proj.U      | 0.171        | -          | -         | 0.398 |
| down_proj.V      | 10.958       | -          | -         | 5.036 |
| down_proj.U      | 0.003        | -          | -         | 0.651 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.000        | -          | -         | 1.374 |
| v_proj.V         | 0.000        | -          | -         | 1.108 |
| q_proj.V         | 0.000        | -          | -         | 1.111 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.362 |
| gate_proj.V      | 0.000        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.943 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 10.526       | -          | -         | 1.377 |
| v_proj.V         | 11.266       | -          | -         | 1.112 |
| q_proj.V         | 10.744       | -          | -         | 1.115 |
| k_proj.U         | 0.012        | -          | -         | 0.159 |
| v_proj.U         | 0.005        | -          | -         | 0.010 |
| q_proj.U         | 0.024        | -          | -         | 0.010 |
| o_proj.V         | 1.267        | -          | -         | 1.338 |
| o_proj.U         | 0.025        | -          | -         | 0.193 |
| up_proj.V        | 4.369        | -          | -         | 1.360 |
| gate_proj.V      | 4.282        | -          | -         | 1.115 |
| up_proj.U        | 0.031        | -          | -         | 0.176 |
| gate_proj.U      | 0.031        | -          | -         | 0.010 |
| down_proj.V      | 0.053        | -          | -         | 4.942 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 1820.224     | -          | -         | 1.441 |
| v_proj.V         | 1870.093     | -          | -         | 1.120 |
| q_proj.V         | 1845.285     | -          | -         | 1.121 |
| k_proj.U         | 0.030        | -          | -         | 0.475 |
| v_proj.U         | 0.023        | -          | -         | 0.264 |
| q_proj.U         | 0.100        | -          | -         | 0.263 |
| o_proj.V         | 299.207      | -          | -         | 1.412 |
| o_proj.U         | 0.070        | -          | -         | 0.511 |
| up_proj.V        | 1122.739     | -          | -         | 1.437 |
| gate_proj.V      | 1113.931     | -          | -         | 1.120 |
| up_proj.U        | 0.181        | -          | -         | 0.631 |
| gate_proj.U      | 0.192        | -          | -         | 0.400 |
| down_proj.V      | 11.769       | -          | -         | 5.058 |
| down_proj.U      | 0.003        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.377 |
| v_proj.V         | 0.000        | -          | -         | 1.110 |
| q_proj.V         | 0.001        | -          | -         | 1.112 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.331 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.360 |
| gate_proj.V      | 0.000        | -          | -         | 1.114 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.943 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.590       | -          | -         | 1.376 |
| v_proj.V         | 14.517       | -          | -         | 1.111 |
| q_proj.V         | 13.653       | -          | -         | 1.113 |
| k_proj.U         | 0.010        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.040        | -          | -         | 0.010 |
| o_proj.V         | 1.474        | -          | -         | 1.337 |
| o_proj.U         | 0.032        | -          | -         | 0.193 |
| up_proj.V        | 4.365        | -          | -         | 1.361 |
| gate_proj.V      | 4.395        | -          | -         | 1.114 |
| up_proj.U        | 0.032        | -          | -         | 0.176 |
| gate_proj.U      | 0.034        | -          | -         | 0.010 |
| down_proj.V      | 0.057        | -          | -         | 4.947 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2329.088     | -          | -         | 1.445 |
| v_proj.V         | 2378.341     | -          | -         | 1.124 |
| q_proj.V         | 2354.843     | -          | -         | 1.124 |
| k_proj.U         | 0.050        | -          | -         | 0.475 |
| v_proj.U         | 0.035        | -          | -         | 0.264 |
| q_proj.U         | 0.165        | -          | -         | 0.264 |
| o_proj.V         | 317.403      | -          | -         | 1.410 |
| o_proj.U         | 0.068        | -          | -         | 0.510 |
| up_proj.V        | 1141.204     | -          | -         | 1.436 |
| gate_proj.V      | 1132.167     | -          | -         | 1.124 |
| up_proj.U        | 0.202        | -          | -         | 0.632 |
| gate_proj.U      | 0.217        | -          | -         | 0.400 |
| down_proj.V      | 12.795       | -          | -         | 5.056 |
| down_proj.U      | 0.004        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.380 |
| v_proj.V         | 0.000        | -          | -         | 1.116 |
| q_proj.V         | 0.001        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.360 |
| gate_proj.V      | 0.000        | -          | -         | 1.110 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.943 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 14.442       | -          | -         | 1.376 |
| v_proj.V         | 15.226       | -          | -         | 1.116 |
| q_proj.V         | 14.800       | -          | -         | 1.117 |
| k_proj.U         | 0.014        | -          | -         | 0.159 |
| v_proj.U         | 0.008        | -          | -         | 0.010 |
| q_proj.U         | 0.043        | -          | -         | 0.010 |
| o_proj.V         | 1.345        | -          | -         | 1.337 |
| o_proj.U         | 0.023        | -          | -         | 0.193 |
| up_proj.V        | 4.921        | -          | -         | 1.363 |
| gate_proj.V      | 4.924        | -          | -         | 1.116 |
| up_proj.U        | 0.041        | -          | -         | 0.176 |
| gate_proj.U      | 0.041        | -          | -         | 0.010 |
| down_proj.V      | 0.082        | -          | -         | 4.944 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2502.489     | -          | -         | 1.439 |
| v_proj.V         | 2562.024     | -          | -         | 1.118 |
| q_proj.V         | 2530.298     | -          | -         | 1.118 |
| k_proj.U         | 0.048        | -          | -         | 0.475 |
| v_proj.U         | 0.035        | -          | -         | 0.264 |
| q_proj.U         | 0.150        | -          | -         | 0.265 |
| o_proj.V         | 273.739      | -          | -         | 1.408 |
| o_proj.U         | 0.060        | -          | -         | 0.511 |
| up_proj.V        | 1323.417     | -          | -         | 1.433 |
| gate_proj.V      | 1314.378     | -          | -         | 1.122 |
| up_proj.U        | 0.243        | -          | -         | 0.631 |
| gate_proj.U      | 0.264        | -          | -         | 0.400 |
| down_proj.V      | 19.422       | -          | -         | 5.045 |
| down_proj.U      | 0.006        | -          | -         | 0.650 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.375 |
| v_proj.V         | 0.000        | -          | -         | 1.110 |
| q_proj.V         | 0.000        | -          | -         | 1.116 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.362 |
| gate_proj.V      | 0.000        | -          | -         | 1.109 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.937 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.373       | -          | -         | 1.379 |
| v_proj.V         | 14.007       | -          | -         | 1.112 |
| q_proj.V         | 13.535       | -          | -         | 1.114 |
| k_proj.U         | 0.013        | -          | -         | 0.159 |
| v_proj.U         | 0.007        | -          | -         | 0.010 |
| q_proj.U         | 0.038        | -          | -         | 0.010 |
| o_proj.V         | 1.260        | -          | -         | 1.339 |
| o_proj.U         | 0.012        | -          | -         | 0.193 |
| up_proj.V        | 5.359        | -          | -         | 1.359 |
| gate_proj.V      | 5.294        | -          | -         | 1.115 |
| up_proj.U        | 0.042        | -          | -         | 0.176 |
| gate_proj.U      | 0.043        | -          | -         | 0.010 |
| down_proj.V      | 0.076        | -          | -         | 4.946 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2288.207     | -          | -         | 1.445 |
| v_proj.V         | 2355.771     | -          | -         | 1.122 |
| q_proj.V         | 2320.146     | -          | -         | 1.126 |
| k_proj.U         | 0.051        | -          | -         | 0.475 |
| v_proj.U         | 0.038        | -          | -         | 0.265 |
| q_proj.U         | 0.161        | -          | -         | 0.264 |
| o_proj.V         | 294.611      | -          | -         | 1.414 |
| o_proj.U         | 0.062        | -          | -         | 0.510 |
| up_proj.V        | 1433.704     | -          | -         | 1.433 |
| gate_proj.V      | 1423.358     | -          | -         | 1.121 |
| up_proj.U        | 0.273        | -          | -         | 0.631 |
| gate_proj.U      | 0.290        | -          | -         | 0.400 |
| down_proj.V      | 17.944       | -          | -         | 5.067 |
| down_proj.U      | 0.006        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.377 |
| v_proj.V         | 0.000        | -          | -         | 1.109 |
| q_proj.V         | 0.001        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.336 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.361 |
| gate_proj.V      | 0.000        | -          | -         | 1.112 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.950 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 14.253       | -          | -         | 1.376 |
| v_proj.V         | 15.203       | -          | -         | 1.115 |
| q_proj.V         | 14.226       | -          | -         | 1.117 |
| k_proj.U         | 0.015        | -          | -         | 0.159 |
| v_proj.U         | 0.007        | -          | -         | 0.010 |
| q_proj.U         | 0.049        | -          | -         | 0.010 |
| o_proj.V         | 1.692        | -          | -         | 1.340 |
| o_proj.U         | 0.028        | -          | -         | 0.193 |
| up_proj.V        | 5.531        | -          | -         | 1.362 |
| gate_proj.V      | 5.583        | -          | -         | 1.118 |
| up_proj.U        | 0.043        | -          | -         | 0.176 |
| gate_proj.U      | 0.046        | -          | -         | 0.010 |
| down_proj.V      | 0.074        | -          | -         | 4.948 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2380.486     | -          | -         | 1.445 |
| v_proj.V         | 2441.188     | -          | -         | 1.123 |
| q_proj.V         | 2397.876     | -          | -         | 1.125 |
| k_proj.U         | 0.051        | -          | -         | 0.476 |
| v_proj.U         | 0.038        | -          | -         | 0.264 |
| q_proj.U         | 0.143        | -          | -         | 0.265 |
| o_proj.V         | 374.623      | -          | -         | 1.408 |
| o_proj.U         | 0.082        | -          | -         | 0.510 |
| up_proj.V        | 1461.615     | -          | -         | 1.437 |
| gate_proj.V      | 1448.761     | -          | -         | 1.121 |
| up_proj.U        | 0.267        | -          | -         | 0.633 |
| gate_proj.U      | 0.282        | -          | -         | 0.399 |
| down_proj.V      | 17.089       | -          | -         | 5.062 |
| down_proj.U      | 0.006        | -          | -         | 0.651 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.379 |
| v_proj.V         | 0.000        | -          | -         | 1.115 |
| q_proj.V         | 0.000        | -          | -         | 1.115 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.337 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.362 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.936 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.019       | -          | -         | 1.379 |
| v_proj.V         | 13.838       | -          | -         | 1.116 |
| q_proj.V         | 13.225       | -          | -         | 1.116 |
| k_proj.U         | 0.012        | -          | -         | 0.159 |
| v_proj.U         | 0.007        | -          | -         | 0.010 |
| q_proj.U         | 0.039        | -          | -         | 0.010 |
| o_proj.V         | 1.370        | -          | -         | 1.337 |
| o_proj.U         | 0.016        | -          | -         | 0.193 |
| up_proj.V        | 5.589        | -          | -         | 1.363 |
| gate_proj.V      | 5.550        | -          | -         | 1.114 |
| up_proj.U        | 0.048        | -          | -         | 0.176 |
| gate_proj.U      | 0.054        | -          | -         | 0.010 |
| down_proj.V      | 0.072        | -          | -         | 4.945 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2231.999     | -          | -         | 1.445 |
| v_proj.V         | 2294.228     | -          | -         | 1.120 |
| q_proj.V         | 2258.908     | -          | -         | 1.121 |
| k_proj.U         | 0.046        | -          | -         | 0.476 |
| v_proj.U         | 0.035        | -          | -         | 0.266 |
| q_proj.U         | 0.145        | -          | -         | 0.267 |
| o_proj.V         | 291.554      | -          | -         | 1.411 |
| o_proj.U         | 0.052        | -          | -         | 0.511 |
| up_proj.V        | 1494.507     | -          | -         | 1.429 |
| gate_proj.V      | 1477.979     | -          | -         | 1.119 |
| up_proj.U        | 0.268        | -          | -         | 0.632 |
| gate_proj.U      | 0.279        | -          | -         | 0.400 |
| down_proj.V      | 17.016       | -          | -         | 5.043 |
| down_proj.U      | 0.006        | -          | -         | 0.651 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.376 |
| v_proj.V         | 0.000        | -          | -         | 1.108 |
| q_proj.V         | 0.000        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.333 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.362 |
| gate_proj.V      | 0.000        | -          | -         | 1.114 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.944 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.854       | -          | -         | 1.378 |
| v_proj.V         | 14.520       | -          | -         | 1.112 |
| q_proj.V         | 13.722       | -          | -         | 1.115 |
| k_proj.U         | 0.012        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.040        | -          | -         | 0.010 |
| o_proj.V         | 1.490        | -          | -         | 1.338 |
| o_proj.U         | 0.014        | -          | -         | 0.193 |
| up_proj.V        | 5.955        | -          | -         | 1.364 |
| gate_proj.V      | 5.908        | -          | -         | 1.114 |
| up_proj.U        | 0.058        | -          | -         | 0.175 |
| gate_proj.U      | 0.059        | -          | -         | 0.010 |
| down_proj.V      | 0.080        | -          | -         | 4.944 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2244.368     | -          | -         | 1.445 |
| v_proj.V         | 2322.448     | -          | -         | 1.125 |
| q_proj.V         | 2256.393     | -          | -         | 1.125 |
| k_proj.U         | 0.049        | -          | -         | 0.476 |
| v_proj.U         | 0.034        | -          | -         | 0.265 |
| q_proj.U         | 0.151        | -          | -         | 0.265 |
| o_proj.V         | 289.717      | -          | -         | 1.417 |
| o_proj.U         | 0.052        | -          | -         | 0.511 |
| up_proj.V        | 1551.481     | -          | -         | 1.437 |
| gate_proj.V      | 1539.132     | -          | -         | 1.123 |
| up_proj.U        | 0.274        | -          | -         | 0.631 |
| gate_proj.U      | 0.286        | -          | -         | 0.400 |
| down_proj.V      | 20.551       | -          | -         | 5.053 |
| down_proj.U      | 0.007        | -          | -         | 0.650 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.379 |
| v_proj.V         | 0.000        | -          | -         | 1.111 |
| q_proj.V         | 0.000        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.364 |
| gate_proj.V      | 0.000        | -          | -         | 1.116 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.943 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13.323       | -          | -         | 1.376 |
| v_proj.V         | 13.881       | -          | -         | 1.115 |
| q_proj.V         | 13.473       | -          | -         | 1.115 |
| k_proj.U         | 0.012        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.046        | -          | -         | 0.010 |
| o_proj.V         | 2.042        | -          | -         | 1.339 |
| o_proj.U         | 0.014        | -          | -         | 0.193 |
| up_proj.V        | 6.229        | -          | -         | 1.364 |
| gate_proj.V      | 6.080        | -          | -         | 1.117 |
| up_proj.U        | 0.050        | -          | -         | 0.176 |
| gate_proj.U      | 0.053        | -          | -         | 0.010 |
| down_proj.V      | 0.114        | -          | -         | 4.950 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2289.964     | -          | -         | 1.446 |
| v_proj.V         | 2348.219     | -          | -         | 1.123 |
| q_proj.V         | 2310.264     | -          | -         | 1.123 |
| k_proj.U         | 0.049        | -          | -         | 0.476 |
| v_proj.U         | 0.034        | -          | -         | 0.264 |
| q_proj.U         | 0.145        | -          | -         | 0.264 |
| o_proj.V         | 363.340      | -          | -         | 1.408 |
| o_proj.U         | 0.053        | -          | -         | 0.508 |
| up_proj.V        | 1719.247     | -          | -         | 1.436 |
| gate_proj.V      | 1702.283     | -          | -         | 1.122 |
| up_proj.U        | 0.290        | -          | -         | 0.634 |
| gate_proj.U      | 0.300        | -          | -         | 0.399 |
| down_proj.V      | 27.196       | -          | -         | 5.050 |
| down_proj.U      | 0.008        | -          | -         | 0.650 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.375 |
| v_proj.V         | 0.000        | -          | -         | 1.113 |
| q_proj.V         | 0.001        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.363 |
| gate_proj.V      | 0.000        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.945 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 15.009       | -          | -         | 1.376 |
| v_proj.V         | 15.787       | -          | -         | 1.116 |
| q_proj.V         | 14.986       | -          | -         | 1.116 |
| k_proj.U         | 0.016        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.051        | -          | -         | 0.010 |
| o_proj.V         | 1.865        | -          | -         | 1.338 |
| o_proj.U         | 0.013        | -          | -         | 0.193 |
| up_proj.V        | 6.698        | -          | -         | 1.362 |
| gate_proj.V      | 6.684        | -          | -         | 1.114 |
| up_proj.U        | 0.055        | -          | -         | 0.176 |
| gate_proj.U      | 0.056        | -          | -         | 0.010 |
| down_proj.V      | 0.103        | -          | -         | 4.955 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2468.605     | -          | -         | 1.443 |
| v_proj.V         | 2557.456     | -          | -         | 1.121 |
| q_proj.V         | 2493.960     | -          | -         | 1.121 |
| k_proj.U         | 0.051        | -          | -         | 0.476 |
| v_proj.U         | 0.034        | -          | -         | 0.265 |
| q_proj.U         | 0.151        | -          | -         | 0.265 |
| o_proj.V         | 334.087      | -          | -         | 1.411 |
| o_proj.U         | 0.052        | -          | -         | 0.510 |
| up_proj.V        | 1767.005     | -          | -         | 1.431 |
| gate_proj.V      | 1745.509     | -          | -         | 1.126 |
| up_proj.U        | 0.307        | -          | -         | 0.632 |
| gate_proj.U      | 0.320        | -          | -         | 0.400 |
| down_proj.V      | 25.749       | -          | -         | 5.044 |
| down_proj.U      | 0.008        | -          | -         | 0.652 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.376 |
| v_proj.V         | 0.000        | -          | -         | 1.108 |
| q_proj.V         | 0.000        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.333 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.364 |
| gate_proj.V      | 0.000        | -          | -         | 1.109 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.935 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 14.933       | -          | -         | 1.379 |
| v_proj.V         | 15.807       | -          | -         | 1.115 |
| q_proj.V         | 15.042       | -          | -         | 1.115 |
| k_proj.U         | 0.014        | -          | -         | 0.159 |
| v_proj.U         | 0.009        | -          | -         | 0.010 |
| q_proj.U         | 0.047        | -          | -         | 0.010 |
| o_proj.V         | 2.348        | -          | -         | 1.339 |
| o_proj.U         | 0.020        | -          | -         | 0.194 |
| up_proj.V        | 6.803        | -          | -         | 1.363 |
| gate_proj.V      | 6.801        | -          | -         | 1.110 |
| up_proj.U        | 0.057        | -          | -         | 0.176 |
| gate_proj.U      | 0.062        | -          | -         | 0.010 |
| down_proj.V      | 0.134        | -          | -         | 4.943 |
| down_proj.U      | 0.002        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2558.090     | -          | -         | 1.445 |
| v_proj.V         | 2633.051     | -          | -         | 1.121 |
| q_proj.V         | 2585.997     | -          | -         | 1.125 |
| k_proj.U         | 0.052        | -          | -         | 0.476 |
| v_proj.U         | 0.037        | -          | -         | 0.265 |
| q_proj.U         | 0.155        | -          | -         | 0.265 |
| o_proj.V         | 445.511      | -          | -         | 1.416 |
| o_proj.U         | 0.083        | -          | -         | 0.511 |
| up_proj.V        | 1808.310     | -          | -         | 1.437 |
| gate_proj.V      | 1788.926     | -          | -         | 1.121 |
| up_proj.U        | 0.327        | -          | -         | 0.631 |
| gate_proj.U      | 0.332        | -          | -         | 0.399 |
| down_proj.V      | 32.135       | -          | -         | 5.060 |
| down_proj.U      | 0.010        | -          | -         | 0.650 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.380 |
| v_proj.V         | 0.001        | -          | -         | 1.110 |
| q_proj.V         | 0.001        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.362 |
| gate_proj.V      | 0.000        | -          | -         | 1.115 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.950 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 17.525       | -          | -         | 1.375 |
| v_proj.V         | 18.232       | -          | -         | 1.116 |
| q_proj.V         | 17.522       | -          | -         | 1.117 |
| k_proj.U         | 0.016        | -          | -         | 0.160 |
| v_proj.U         | 0.012        | -          | -         | 0.010 |
| q_proj.U         | 0.057        | -          | -         | 0.010 |
| o_proj.V         | 2.251        | -          | -         | 1.341 |
| o_proj.U         | 0.016        | -          | -         | 0.194 |
| up_proj.V        | 7.307        | -          | -         | 1.366 |
| gate_proj.V      | 7.230        | -          | -         | 1.117 |
| up_proj.U        | 0.064        | -          | -         | 0.176 |
| gate_proj.U      | 0.067        | -          | -         | 0.010 |
| down_proj.V      | 0.117        | -          | -         | 4.949 |
| down_proj.U      | 0.001        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3017.089     | -          | -         | 1.446 |
| v_proj.V         | 3076.974     | -          | -         | 1.125 |
| q_proj.V         | 3046.581     | -          | -         | 1.124 |
| k_proj.U         | 0.064        | -          | -         | 0.477 |
| v_proj.U         | 0.049        | -          | -         | 0.264 |
| q_proj.U         | 0.202        | -          | -         | 0.265 |
| o_proj.V         | 423.701      | -          | -         | 1.411 |
| o_proj.U         | 0.093        | -          | -         | 0.510 |
| up_proj.V        | 2006.455     | -          | -         | 1.438 |
| gate_proj.V      | 1991.647     | -          | -         | 1.125 |
| up_proj.U        | 0.371        | -          | -         | 0.632 |
| gate_proj.U      | 0.378        | -          | -         | 0.402 |
| down_proj.V      | 30.169       | -          | -         | 5.058 |
| down_proj.U      | 0.010        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.376 |
| v_proj.V         | 0.001        | -          | -         | 1.112 |
| q_proj.V         | 0.001        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.362 |
| gate_proj.V      | 0.000        | -          | -         | 1.112 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.927 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 17.809       | -          | -         | 1.381 |
| v_proj.V         | 18.047       | -          | -         | 1.117 |
| q_proj.V         | 17.680       | -          | -         | 1.118 |
| k_proj.U         | 0.013        | -          | -         | 0.159 |
| v_proj.U         | 0.012        | -          | -         | 0.010 |
| q_proj.U         | 0.059        | -          | -         | 0.010 |
| o_proj.V         | 3.510        | -          | -         | 1.340 |
| o_proj.U         | 0.016        | -          | -         | 0.193 |
| up_proj.V        | 7.968        | -          | -         | 1.362 |
| gate_proj.V      | 7.952        | -          | -         | 1.115 |
| up_proj.U        | 0.065        | -          | -         | 0.176 |
| gate_proj.U      | 0.067        | -          | -         | 0.010 |
| down_proj.V      | 0.158        | -          | -         | 4.955 |
| down_proj.U      | 0.002        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2918.107     | -          | -         | 1.441 |
| v_proj.V         | 2994.024     | -          | -         | 1.119 |
| q_proj.V         | 2949.477     | -          | -         | 1.122 |
| k_proj.U         | 0.057        | -          | -         | 0.474 |
| v_proj.U         | 0.045        | -          | -         | 0.265 |
| q_proj.U         | 0.176        | -          | -         | 0.266 |
| o_proj.V         | 639.983      | -          | -         | 1.408 |
| o_proj.U         | 0.108        | -          | -         | 0.512 |
| up_proj.V        | 2159.153     | -          | -         | 1.431 |
| gate_proj.V      | 2145.396     | -          | -         | 1.123 |
| up_proj.U        | 0.390        | -          | -         | 0.632 |
| gate_proj.U      | 0.396        | -          | -         | 0.400 |
| down_proj.V      | 41.539       | -          | -         | 5.045 |
| down_proj.U      | 0.013        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.374 |
| v_proj.V         | 0.000        | -          | -         | 1.111 |
| q_proj.V         | 0.001        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.360 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.936 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 18.671       | -          | -         | 1.378 |
| v_proj.V         | 19.689       | -          | -         | 1.115 |
| q_proj.V         | 18.821       | -          | -         | 1.114 |
| k_proj.U         | 0.016        | -          | -         | 0.159 |
| v_proj.U         | 0.014        | -          | -         | 0.010 |
| q_proj.U         | 0.050        | -          | -         | 0.010 |
| o_proj.V         | 4.768        | -          | -         | 1.338 |
| o_proj.U         | 0.024        | -          | -         | 0.193 |
| up_proj.V        | 8.482        | -          | -         | 1.366 |
| gate_proj.V      | 8.447        | -          | -         | 1.116 |
| up_proj.U        | 0.059        | -          | -         | 0.175 |
| gate_proj.U      | 0.063        | -          | -         | 0.010 |
| down_proj.V      | 0.172        | -          | -         | 4.943 |
| down_proj.U      | 0.002        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3106.212     | -          | -         | 1.444 |
| v_proj.V         | 3169.260     | -          | -         | 1.123 |
| q_proj.V         | 3143.097     | -          | -         | 1.128 |
| k_proj.U         | 0.066        | -          | -         | 0.476 |
| v_proj.U         | 0.043        | -          | -         | 0.266 |
| q_proj.U         | 0.189        | -          | -         | 0.264 |
| o_proj.V         | 780.313      | -          | -         | 1.416 |
| o_proj.U         | 0.093        | -          | -         | 0.510 |
| up_proj.V        | 2297.171     | -          | -         | 1.436 |
| gate_proj.V      | 2271.822     | -          | -         | 1.122 |
| up_proj.U        | 0.391        | -          | -         | 0.633 |
| gate_proj.U      | 0.389        | -          | -         | 0.399 |
| down_proj.V      | 43.759       | -          | -         | 5.061 |
| down_proj.U      | 0.014        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.379 |
| v_proj.V         | 0.001        | -          | -         | 1.108 |
| q_proj.V         | 0.001        | -          | -         | 1.115 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.334 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.366 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.940 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 19.292       | -          | -         | 1.378 |
| v_proj.V         | 19.748       | -          | -         | 1.114 |
| q_proj.V         | 19.173       | -          | -         | 1.114 |
| k_proj.U         | 0.015        | -          | -         | 0.159 |
| v_proj.U         | 0.015        | -          | -         | 0.010 |
| q_proj.U         | 0.057        | -          | -         | 0.011 |
| o_proj.V         | 5.423        | -          | -         | 1.337 |
| o_proj.U         | 0.025        | -          | -         | 0.193 |
| up_proj.V        | 9.174        | -          | -         | 1.366 |
| gate_proj.V      | 9.214        | -          | -         | 1.119 |
| up_proj.U        | 0.060        | -          | -         | 0.176 |
| gate_proj.U      | 0.063        | -          | -         | 0.010 |
| down_proj.V      | 0.292        | -          | -         | 4.956 |
| down_proj.U      | 0.002        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3199.668     | -          | -         | 1.446 |
| v_proj.V         | 3242.382     | -          | -         | 1.126 |
| q_proj.V         | 3241.323     | -          | -         | 1.122 |
| k_proj.U         | 0.056        | -          | -         | 0.477 |
| v_proj.U         | 0.044        | -          | -         | 0.264 |
| q_proj.U         | 0.183        | -          | -         | 0.266 |
| o_proj.V         | 875.784      | -          | -         | 1.416 |
| o_proj.U         | 0.122        | -          | -         | 0.509 |
| up_proj.V        | 2426.601     | -          | -         | 1.435 |
| gate_proj.V      | 2401.823     | -          | -         | 1.122 |
| up_proj.U        | 0.412        | -          | -         | 0.632 |
| gate_proj.U      | 0.412        | -          | -         | 0.398 |
| down_proj.V      | 74.545       | -          | -         | 5.057 |
| down_proj.U      | 0.016        | -          | -         | 0.651 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.379 |
| v_proj.V         | 0.001        | -          | -         | 1.114 |
| q_proj.V         | 0.001        | -          | -         | 1.117 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.000        | -          | -         | 1.361 |
| gate_proj.V      | 0.000        | -          | -         | 1.113 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.951 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 19.083       | -          | -         | 1.382 |
| v_proj.V         | 20.167       | -          | -         | 1.114 |
| q_proj.V         | 18.927       | -          | -         | 1.115 |
| k_proj.U         | 0.016        | -          | -         | 0.159 |
| v_proj.U         | 0.013        | -          | -         | 0.010 |
| q_proj.U         | 0.058        | -          | -         | 0.010 |
| o_proj.V         | 5.105        | -          | -         | 1.338 |
| o_proj.U         | 0.028        | -          | -         | 0.193 |
| up_proj.V        | 10.013       | -          | -         | 1.365 |
| gate_proj.V      | 9.847        | -          | -         | 1.114 |
| up_proj.U        | 0.063        | -          | -         | 0.176 |
| gate_proj.U      | 0.062        | -          | -         | 0.010 |
| down_proj.V      | 0.286        | -          | -         | 4.947 |
| down_proj.U      | 0.002        | -          | -         | 0.189 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3108.987     | -          | -         | 1.442 |
| v_proj.V         | 3187.749     | -          | -         | 1.123 |
| q_proj.V         | 3144.959     | -          | -         | 1.121 |
| k_proj.U         | 0.055        | -          | -         | 0.476 |
| v_proj.U         | 0.043        | -          | -         | 0.267 |
| q_proj.U         | 0.188        | -          | -         | 0.266 |
| o_proj.V         | 869.262      | -          | -         | 1.413 |
| o_proj.U         | 0.125        | -          | -         | 0.509 |
| up_proj.V        | 2619.905     | -          | -         | 1.429 |
| gate_proj.V      | 2587.519     | -          | -         | 1.125 |
| up_proj.U        | 0.399        | -          | -         | 0.632 |
| gate_proj.U      | 0.401        | -          | -         | 0.399 |
| down_proj.V      | 66.170       | -          | -         | 5.046 |
| down_proj.U      | 0.017        | -          | -         | 0.651 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.379 |
| v_proj.V         | 0.001        | -          | -         | 1.111 |
| q_proj.V         | 0.001        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.365 |
| gate_proj.V      | 0.000        | -          | -         | 1.115 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.942 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 21.592       | -          | -         | 1.377 |
| v_proj.V         | 23.047       | -          | -         | 1.115 |
| q_proj.V         | 21.555       | -          | -         | 1.115 |
| k_proj.U         | 0.014        | -          | -         | 0.159 |
| v_proj.U         | 0.015        | -          | -         | 0.010 |
| q_proj.U         | 0.060        | -          | -         | 0.010 |
| o_proj.V         | 6.760        | -          | -         | 1.339 |
| o_proj.U         | 0.031        | -          | -         | 0.193 |
| up_proj.V        | 10.398       | -          | -         | 1.362 |
| gate_proj.V      | 10.335       | -          | -         | 1.116 |
| up_proj.U        | 0.061        | -          | -         | 0.176 |
| gate_proj.U      | 0.060        | -          | -         | 0.010 |
| down_proj.V      | 0.460        | -          | -         | 4.957 |
| down_proj.U      | 0.003        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3496.797     | -          | -         | 1.438 |
| v_proj.V         | 3583.746     | -          | -         | 1.115 |
| q_proj.V         | 3556.294     | -          | -         | 1.117 |
| k_proj.U         | 0.066        | -          | -         | 0.475 |
| v_proj.U         | 0.043        | -          | -         | 0.263 |
| q_proj.U         | 0.195        | -          | -         | 0.261 |
| o_proj.V         | 1066.883     | -          | -         | 1.412 |
| o_proj.U         | 0.142        | -          | -         | 0.510 |
| up_proj.V        | 2652.614     | -          | -         | 1.439 |
| gate_proj.V      | 2622.664     | -          | -         | 1.124 |
| up_proj.U        | 0.380        | -          | -         | 0.632 |
| gate_proj.U      | 0.378        | -          | -         | 0.400 |
| down_proj.V      | 86.136       | -          | -         | 5.063 |
| down_proj.U      | 0.020        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.382 |
| v_proj.V         | 0.001        | -          | -         | 1.111 |
| q_proj.V         | 0.001        | -          | -         | 1.116 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.366 |
| gate_proj.V      | 0.000        | -          | -         | 1.116 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.936 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 21.027       | -          | -         | 1.377 |
| v_proj.V         | 22.474       | -          | -         | 1.117 |
| q_proj.V         | 21.355       | -          | -         | 1.117 |
| k_proj.U         | 0.019        | -          | -         | 0.159 |
| v_proj.U         | 0.011        | -          | -         | 0.010 |
| q_proj.U         | 0.050        | -          | -         | 0.010 |
| o_proj.V         | 5.183        | -          | -         | 1.341 |
| o_proj.U         | 0.030        | -          | -         | 0.193 |
| up_proj.V        | 11.466       | -          | -         | 1.363 |
| gate_proj.V      | 11.405       | -          | -         | 1.119 |
| up_proj.U        | 0.072        | -          | -         | 0.176 |
| gate_proj.U      | 0.071        | -          | -         | 0.010 |
| down_proj.V      | 0.459        | -          | -         | 4.961 |
| down_proj.U      | 0.005        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3450.305     | -          | -         | 1.451 |
| v_proj.V         | 3526.847     | -          | -         | 1.126 |
| q_proj.V         | 3516.040     | -          | -         | 1.124 |
| k_proj.U         | 0.057        | -          | -         | 0.475 |
| v_proj.U         | 0.045        | -          | -         | 0.265 |
| q_proj.U         | 0.180        | -          | -         | 0.265 |
| o_proj.V         | 915.500      | -          | -         | 1.414 |
| o_proj.U         | 0.154        | -          | -         | 0.510 |
| up_proj.V        | 3013.833     | -          | -         | 1.433 |
| gate_proj.V      | 2982.624     | -          | -         | 1.122 |
| up_proj.U        | 0.441        | -          | -         | 0.634 |
| gate_proj.U      | 0.448        | -          | -         | 0.399 |
| down_proj.V      | 109.006      | -          | -         | 5.066 |
| down_proj.U      | 0.034        | -          | -         | 0.651 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.381 |
| v_proj.V         | 0.001        | -          | -         | 1.113 |
| q_proj.V         | 0.001        | -          | -         | 1.116 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.341 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.366 |
| gate_proj.V      | 0.000        | -          | -         | 1.124 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.962 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 20.669       | -          | -         | 1.378 |
| v_proj.V         | 22.207       | -          | -         | 1.116 |
| q_proj.V         | 20.729       | -          | -         | 1.116 |
| k_proj.U         | 0.021        | -          | -         | 0.160 |
| v_proj.U         | 0.011        | -          | -         | 0.010 |
| q_proj.U         | 0.068        | -          | -         | 0.010 |
| o_proj.V         | 13.913       | -          | -         | 1.341 |
| o_proj.U         | 0.038        | -          | -         | 0.193 |
| up_proj.V        | 12.630       | -          | -         | 1.366 |
| gate_proj.V      | 12.390       | -          | -         | 1.116 |
| up_proj.U        | 0.083        | -          | -         | 0.176 |
| gate_proj.U      | 0.080        | -          | -         | 0.010 |
| down_proj.V      | 1.060        | -          | -         | 4.958 |
| down_proj.U      | 0.009        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3310.400     | -          | -         | 1.443 |
| v_proj.V         | 3385.942     | -          | -         | 1.121 |
| q_proj.V         | 3340.780     | -          | -         | 1.124 |
| k_proj.U         | 0.057        | -          | -         | 0.475 |
| v_proj.U         | 0.037        | -          | -         | 0.265 |
| q_proj.U         | 0.168        | -          | -         | 0.265 |
| o_proj.V         | 2064.014     | -          | -         | 1.413 |
| o_proj.U         | 0.222        | -          | -         | 0.510 |
| up_proj.V        | 3247.980     | -          | -         | 1.434 |
| gate_proj.V      | 3216.262     | -          | -         | 1.124 |
| up_proj.U        | 0.492        | -          | -         | 0.632 |
| gate_proj.U      | 0.504        | -          | -         | 0.399 |
| down_proj.V      | 261.129      | -          | -         | 5.031 |
| down_proj.U      | 0.058        | -          | -         | 0.647 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.380 |
| v_proj.V         | 0.001        | -          | -         | 1.104 |
| q_proj.V         | 0.001        | -          | -         | 1.109 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.331 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.356 |
| gate_proj.V      | 0.000        | -          | -         | 1.107 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 4.926 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 23.065       | -          | -         | 1.370 |
| v_proj.V         | 25.592       | -          | -         | 1.109 |
| q_proj.V         | 23.232       | -          | -         | 1.110 |
| k_proj.U         | 0.020        | -          | -         | 0.159 |
| v_proj.U         | 0.022        | -          | -         | 0.010 |
| q_proj.U         | 0.061        | -          | -         | 0.010 |
| o_proj.V         | 8.968        | -          | -         | 1.343 |
| o_proj.U         | 0.044        | -          | -         | 0.193 |
| up_proj.V        | 13.327       | -          | -         | 1.360 |
| gate_proj.V      | 13.284       | -          | -         | 1.111 |
| up_proj.U        | 0.089        | -          | -         | 0.176 |
| gate_proj.U      | 0.090        | -          | -         | 0.010 |
| down_proj.V      | 2.371        | -          | -         | 4.931 |
| down_proj.U      | 0.026        | -          | -         | 0.189 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3763.502     | -          | -         | 1.442 |
| v_proj.V         | 3900.622     | -          | -         | 1.116 |
| q_proj.V         | 3847.462     | -          | -         | 1.120 |
| k_proj.U         | 0.073        | -          | -         | 0.475 |
| v_proj.U         | 0.049        | -          | -         | 0.262 |
| q_proj.U         | 0.221        | -          | -         | 0.271 |
| o_proj.V         | 1800.292     | -          | -         | 1.419 |
| o_proj.U         | 0.261        | -          | -         | 0.508 |
| up_proj.V        | 3508.158     | -          | -         | 1.432 |
| gate_proj.V      | 3482.188     | -          | -         | 1.116 |
| up_proj.U        | 0.500        | -          | -         | 0.631 |
| gate_proj.U      | 0.520        | -          | -         | 0.397 |
| down_proj.V      | 606.517      | -          | -         | 5.054 |
| down_proj.U      | 0.105        | -          | -         | 0.648 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.373 |
| v_proj.V         | 0.000        | -          | -         | 1.106 |
| q_proj.V         | 0.000        | -          | -         | 1.106 |
| k_proj.U         | 0.000        | -          | -         | 0.152 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.331 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.000        | -          | -         | 1.358 |
| gate_proj.V      | 0.000        | -          | -         | 1.108 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 4.950 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 12.861       | -          | -         | 1.370 |
| v_proj.V         | 13.577       | -          | -         | 1.107 |
| q_proj.V         | 12.820       | -          | -         | 1.106 |
| k_proj.U         | 0.015        | -          | -         | 0.159 |
| v_proj.U         | 0.007        | -          | -         | 0.010 |
| q_proj.U         | 0.046        | -          | -         | 0.010 |
| o_proj.V         | 8.879        | -          | -         | 1.331 |
| o_proj.U         | 0.062        | -          | -         | 0.193 |
| up_proj.V        | 11.166       | -          | -         | 1.358 |
| gate_proj.V      | 11.137       | -          | -         | 1.111 |
| up_proj.U        | 0.075        | -          | -         | 0.176 |
| gate_proj.U      | 0.075        | -          | -         | 0.010 |
| down_proj.V      | 6.108        | -          | -         | 4.934 |
| down_proj.U      | 0.066        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 1804.537     | -          | -         | 1.441 |
| v_proj.V         | 1859.743     | -          | -         | 1.118 |
| q_proj.V         | 1844.488     | -          | -         | 1.117 |
| k_proj.U         | 0.028        | -          | -         | 0.475 |
| v_proj.U         | 0.028        | -          | -         | 0.262 |
| q_proj.U         | 0.092        | -          | -         | 0.263 |
| o_proj.V         | 1340.941     | -          | -         | 1.404 |
| o_proj.U         | 0.172        | -          | -         | 0.508 |
| up_proj.V        | 2767.199     | -          | -         | 1.430 |
| gate_proj.V      | 2769.042     | -          | -         | 1.115 |
| up_proj.U        | 0.418        | -          | -         | 0.631 |
| gate_proj.U      | 0.433        | -          | -         | 0.396 |
| down_proj.V      | 752.435      | -          | -         | 5.042 |
| down_proj.U      | 0.093        | -          | -         | 0.647 |
+------------------+--------------+------------+-----------+-------+


4281.673069477081
