factor 8.0
Loading checkpoint shards: 100%|██████████| 4/4 [00:01<00:00,  2.67it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Starting ...
Ready.
Quantizing 8bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.833 |
| v_proj.V         | 0.001        | -          | -         | 1.074 |
| q_proj.V         | 0.002        | -          | -         | 1.078 |
| k_proj.U         | 0.000        | -          | -         | 0.467 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.728 |
| o_proj.U         | 0.000        | -          | -         | 0.544 |
| up_proj.V        | 0.002        | -          | -         | 1.754 |
| gate_proj.V      | 0.002        | -          | -         | 1.076 |
| up_proj.U        | 0.000        | -          | -         | 0.491 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.759 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 54.399       | -          | -         | 1.788 |
| v_proj.V         | 47.571       | -          | -         | 1.076 |
| q_proj.V         | 62.009       | -          | -         | 1.079 |
| k_proj.U         | 0.146        | -          | -         | 0.485 |
| v_proj.U         | 0.064        | -          | -         | 0.010 |
| q_proj.U         | 0.533        | -          | -         | 0.010 |
| o_proj.V         | 0.417        | -          | -         | 1.734 |
| o_proj.U         | 0.005        | -          | -         | 0.558 |
| up_proj.V        | 26.436       | -          | -         | 1.763 |
| gate_proj.V      | 26.578       | -          | -         | 1.079 |
| up_proj.U        | 0.269        | -          | -         | 0.502 |
| gate_proj.U      | 0.261        | -          | -         | 0.010 |
| down_proj.V      | 0.178        | -          | -         | 5.782 |
| down_proj.U      | 0.001        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2340.875     | -          | -         | 1.814 |
| v_proj.V         | 1548.295     | -          | -         | 1.083 |
| q_proj.V         | 2159.129     | -          | -         | 1.081 |
| k_proj.U         | 0.294        | -          | -         | 0.554 |
| v_proj.U         | 0.299        | -          | -         | 0.059 |
| q_proj.U         | 1.410        | -          | -         | 0.060 |
| o_proj.V         | 8.407        | -          | -         | 1.757 |
| o_proj.U         | 0.016        | -          | -         | 0.628 |
| up_proj.V        | 1220.714     | -          | -         | 1.787 |
| gate_proj.V      | 1227.545     | -          | -         | 1.084 |
| up_proj.U        | 3.397        | -          | -         | 0.605 |
| gate_proj.U      | 3.339        | -          | -         | 0.090 |
| down_proj.V      | 6.012        | -          | -         | 5.824 |
| down_proj.U      | 0.024        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.001        | -          | -         | 1.791 |
| v_proj.V         | 0.001        | -          | -         | 1.081 |
| q_proj.V         | 0.001        | -          | -         | 1.081 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.000        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.735 |
| o_proj.U         | 0.000        | -          | -         | 0.547 |
| up_proj.V        | 0.002        | -          | -         | 1.762 |
| gate_proj.V      | 0.002        | -          | -         | 1.075 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.775 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 25.471       | -          | -         | 1.789 |
| v_proj.V         | 31.255       | -          | -         | 1.076 |
| q_proj.V         | 25.874       | -          | -         | 1.080 |
| k_proj.U         | 0.202        | -          | -         | 0.484 |
| v_proj.U         | 0.100        | -          | -         | 0.010 |
| q_proj.U         | 0.980        | -          | -         | 0.010 |
| o_proj.V         | 0.468        | -          | -         | 1.727 |
| o_proj.U         | 0.007        | -          | -         | 0.556 |
| up_proj.V        | 23.811       | -          | -         | 1.761 |
| gate_proj.V      | 24.233       | -          | -         | 1.077 |
| up_proj.U        | 0.172        | -          | -         | 0.501 |
| gate_proj.U      | 0.174        | -          | -         | 0.010 |
| down_proj.V      | 5.399        | -          | -         | 5.773 |
| down_proj.U      | 0.001        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 984.705      | -          | -         | 1.814 |
| v_proj.V         | 909.733      | -          | -         | 1.086 |
| q_proj.V         | 1022.099     | -          | -         | 1.085 |
| k_proj.U         | 0.272        | -          | -         | 0.553 |
| v_proj.U         | 0.446        | -          | -         | 0.059 |
| q_proj.U         | 1.286        | -          | -         | 0.059 |
| o_proj.V         | 10.604       | -          | -         | 1.759 |
| o_proj.U         | 0.027        | -          | -         | 0.627 |
| up_proj.V        | 1178.222     | -          | -         | 1.793 |
| gate_proj.V      | 1177.986     | -          | -         | 1.086 |
| up_proj.U        | 3.660        | -          | -         | 0.605 |
| gate_proj.U      | 3.655        | -          | -         | 0.091 |
| down_proj.V      | 11.523       | -          | -         | 5.801 |
| down_proj.U      | 0.010        | -          | -         | 0.663 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.787 |
| v_proj.V         | 0.003        | -          | -         | 1.073 |
| q_proj.V         | 0.004        | -          | -         | 1.077 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.001        | -          | -         | 0.002 |
| q_proj.U         | 0.003        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.732 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.756 |
| gate_proj.V      | 0.002        | -          | -         | 1.075 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.764 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 84.225       | -          | -         | 1.787 |
| v_proj.V         | 85.044       | -          | -         | 1.077 |
| q_proj.V         | 83.303       | -          | -         | 1.078 |
| k_proj.U         | 0.544        | -          | -         | 0.484 |
| v_proj.U         | 0.156        | -          | -         | 0.010 |
| q_proj.U         | 1.034        | -          | -         | 0.010 |
| o_proj.V         | 0.187        | -          | -         | 1.734 |
| o_proj.U         | 0.005        | -          | -         | 0.557 |
| up_proj.V        | 31.312       | -          | -         | 1.761 |
| gate_proj.V      | 31.879       | -          | -         | 1.076 |
| up_proj.U        | 0.279        | -          | -         | 0.502 |
| gate_proj.U      | 0.284        | -          | -         | 0.010 |
| down_proj.V      | 0.123        | -          | -         | 5.780 |
| down_proj.U      | 0.001        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3632.124     | -          | -         | 1.811 |
| v_proj.V         | 3637.399     | -          | -         | 1.082 |
| q_proj.V         | 3805.480     | -          | -         | 1.083 |
| k_proj.U         | 1.684        | -          | -         | 0.553 |
| v_proj.U         | 1.832        | -          | -         | 0.059 |
| q_proj.U         | 6.530        | -          | -         | 0.060 |
| o_proj.V         | 9.604        | -          | -         | 1.759 |
| o_proj.U         | 0.032        | -          | -         | 0.627 |
| up_proj.V        | 1887.344     | -          | -         | 1.787 |
| gate_proj.V      | 1882.859     | -          | -         | 1.084 |
| up_proj.U        | 5.235        | -          | -         | 0.605 |
| gate_proj.U      | 5.236        | -          | -         | 0.091 |
| down_proj.V      | 8.094        | -          | -         | 5.824 |
| down_proj.U      | 0.032        | -          | -         | 0.664 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.787 |
| v_proj.V         | 0.004        | -          | -         | 1.078 |
| q_proj.V         | 0.004        | -          | -         | 1.074 |
| k_proj.U         | 0.000        | -          | -         | 0.470 |
| v_proj.U         | 0.001        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.734 |
| o_proj.U         | 0.000        | -          | -         | 0.547 |
| up_proj.V        | 0.002        | -          | -         | 1.762 |
| gate_proj.V      | 0.002        | -          | -         | 1.075 |
| up_proj.U        | 0.000        | -          | -         | 0.493 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.753 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 89.688       | -          | -         | 1.787 |
| v_proj.V         | 91.158       | -          | -         | 1.076 |
| q_proj.V         | 90.196       | -          | -         | 1.077 |
| k_proj.U         | 0.555        | -          | -         | 0.484 |
| v_proj.U         | 0.126        | -          | -         | 0.010 |
| q_proj.U         | 1.153        | -          | -         | 0.010 |
| o_proj.V         | 0.210        | -          | -         | 1.732 |
| o_proj.U         | 0.006        | -          | -         | 0.556 |
| up_proj.V        | 39.588       | -          | -         | 1.758 |
| gate_proj.V      | 40.502       | -          | -         | 1.076 |
| up_proj.U        | 0.420        | -          | -         | 0.501 |
| gate_proj.U      | 0.425        | -          | -         | 0.010 |
| down_proj.V      | 0.206        | -          | -         | 5.774 |
| down_proj.U      | 0.002        | -          | -         | 0.559 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4057.560     | -          | -         | 1.812 |
| v_proj.V         | 4070.646     | -          | -         | 1.083 |
| q_proj.V         | 4091.460     | -          | -         | 1.079 |
| k_proj.U         | 1.774        | -          | -         | 0.553 |
| v_proj.U         | 1.819        | -          | -         | 0.059 |
| q_proj.U         | 6.713        | -          | -         | 0.059 |
| o_proj.V         | 10.621       | -          | -         | 1.757 |
| o_proj.U         | 0.058        | -          | -         | 0.627 |
| up_proj.V        | 2665.212     | -          | -         | 1.788 |
| gate_proj.V      | 2661.712     | -          | -         | 1.086 |
| up_proj.U        | 8.489        | -          | -         | 0.604 |
| gate_proj.U      | 8.442        | -          | -         | 0.091 |
| down_proj.V      | 14.416       | -          | -         | 5.809 |
| down_proj.U      | 0.053        | -          | -         | 0.664 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.787 |
| v_proj.V         | 0.003        | -          | -         | 1.076 |
| q_proj.V         | 0.003        | -          | -         | 1.073 |
| k_proj.U         | 0.000        | -          | -         | 0.470 |
| v_proj.U         | 0.001        | -          | -         | 0.002 |
| q_proj.U         | 0.005        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.731 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.759 |
| gate_proj.V      | 0.003        | -          | -         | 1.074 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.761 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 76.601       | -          | -         | 1.786 |
| v_proj.V         | 78.590       | -          | -         | 1.074 |
| q_proj.V         | 76.746       | -          | -         | 1.078 |
| k_proj.U         | 0.590        | -          | -         | 0.484 |
| v_proj.U         | 0.149        | -          | -         | 0.010 |
| q_proj.U         | 1.126        | -          | -         | 0.010 |
| o_proj.V         | 0.772        | -          | -         | 1.735 |
| o_proj.U         | 0.020        | -          | -         | 0.557 |
| up_proj.V        | 46.964       | -          | -         | 1.753 |
| gate_proj.V      | 46.773       | -          | -         | 1.068 |
| up_proj.U        | 0.647        | -          | -         | 0.501 |
| gate_proj.U      | 0.679        | -          | -         | 0.010 |
| down_proj.V      | 0.375        | -          | -         | 5.778 |
| down_proj.U      | 0.005        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3385.303     | -          | -         | 1.807 |
| v_proj.V         | 3389.284     | -          | -         | 1.082 |
| q_proj.V         | 3423.664     | -          | -         | 1.082 |
| k_proj.U         | 1.529        | -          | -         | 0.553 |
| v_proj.U         | 1.609        | -          | -         | 0.059 |
| q_proj.U         | 6.091        | -          | -         | 0.059 |
| o_proj.V         | 33.611       | -          | -         | 1.758 |
| o_proj.U         | 0.162        | -          | -         | 0.628 |
| up_proj.V        | 3023.239     | -          | -         | 1.789 |
| gate_proj.V      | 3018.639     | -          | -         | 1.084 |
| up_proj.U        | 11.009       | -          | -         | 0.605 |
| gate_proj.U      | 11.055       | -          | -         | 0.091 |
| down_proj.V      | 24.989       | -          | -         | 5.807 |
| down_proj.U      | 0.108        | -          | -         | 0.663 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.786 |
| v_proj.V         | 0.005        | -          | -         | 1.077 |
| q_proj.V         | 0.005        | -          | -         | 1.075 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.005        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.730 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.003        | -          | -         | 1.761 |
| gate_proj.V      | 0.003        | -          | -         | 1.077 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.763 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 108.019      | -          | -         | 1.792 |
| v_proj.V         | 106.960      | -          | -         | 1.078 |
| q_proj.V         | 105.713      | -          | -         | 1.079 |
| k_proj.U         | 0.685        | -          | -         | 0.485 |
| v_proj.U         | 0.237        | -          | -         | 0.010 |
| q_proj.U         | 1.532        | -          | -         | 0.010 |
| o_proj.V         | 1.014        | -          | -         | 1.824 |
| o_proj.U         | 0.033        | -          | -         | 0.557 |
| up_proj.V        | 51.038       | -          | -         | 1.763 |
| gate_proj.V      | 52.088       | -          | -         | 1.081 |
| up_proj.U        | 0.733        | -          | -         | 0.502 |
| gate_proj.U      | 0.804        | -          | -         | 0.010 |
| down_proj.V      | 0.464        | -          | -         | 5.776 |
| down_proj.U      | 0.006        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4551.669     | -          | -         | 1.812 |
| v_proj.V         | 4575.530     | -          | -         | 1.076 |
| q_proj.V         | 4665.124     | -          | -         | 1.082 |
| k_proj.U         | 2.616        | -          | -         | 0.554 |
| v_proj.U         | 2.437        | -          | -         | 0.059 |
| q_proj.U         | 10.251       | -          | -         | 0.059 |
| o_proj.V         | 44.473       | -          | -         | 1.750 |
| o_proj.U         | 0.265        | -          | -         | 0.628 |
| up_proj.V        | 3217.153     | -          | -         | 1.781 |
| gate_proj.V      | 3216.515     | -          | -         | 1.087 |
| up_proj.U        | 12.076       | -          | -         | 0.606 |
| gate_proj.U      | 12.030       | -          | -         | 0.091 |
| down_proj.V      | 27.785       | -          | -         | 5.809 |
| down_proj.U      | 0.122        | -          | -         | 0.664 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.787 |
| v_proj.V         | 0.004        | -          | -         | 1.078 |
| q_proj.V         | 0.004        | -          | -         | 1.076 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.735 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.762 |
| gate_proj.V      | 0.002        | -          | -         | 1.078 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.762 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 89.825       | -          | -         | 1.785 |
| v_proj.V         | 90.739       | -          | -         | 1.075 |
| q_proj.V         | 88.609       | -          | -         | 1.079 |
| k_proj.U         | 0.616        | -          | -         | 0.485 |
| v_proj.U         | 0.177        | -          | -         | 0.010 |
| q_proj.U         | 1.353        | -          | -         | 0.010 |
| o_proj.V         | 1.193        | -          | -         | 1.742 |
| o_proj.U         | 0.047        | -          | -         | 0.558 |
| up_proj.V        | 47.953       | -          | -         | 1.758 |
| gate_proj.V      | 48.067       | -          | -         | 1.072 |
| up_proj.U        | 0.793        | -          | -         | 0.502 |
| gate_proj.U      | 0.914        | -          | -         | 0.010 |
| down_proj.V      | 0.518        | -          | -         | 5.769 |
| down_proj.U      | 0.008        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3713.963     | -          | -         | 1.815 |
| v_proj.V         | 3728.169     | -          | -         | 1.081 |
| q_proj.V         | 3751.398     | -          | -         | 1.085 |
| k_proj.U         | 2.261        | -          | -         | 0.554 |
| v_proj.U         | 1.930        | -          | -         | 0.059 |
| q_proj.U         | 8.633        | -          | -         | 0.059 |
| o_proj.V         | 51.411       | -          | -         | 1.759 |
| o_proj.U         | 0.315        | -          | -         | 0.628 |
| up_proj.V        | 3068.350     | -          | -         | 1.786 |
| gate_proj.V      | 3082.491     | -          | -         | 1.076 |
| up_proj.U        | 12.193       | -          | -         | 0.605 |
| gate_proj.U      | 12.316       | -          | -         | 0.096 |
| down_proj.V      | 32.801       | -          | -         | 5.825 |
| down_proj.U      | 0.158        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.791 |
| v_proj.V         | 0.003        | -          | -         | 1.081 |
| q_proj.V         | 0.004        | -          | -         | 1.080 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.739 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.765 |
| gate_proj.V      | 0.002        | -          | -         | 1.080 |
| up_proj.U        | 0.001        | -          | -         | 0.495 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.767 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 85.530       | -          | -         | 1.791 |
| v_proj.V         | 86.056       | -          | -         | 1.078 |
| q_proj.V         | 84.580       | -          | -         | 1.078 |
| k_proj.U         | 0.683        | -          | -         | 0.485 |
| v_proj.U         | 0.196        | -          | -         | 0.010 |
| q_proj.U         | 1.203        | -          | -         | 0.010 |
| o_proj.V         | 1.762        | -          | -         | 1.735 |
| o_proj.U         | 0.117        | -          | -         | 0.557 |
| up_proj.V        | 45.234       | -          | -         | 1.765 |
| gate_proj.V      | 46.100       | -          | -         | 1.082 |
| up_proj.U        | 0.885        | -          | -         | 0.501 |
| gate_proj.U      | 1.061        | -          | -         | 0.010 |
| down_proj.V      | 0.569        | -          | -         | 5.775 |
| down_proj.U      | 0.014        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3681.101     | -          | -         | 1.810 |
| v_proj.V         | 3745.771     | -          | -         | 1.085 |
| q_proj.V         | 3740.650     | -          | -         | 1.088 |
| k_proj.U         | 2.417        | -          | -         | 0.554 |
| v_proj.U         | 2.005        | -          | -         | 0.059 |
| q_proj.U         | 9.040        | -          | -         | 0.059 |
| o_proj.V         | 67.730       | -          | -         | 1.757 |
| o_proj.U         | 0.565        | -          | -         | 0.628 |
| up_proj.V        | 3059.504     | -          | -         | 1.789 |
| gate_proj.V      | 3063.004     | -          | -         | 1.088 |
| up_proj.U        | 13.656       | -          | -         | 0.605 |
| gate_proj.U      | 13.976       | -          | -         | 0.091 |
| down_proj.V      | 36.532       | -          | -         | 5.810 |
| down_proj.U      | 0.191        | -          | -         | 0.664 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.789 |
| v_proj.V         | 0.004        | -          | -         | 1.077 |
| q_proj.V         | 0.004        | -          | -         | 1.075 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.735 |
| o_proj.U         | 0.000        | -          | -         | 0.549 |
| up_proj.V        | 0.002        | -          | -         | 1.764 |
| gate_proj.V      | 0.002        | -          | -         | 1.078 |
| up_proj.U        | 0.001        | -          | -         | 0.495 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.779 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 98.499       | -          | -         | 1.793 |
| v_proj.V         | 101.136      | -          | -         | 1.080 |
| q_proj.V         | 99.292       | -          | -         | 1.081 |
| k_proj.U         | 0.800        | -          | -         | 0.486 |
| v_proj.U         | 0.288        | -          | -         | 0.010 |
| q_proj.U         | 1.912        | -          | -         | 0.010 |
| o_proj.V         | 2.180        | -          | -         | 1.736 |
| o_proj.U         | 0.117        | -          | -         | 0.558 |
| up_proj.V        | 48.993       | -          | -         | 1.763 |
| gate_proj.V      | 49.416       | -          | -         | 1.079 |
| up_proj.U        | 0.966        | -          | -         | 0.502 |
| gate_proj.U      | 1.140        | -          | -         | 0.010 |
| down_proj.V      | 0.571        | -          | -         | 5.776 |
| down_proj.U      | 0.015        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4475.341     | -          | -         | 1.811 |
| v_proj.V         | 4474.670     | -          | -         | 1.087 |
| q_proj.V         | 4522.351     | -          | -         | 1.088 |
| k_proj.U         | 3.381        | -          | -         | 0.554 |
| v_proj.U         | 2.780        | -          | -         | 0.060 |
| q_proj.U         | 12.445       | -          | -         | 0.059 |
| o_proj.V         | 81.595       | -          | -         | 1.760 |
| o_proj.U         | 0.578        | -          | -         | 0.629 |
| up_proj.V        | 3435.469     | -          | -         | 1.796 |
| gate_proj.V      | 3442.623     | -          | -         | 1.087 |
| up_proj.U        | 14.886       | -          | -         | 0.606 |
| gate_proj.U      | 15.308       | -          | -         | 0.091 |
| down_proj.V      | 38.685       | -          | -         | 5.806 |
| down_proj.U      | 0.202        | -          | -         | 0.664 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.792 |
| v_proj.V         | 0.005        | -          | -         | 1.078 |
| q_proj.V         | 0.005        | -          | -         | 1.077 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.739 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.763 |
| gate_proj.V      | 0.002        | -          | -         | 1.078 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.765 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 105.063      | -          | -         | 1.792 |
| v_proj.V         | 105.000      | -          | -         | 1.078 |
| q_proj.V         | 104.909      | -          | -         | 1.082 |
| k_proj.U         | 0.767        | -          | -         | 0.486 |
| v_proj.U         | 0.304        | -          | -         | 0.010 |
| q_proj.U         | 1.925        | -          | -         | 0.010 |
| o_proj.V         | 1.467        | -          | -         | 1.737 |
| o_proj.U         | 0.074        | -          | -         | 0.558 |
| up_proj.V        | 52.294       | -          | -         | 1.764 |
| gate_proj.V      | 52.193       | -          | -         | 1.078 |
| up_proj.U        | 1.010        | -          | -         | 0.502 |
| gate_proj.U      | 1.311        | -          | -         | 0.010 |
| down_proj.V      | 0.708        | -          | -         | 5.765 |
| down_proj.U      | 0.015        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4694.759     | -          | -         | 1.820 |
| v_proj.V         | 4720.847     | -          | -         | 1.083 |
| q_proj.V         | 4704.252     | -          | -         | 1.089 |
| k_proj.U         | 3.569        | -          | -         | 0.554 |
| v_proj.U         | 2.622        | -          | -         | 0.059 |
| q_proj.U         | 12.743       | -          | -         | 0.059 |
| o_proj.V         | 55.472       | -          | -         | 1.757 |
| o_proj.U         | 0.507        | -          | -         | 0.629 |
| up_proj.V        | 3574.340     | -          | -         | 1.785 |
| gate_proj.V      | 3601.388     | -          | -         | 1.087 |
| up_proj.U        | 16.003       | -          | -         | 0.606 |
| gate_proj.U      | 16.458       | -          | -         | 0.091 |
| down_proj.V      | 44.414       | -          | -         | 5.817 |
| down_proj.U      | 0.236        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.791 |
| v_proj.V         | 0.005        | -          | -         | 1.079 |
| q_proj.V         | 0.006        | -          | -         | 1.081 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.003        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.737 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.764 |
| gate_proj.V      | 0.002        | -          | -         | 1.082 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.772 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 124.158      | -          | -         | 1.793 |
| v_proj.V         | 123.599      | -          | -         | 1.078 |
| q_proj.V         | 126.018      | -          | -         | 1.078 |
| k_proj.U         | 1.027        | -          | -         | 0.485 |
| v_proj.U         | 0.351        | -          | -         | 0.010 |
| q_proj.U         | 2.261        | -          | -         | 0.010 |
| o_proj.V         | 3.290        | -          | -         | 1.737 |
| o_proj.U         | 0.132        | -          | -         | 0.557 |
| up_proj.V        | 52.898       | -          | -         | 1.762 |
| gate_proj.V      | 54.066       | -          | -         | 1.077 |
| up_proj.U        | 1.056        | -          | -         | 0.502 |
| gate_proj.U      | 1.296        | -          | -         | 0.010 |
| down_proj.V      | 0.705        | -          | -         | 5.767 |
| down_proj.U      | 0.015        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5371.472     | -          | -         | 1.818 |
| v_proj.V         | 5416.498     | -          | -         | 1.080 |
| q_proj.V         | 5441.934     | -          | -         | 1.087 |
| k_proj.U         | 4.488        | -          | -         | 0.554 |
| v_proj.U         | 3.334        | -          | -         | 0.060 |
| q_proj.U         | 16.529       | -          | -         | 0.060 |
| o_proj.V         | 150.105      | -          | -         | 1.762 |
| o_proj.U         | 1.103        | -          | -         | 0.629 |
| up_proj.V        | 3631.042     | -          | -         | 1.785 |
| gate_proj.V      | 3598.160     | -          | -         | 1.088 |
| up_proj.U        | 16.555       | -          | -         | 0.606 |
| gate_proj.U      | 17.292       | -          | -         | 0.091 |
| down_proj.V      | 43.717       | -          | -         | 5.816 |
| down_proj.U      | 0.247        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.792 |
| v_proj.V         | 0.005        | -          | -         | 1.076 |
| q_proj.V         | 0.005        | -          | -         | 1.077 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.741 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.762 |
| gate_proj.V      | 0.003        | -          | -         | 1.078 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.768 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 117.574      | -          | -         | 1.791 |
| v_proj.V         | 120.251      | -          | -         | 1.077 |
| q_proj.V         | 119.935      | -          | -         | 1.079 |
| k_proj.U         | 0.963        | -          | -         | 0.485 |
| v_proj.U         | 0.350        | -          | -         | 0.010 |
| q_proj.U         | 1.904        | -          | -         | 0.010 |
| o_proj.V         | 2.177        | -          | -         | 1.734 |
| o_proj.U         | 0.138        | -          | -         | 0.557 |
| up_proj.V        | 55.217       | -          | -         | 1.764 |
| gate_proj.V      | 54.850       | -          | -         | 1.079 |
| up_proj.U        | 1.223        | -          | -         | 0.502 |
| gate_proj.U      | 1.551        | -          | -         | 0.010 |
| down_proj.V      | 0.741        | -          | -         | 5.772 |
| down_proj.U      | 0.020        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5167.354     | -          | -         | 1.812 |
| v_proj.V         | 5179.852     | -          | -         | 1.088 |
| q_proj.V         | 5187.684     | -          | -         | 1.083 |
| k_proj.U         | 4.194        | -          | -         | 0.553 |
| v_proj.U         | 3.053        | -          | -         | 0.059 |
| q_proj.U         | 14.637       | -          | -         | 0.059 |
| o_proj.V         | 91.265       | -          | -         | 1.759 |
| o_proj.U         | 0.824        | -          | -         | 0.628 |
| up_proj.V        | 3718.915     | -          | -         | 1.792 |
| gate_proj.V      | 3712.003     | -          | -         | 1.087 |
| up_proj.U        | 17.111       | -          | -         | 0.606 |
| gate_proj.U      | 18.128       | -          | -         | 0.091 |
| down_proj.V      | 43.765       | -          | -         | 5.831 |
| down_proj.U      | 0.273        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.790 |
| v_proj.V         | 0.004        | -          | -         | 1.075 |
| q_proj.V         | 0.004        | -          | -         | 1.078 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.005        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.737 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.002        | -          | -         | 1.761 |
| gate_proj.V      | 0.002        | -          | -         | 1.076 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.753 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 100.261      | -          | -         | 1.793 |
| v_proj.V         | 103.254      | -          | -         | 1.080 |
| q_proj.V         | 102.446      | -          | -         | 1.081 |
| k_proj.U         | 0.835        | -          | -         | 0.485 |
| v_proj.U         | 0.224        | -          | -         | 0.010 |
| q_proj.U         | 1.531        | -          | -         | 0.010 |
| o_proj.V         | 3.437        | -          | -         | 1.736 |
| o_proj.U         | 0.400        | -          | -         | 0.557 |
| up_proj.V        | 56.364       | -          | -         | 1.778 |
| gate_proj.V      | 56.330       | -          | -         | 1.095 |
| up_proj.U        | 1.333        | -          | -         | 0.502 |
| gate_proj.U      | 1.656        | -          | -         | 0.010 |
| down_proj.V      | 0.961        | -          | -         | 5.773 |
| down_proj.U      | 0.023        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4564.968     | -          | -         | 1.815 |
| v_proj.V         | 4649.000     | -          | -         | 1.084 |
| q_proj.V         | 4596.742     | -          | -         | 1.089 |
| k_proj.U         | 3.352        | -          | -         | 0.554 |
| v_proj.U         | 2.642        | -          | -         | 0.059 |
| q_proj.U         | 11.960       | -          | -         | 0.059 |
| o_proj.V         | 154.735      | -          | -         | 1.762 |
| o_proj.U         | 2.041        | -          | -         | 0.629 |
| up_proj.V        | 3831.549     | -          | -         | 1.799 |
| gate_proj.V      | 3811.003     | -          | -         | 1.087 |
| up_proj.U        | 19.835       | -          | -         | 0.606 |
| gate_proj.U      | 21.128       | -          | -         | 0.091 |
| down_proj.V      | 59.183       | -          | -         | 5.816 |
| down_proj.U      | 0.373        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.788 |
| v_proj.V         | 0.006        | -          | -         | 1.078 |
| q_proj.V         | 0.007        | -          | -         | 1.081 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.739 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.003        | -          | -         | 1.765 |
| gate_proj.V      | 0.003        | -          | -         | 1.077 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.767 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 141.155      | -          | -         | 1.793 |
| v_proj.V         | 142.830      | -          | -         | 1.081 |
| q_proj.V         | 142.179      | -          | -         | 1.083 |
| k_proj.U         | 1.066        | -          | -         | 0.485 |
| v_proj.U         | 0.404        | -          | -         | 0.010 |
| q_proj.U         | 2.332        | -          | -         | 0.010 |
| o_proj.V         | 2.905        | -          | -         | 1.740 |
| o_proj.U         | 0.203        | -          | -         | 0.558 |
| up_proj.V        | 62.603       | -          | -         | 1.768 |
| gate_proj.V      | 62.467       | -          | -         | 1.079 |
| up_proj.U        | 1.464        | -          | -         | 0.502 |
| gate_proj.U      | 1.832        | -          | -         | 0.010 |
| down_proj.V      | 1.150        | -          | -         | 5.776 |
| down_proj.U      | 0.027        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6214.441     | -          | -         | 1.813 |
| v_proj.V         | 6255.796     | -          | -         | 1.082 |
| q_proj.V         | 6234.882     | -          | -         | 1.084 |
| k_proj.U         | 5.712        | -          | -         | 0.553 |
| v_proj.U         | 3.989        | -          | -         | 0.059 |
| q_proj.U         | 19.178       | -          | -         | 0.059 |
| o_proj.V         | 112.169      | -          | -         | 1.763 |
| o_proj.U         | 0.978        | -          | -         | 0.628 |
| up_proj.V        | 4150.667     | -          | -         | 1.788 |
| gate_proj.V      | 4155.480     | -          | -         | 1.086 |
| up_proj.U        | 21.596       | -          | -         | 0.605 |
| gate_proj.U      | 22.728       | -          | -         | 0.091 |
| down_proj.V      | 65.713       | -          | -         | 5.817 |
| down_proj.U      | 0.437        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.792 |
| v_proj.V         | 0.006        | -          | -         | 1.079 |
| q_proj.V         | 0.007        | -          | -         | 1.080 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.005        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.738 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.003        | -          | -         | 1.767 |
| gate_proj.V      | 0.003        | -          | -         | 1.081 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.774 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 150.564      | -          | -         | 1.793 |
| v_proj.V         | 148.408      | -          | -         | 1.079 |
| q_proj.V         | 153.594      | -          | -         | 1.079 |
| k_proj.U         | 1.091        | -          | -         | 0.485 |
| v_proj.U         | 0.404        | -          | -         | 0.010 |
| q_proj.U         | 2.438        | -          | -         | 0.010 |
| o_proj.V         | 6.327        | -          | -         | 1.739 |
| o_proj.U         | 0.239        | -          | -         | 0.557 |
| up_proj.V        | 71.337       | -          | -         | 1.765 |
| gate_proj.V      | 71.560       | -          | -         | 1.081 |
| up_proj.U        | 1.575        | -          | -         | 0.502 |
| gate_proj.U      | 1.969        | -          | -         | 0.010 |
| down_proj.V      | 2.677        | -          | -         | 5.784 |
| down_proj.U      | 0.040        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 7093.884     | -          | -         | 1.816 |
| v_proj.V         | 7150.815     | -          | -         | 1.084 |
| q_proj.V         | 7119.171     | -          | -         | 1.085 |
| k_proj.U         | 6.074        | -          | -         | 0.554 |
| v_proj.U         | 4.359        | -          | -         | 0.059 |
| q_proj.U         | 19.085       | -          | -         | 0.060 |
| o_proj.V         | 332.261      | -          | -         | 1.767 |
| o_proj.U         | 2.048        | -          | -         | 0.628 |
| up_proj.V        | 5199.036     | -          | -         | 1.795 |
| gate_proj.V      | 5174.145     | -          | -         | 1.090 |
| up_proj.U        | 25.168       | -          | -         | 0.607 |
| gate_proj.U      | 26.663       | -          | -         | 0.091 |
| down_proj.V      | 200.494      | -          | -         | 5.828 |
| down_proj.U      | 1.168        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.008        | -          | -         | 1.792 |
| v_proj.V         | 0.007        | -          | -         | 1.076 |
| q_proj.V         | 0.007        | -          | -         | 1.078 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.003        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.733 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.004        | -          | -         | 1.761 |
| gate_proj.V      | 0.004        | -          | -         | 1.074 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.775 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 168.696      | -          | -         | 1.791 |
| v_proj.V         | 169.674      | -          | -         | 1.076 |
| q_proj.V         | 171.151      | -          | -         | 1.080 |
| k_proj.U         | 1.162        | -          | -         | 0.486 |
| v_proj.U         | 0.338        | -          | -         | 0.010 |
| q_proj.U         | 2.285        | -          | -         | 0.010 |
| o_proj.V         | 8.549        | -          | -         | 1.740 |
| o_proj.U         | 0.190        | -          | -         | 0.557 |
| up_proj.V        | 98.532       | -          | -         | 1.762 |
| gate_proj.V      | 99.381       | -          | -         | 1.079 |
| up_proj.U        | 1.770        | -          | -         | 0.502 |
| gate_proj.U      | 2.229        | -          | -         | 0.010 |
| down_proj.V      | 3.344        | -          | -         | 5.783 |
| down_proj.U      | 0.049        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 8395.248     | -          | -         | 1.813 |
| v_proj.V         | 8429.178     | -          | -         | 1.086 |
| q_proj.V         | 8439.730     | -          | -         | 1.086 |
| k_proj.U         | 6.930        | -          | -         | 0.554 |
| v_proj.U         | 4.778        | -          | -         | 0.060 |
| q_proj.U         | 21.347       | -          | -         | 0.060 |
| o_proj.V         | 345.524      | -          | -         | 1.763 |
| o_proj.U         | 2.667        | -          | -         | 0.629 |
| up_proj.V        | 7342.420     | -          | -         | 1.790 |
| gate_proj.V      | 7316.942     | -          | -         | 1.084 |
| up_proj.U        | 34.029       | -          | -         | 0.606 |
| gate_proj.U      | 36.239       | -          | -         | 0.091 |
| down_proj.V      | 241.996      | -          | -         | 5.820 |
| down_proj.U      | 1.310        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.008        | -          | -         | 1.788 |
| v_proj.V         | 0.008        | -          | -         | 1.073 |
| q_proj.V         | 0.009        | -          | -         | 1.074 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.732 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.005        | -          | -         | 1.763 |
| gate_proj.V      | 0.005        | -          | -         | 1.077 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.757 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 198.163      | -          | -         | 1.793 |
| v_proj.V         | 198.051      | -          | -         | 1.080 |
| q_proj.V         | 201.720      | -          | -         | 1.080 |
| k_proj.U         | 1.034        | -          | -         | 0.485 |
| v_proj.U         | 0.327        | -          | -         | 0.010 |
| q_proj.U         | 2.944        | -          | -         | 0.010 |
| o_proj.V         | 4.492        | -          | -         | 1.740 |
| o_proj.U         | 0.193        | -          | -         | 0.558 |
| up_proj.V        | 123.127      | -          | -         | 1.767 |
| gate_proj.V      | 124.015      | -          | -         | 1.090 |
| up_proj.U        | 1.766        | -          | -         | 0.503 |
| gate_proj.U      | 2.192        | -          | -         | 0.010 |
| down_proj.V      | 4.540        | -          | -         | 5.782 |
| down_proj.U      | 0.050        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 9337.791     | -          | -         | 1.814 |
| v_proj.V         | 9277.824     | -          | -         | 1.085 |
| q_proj.V         | 9313.796     | -          | -         | 1.086 |
| k_proj.U         | 6.943        | -          | -         | 0.554 |
| v_proj.U         | 4.978        | -          | -         | 0.059 |
| q_proj.U         | 21.615       | -          | -         | 0.059 |
| o_proj.V         | 185.868      | -          | -         | 1.762 |
| o_proj.U         | 1.371        | -          | -         | 0.630 |
| up_proj.V        | 8946.121     | -          | -         | 1.792 |
| gate_proj.V      | 8967.591     | -          | -         | 1.085 |
| up_proj.U        | 32.729       | -          | -         | 0.606 |
| gate_proj.U      | 33.979       | -          | -         | 0.091 |
| down_proj.V      | 352.936      | -          | -         | 5.820 |
| down_proj.U      | 1.731        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.011        | -          | -         | 1.792 |
| v_proj.V         | 0.010        | -          | -         | 1.076 |
| q_proj.V         | 0.010        | -          | -         | 1.076 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.001        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.737 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.006        | -          | -         | 1.766 |
| gate_proj.V      | 0.006        | -          | -         | 1.080 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.760 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 231.459      | -          | -         | 1.794 |
| v_proj.V         | 236.621      | -          | -         | 1.082 |
| q_proj.V         | 236.017      | -          | -         | 1.081 |
| k_proj.U         | 1.189        | -          | -         | 0.485 |
| v_proj.U         | 0.388        | -          | -         | 0.010 |
| q_proj.U         | 2.701        | -          | -         | 0.010 |
| o_proj.V         | 7.316        | -          | -         | 1.742 |
| o_proj.U         | 0.134        | -          | -         | 0.556 |
| up_proj.V        | 152.261      | -          | -         | 1.767 |
| gate_proj.V      | 151.152      | -          | -         | 1.080 |
| up_proj.U        | 1.929        | -          | -         | 0.502 |
| gate_proj.U      | 2.409        | -          | -         | 0.010 |
| down_proj.V      | 5.515        | -          | -         | 5.780 |
| down_proj.U      | 0.056        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11015.771    | -          | -         | 1.814 |
| v_proj.V         | 11043.457    | -          | -         | 1.083 |
| q_proj.V         | 11110.656    | -          | -         | 1.086 |
| k_proj.U         | 7.394        | -          | -         | 0.553 |
| v_proj.U         | 5.345        | -          | -         | 0.059 |
| q_proj.U         | 23.033       | -          | -         | 0.059 |
| o_proj.V         | 316.130      | -          | -         | 1.758 |
| o_proj.U         | 1.938        | -          | -         | 0.628 |
| up_proj.V        | 11026.908    | -          | -         | 1.789 |
| gate_proj.V      | 11059.168    | -          | -         | 1.085 |
| up_proj.U        | 40.063       | -          | -         | 0.606 |
| gate_proj.U      | 42.009       | -          | -         | 0.092 |
| down_proj.V      | 467.565      | -          | -         | 5.804 |
| down_proj.U      | 1.981        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.013        | -          | -         | 1.790 |
| v_proj.V         | 0.012        | -          | -         | 1.074 |
| q_proj.V         | 0.013        | -          | -         | 1.076 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.734 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.008        | -          | -         | 1.762 |
| gate_proj.V      | 0.008        | -          | -         | 1.076 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.769 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 282.844      | -          | -         | 1.791 |
| v_proj.V         | 290.910      | -          | -         | 1.080 |
| q_proj.V         | 291.735      | -          | -         | 1.080 |
| k_proj.U         | 1.104        | -          | -         | 0.485 |
| v_proj.U         | 0.421        | -          | -         | 0.010 |
| q_proj.U         | 2.719        | -          | -         | 0.010 |
| o_proj.V         | 7.821        | -          | -         | 1.740 |
| o_proj.U         | 0.096        | -          | -         | 0.558 |
| up_proj.V        | 193.482      | -          | -         | 1.769 |
| gate_proj.V      | 192.573      | -          | -         | 1.080 |
| up_proj.U        | 2.023        | -          | -         | 0.503 |
| gate_proj.U      | 2.522        | -          | -         | 0.010 |
| down_proj.V      | 6.056        | -          | -         | 5.778 |
| down_proj.U      | 0.045        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 13430.434    | -          | -         | 1.815 |
| v_proj.V         | 13366.104    | -          | -         | 1.082 |
| q_proj.V         | 13486.981    | -          | -         | 1.084 |
| k_proj.U         | 8.300        | -          | -         | 0.555 |
| v_proj.U         | 6.332        | -          | -         | 0.060 |
| q_proj.U         | 25.614       | -          | -         | 0.059 |
| o_proj.V         | 295.780      | -          | -         | 1.762 |
| o_proj.U         | 1.233        | -          | -         | 0.629 |
| up_proj.V        | 13971.860    | -          | -         | 1.795 |
| gate_proj.V      | 13916.953    | -          | -         | 1.091 |
| up_proj.U        | 44.072       | -          | -         | 0.605 |
| gate_proj.U      | 45.795       | -          | -         | 0.091 |
| down_proj.V      | 428.836      | -          | -         | 5.821 |
| down_proj.U      | 1.558        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.014        | -          | -         | 1.793 |
| v_proj.V         | 0.013        | -          | -         | 1.077 |
| q_proj.V         | 0.015        | -          | -         | 1.080 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.736 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.010        | -          | -         | 1.761 |
| gate_proj.V      | 0.010        | -          | -         | 1.078 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.767 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 332.980      | -          | -         | 1.792 |
| v_proj.V         | 333.662      | -          | -         | 1.075 |
| q_proj.V         | 337.765      | -          | -         | 1.079 |
| k_proj.U         | 1.286        | -          | -         | 0.485 |
| v_proj.U         | 0.487        | -          | -         | 0.010 |
| q_proj.U         | 3.346        | -          | -         | 0.010 |
| o_proj.V         | 9.760        | -          | -         | 1.734 |
| o_proj.U         | 0.086        | -          | -         | 0.556 |
| up_proj.V        | 240.848      | -          | -         | 1.762 |
| gate_proj.V      | 239.892      | -          | -         | 1.078 |
| up_proj.U        | 2.139        | -          | -         | 0.502 |
| gate_proj.U      | 2.557        | -          | -         | 0.010 |
| down_proj.V      | 7.691        | -          | -         | 5.775 |
| down_proj.U      | 0.047        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 15297.251    | -          | -         | 1.815 |
| v_proj.V         | 15179.798    | -          | -         | 1.083 |
| q_proj.V         | 15304.439    | -          | -         | 1.082 |
| k_proj.U         | 8.573        | -          | -         | 0.554 |
| v_proj.U         | 6.562        | -          | -         | 0.059 |
| q_proj.U         | 27.905       | -          | -         | 0.059 |
| o_proj.V         | 329.273      | -          | -         | 1.761 |
| o_proj.U         | 1.173        | -          | -         | 0.628 |
| up_proj.V        | 16687.363    | -          | -         | 1.790 |
| gate_proj.V      | 16749.443    | -          | -         | 1.083 |
| up_proj.U        | 48.676       | -          | -         | 0.605 |
| gate_proj.U      | 50.185       | -          | -         | 0.091 |
| down_proj.V      | 529.911      | -          | -         | 5.813 |
| down_proj.U      | 1.683        | -          | -         | 0.664 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.017        | -          | -         | 1.790 |
| v_proj.V         | 0.015        | -          | -         | 1.076 |
| q_proj.V         | 0.017        | -          | -         | 1.080 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.732 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.012        | -          | -         | 1.763 |
| gate_proj.V      | 0.012        | -          | -         | 1.079 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.763 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 387.920      | -          | -         | 1.792 |
| v_proj.V         | 388.333      | -          | -         | 1.078 |
| q_proj.V         | 392.745      | -          | -         | 1.076 |
| k_proj.U         | 1.289        | -          | -         | 0.484 |
| v_proj.U         | 0.486        | -          | -         | 0.010 |
| q_proj.U         | 3.527        | -          | -         | 0.010 |
| o_proj.V         | 6.756        | -          | -         | 1.739 |
| o_proj.U         | 0.046        | -          | -         | 0.558 |
| up_proj.V        | 291.571      | -          | -         | 1.767 |
| gate_proj.V      | 288.239      | -          | -         | 1.081 |
| up_proj.U        | 2.289        | -          | -         | 0.502 |
| gate_proj.U      | 2.699        | -          | -         | 0.010 |
| down_proj.V      | 8.935        | -          | -         | 5.780 |
| down_proj.U      | 0.046        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 17170.670    | -          | -         | 1.817 |
| v_proj.V         | 17164.227    | -          | -         | 1.082 |
| q_proj.V         | 17297.449    | -          | -         | 1.085 |
| k_proj.U         | 9.756        | -          | -         | 0.554 |
| v_proj.U         | 7.347        | -          | -         | 0.059 |
| q_proj.U         | 30.884       | -          | -         | 0.059 |
| o_proj.V         | 212.072      | -          | -         | 1.765 |
| o_proj.U         | 0.661        | -          | -         | 0.628 |
| up_proj.V        | 19503.518    | -          | -         | 1.789 |
| gate_proj.V      | 19407.188    | -          | -         | 1.086 |
| up_proj.U        | 54.950       | -          | -         | 0.606 |
| gate_proj.U      | 56.598       | -          | -         | 0.092 |
| down_proj.V      | 554.623      | -          | -         | 5.811 |
| down_proj.U      | 1.610        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.020        | -          | -         | 1.792 |
| v_proj.V         | 0.017        | -          | -         | 1.078 |
| q_proj.V         | 0.018        | -          | -         | 1.081 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.000        | -          | -         | 1.739 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.014        | -          | -         | 1.764 |
| gate_proj.V      | 0.014        | -          | -         | 1.079 |
| up_proj.U        | 0.000        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.000        | -          | -         | 5.771 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 411.220      | -          | -         | 1.794 |
| v_proj.V         | 413.628      | -          | -         | 1.080 |
| q_proj.V         | 416.520      | -          | -         | 1.084 |
| k_proj.U         | 1.261        | -          | -         | 0.485 |
| v_proj.U         | 0.446        | -          | -         | 0.010 |
| q_proj.U         | 3.294        | -          | -         | 0.010 |
| o_proj.V         | 5.741        | -          | -         | 1.741 |
| o_proj.U         | 0.100        | -          | -         | 0.556 |
| up_proj.V        | 332.801      | -          | -         | 1.768 |
| gate_proj.V      | 331.781      | -          | -         | 1.082 |
| up_proj.U        | 2.604        | -          | -         | 0.501 |
| gate_proj.U      | 3.040        | -          | -         | 0.010 |
| down_proj.V      | 10.682       | -          | -         | 5.779 |
| down_proj.U      | 0.049        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 17835.426    | -          | -         | 1.814 |
| v_proj.V         | 17818.297    | -          | -         | 1.081 |
| q_proj.V         | 17851.504    | -          | -         | 1.087 |
| k_proj.U         | 9.528        | -          | -         | 0.554 |
| v_proj.U         | 7.240        | -          | -         | 0.059 |
| q_proj.U         | 30.738       | -          | -         | 0.059 |
| o_proj.V         | 224.598      | -          | -         | 1.765 |
| o_proj.U         | 1.268        | -          | -         | 0.629 |
| up_proj.V        | 22057.252    | -          | -         | 1.794 |
| gate_proj.V      | 21924.379    | -          | -         | 1.088 |
| up_proj.U        | 60.684       | -          | -         | 0.607 |
| gate_proj.U      | 62.190       | -          | -         | 0.092 |
| down_proj.V      | 652.030      | -          | -         | 5.830 |
| down_proj.U      | 1.710        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.021        | -          | -         | 1.792 |
| v_proj.V         | 0.019        | -          | -         | 1.078 |
| q_proj.V         | 0.022        | -          | -         | 1.074 |
| k_proj.U         | 0.000        | -          | -         | 0.472 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.003        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.736 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.016        | -          | -         | 1.766 |
| gate_proj.V      | 0.016        | -          | -         | 1.077 |
| up_proj.U        | 0.000        | -          | -         | 0.495 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 5.766 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 476.205      | -          | -         | 1.793 |
| v_proj.V         | 484.911      | -          | -         | 1.081 |
| q_proj.V         | 487.236      | -          | -         | 1.079 |
| k_proj.U         | 1.355        | -          | -         | 0.485 |
| v_proj.U         | 0.566        | -          | -         | 0.010 |
| q_proj.U         | 3.309        | -          | -         | 0.010 |
| o_proj.V         | 8.301        | -          | -         | 1.738 |
| o_proj.U         | 0.083        | -          | -         | 0.558 |
| up_proj.V        | 389.229      | -          | -         | 1.766 |
| gate_proj.V      | 388.435      | -          | -         | 1.081 |
| up_proj.U        | 2.736        | -          | -         | 0.503 |
| gate_proj.U      | 3.162        | -          | -         | 0.010 |
| down_proj.V      | 11.966       | -          | -         | 5.771 |
| down_proj.U      | 0.045        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 20771.422    | -          | -         | 1.814 |
| v_proj.V         | 20615.773    | -          | -         | 1.089 |
| q_proj.V         | 20776.723    | -          | -         | 1.086 |
| k_proj.U         | 11.671       | -          | -         | 0.554 |
| v_proj.U         | 8.698        | -          | -         | 0.060 |
| q_proj.U         | 34.350       | -          | -         | 0.061 |
| o_proj.V         | 309.657      | -          | -         | 1.762 |
| o_proj.U         | 1.317        | -          | -         | 0.628 |
| up_proj.V        | 25347.018    | -          | -         | 1.790 |
| gate_proj.V      | 25221.299    | -          | -         | 1.089 |
| up_proj.U        | 67.840       | -          | -         | 0.607 |
| gate_proj.U      | 68.889       | -          | -         | 0.092 |
| down_proj.V      | 727.170      | -          | -         | 5.813 |
| down_proj.U      | 1.734        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.024        | -          | -         | 1.791 |
| v_proj.V         | 0.021        | -          | -         | 1.077 |
| q_proj.V         | 0.024        | -          | -         | 1.077 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.003        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.739 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.018        | -          | -         | 1.765 |
| gate_proj.V      | 0.019        | -          | -         | 1.077 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 5.769 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 539.083      | -          | -         | 1.793 |
| v_proj.V         | 534.516      | -          | -         | 1.080 |
| q_proj.V         | 537.329      | -          | -         | 1.083 |
| k_proj.U         | 1.593        | -          | -         | 0.485 |
| v_proj.U         | 0.637        | -          | -         | 0.010 |
| q_proj.U         | 3.904        | -          | -         | 0.010 |
| o_proj.V         | 10.381       | -          | -         | 1.741 |
| o_proj.U         | 0.064        | -          | -         | 0.557 |
| up_proj.V        | 442.722      | -          | -         | 1.766 |
| gate_proj.V      | 446.875      | -          | -         | 1.081 |
| up_proj.U        | 2.989        | -          | -         | 0.501 |
| gate_proj.U      | 3.377        | -          | -         | 0.010 |
| down_proj.V      | 13.845       | -          | -         | 5.773 |
| down_proj.U      | 0.049        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 22886.143    | -          | -         | 1.813 |
| v_proj.V         | 22773.635    | -          | -         | 1.088 |
| q_proj.V         | 22901.309    | -          | -         | 1.084 |
| k_proj.U         | 10.648       | -          | -         | 0.553 |
| v_proj.U         | 9.334        | -          | -         | 0.059 |
| q_proj.U         | 35.423       | -          | -         | 0.059 |
| o_proj.V         | 377.947      | -          | -         | 1.761 |
| o_proj.U         | 1.259        | -          | -         | 0.629 |
| up_proj.V        | 28592.217    | -          | -         | 1.795 |
| gate_proj.V      | 28688.227    | -          | -         | 1.088 |
| up_proj.U        | 73.142       | -          | -         | 0.607 |
| gate_proj.U      | 74.261       | -          | -         | 0.092 |
| down_proj.V      | 817.061      | -          | -         | 5.834 |
| down_proj.U      | 1.863        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.026        | -          | -         | 1.791 |
| v_proj.V         | 0.022        | -          | -         | 1.077 |
| q_proj.V         | 0.026        | -          | -         | 1.080 |
| k_proj.U         | 0.000        | -          | -         | 0.470 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.003        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.736 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.021        | -          | -         | 1.763 |
| gate_proj.V      | 0.022        | -          | -         | 1.073 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.000        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 5.749 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 577.499      | -          | -         | 1.791 |
| v_proj.V         | 582.432      | -          | -         | 1.078 |
| q_proj.V         | 587.236      | -          | -         | 1.080 |
| k_proj.U         | 1.458        | -          | -         | 0.485 |
| v_proj.U         | 0.672        | -          | -         | 0.010 |
| q_proj.U         | 3.785        | -          | -         | 0.010 |
| o_proj.V         | 9.149        | -          | -         | 1.736 |
| o_proj.U         | 0.137        | -          | -         | 0.557 |
| up_proj.V        | 506.924      | -          | -         | 1.765 |
| gate_proj.V      | 511.425      | -          | -         | 1.080 |
| up_proj.U        | 3.249        | -          | -         | 0.502 |
| gate_proj.U      | 3.588        | -          | -         | 0.010 |
| down_proj.V      | 16.044       | -          | -         | 5.776 |
| down_proj.U      | 0.055        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 24357.732    | -          | -         | 1.815 |
| v_proj.V         | 24465.121    | -          | -         | 1.082 |
| q_proj.V         | 24473.281    | -          | -         | 1.090 |
| k_proj.U         | 11.148       | -          | -         | 0.554 |
| v_proj.U         | 9.344        | -          | -         | 0.059 |
| q_proj.U         | 36.478       | -          | -         | 0.059 |
| o_proj.V         | 251.724      | -          | -         | 1.762 |
| o_proj.U         | 0.959        | -          | -         | 0.629 |
| up_proj.V        | 31864.480    | -          | -         | 1.793 |
| gate_proj.V      | 31809.619    | -          | -         | 1.087 |
| up_proj.U        | 77.778       | -          | -         | 0.607 |
| gate_proj.U      | 78.460       | -          | -         | 0.091 |
| down_proj.V      | 946.142      | -          | -         | 5.815 |
| down_proj.U      | 2.090        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.030        | -          | -         | 1.790 |
| v_proj.V         | 0.026        | -          | -         | 1.078 |
| q_proj.V         | 0.028        | -          | -         | 1.080 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.732 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.023        | -          | -         | 1.763 |
| gate_proj.V      | 0.024        | -          | -         | 1.078 |
| up_proj.U        | 0.002        | -          | -         | 0.494 |
| gate_proj.U      | 0.002        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 5.767 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 628.341      | -          | -         | 1.792 |
| v_proj.V         | 624.226      | -          | -         | 1.077 |
| q_proj.V         | 630.803      | -          | -         | 1.079 |
| k_proj.U         | 1.717        | -          | -         | 0.484 |
| v_proj.U         | 0.703        | -          | -         | 0.010 |
| q_proj.U         | 4.554        | -          | -         | 0.010 |
| o_proj.V         | 12.352       | -          | -         | 1.738 |
| o_proj.U         | 0.161        | -          | -         | 0.557 |
| up_proj.V        | 557.063      | -          | -         | 1.766 |
| gate_proj.V      | 560.249      | -          | -         | 1.080 |
| up_proj.U        | 3.414        | -          | -         | 0.501 |
| gate_proj.U      | 3.759        | -          | -         | 0.010 |
| down_proj.V      | 18.346       | -          | -         | 5.771 |
| down_proj.U      | 0.064        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 25765.439    | -          | -         | 1.814 |
| v_proj.V         | 25873.303    | -          | -         | 1.082 |
| q_proj.V         | 26081.611    | -          | -         | 1.086 |
| k_proj.U         | 10.756       | -          | -         | 0.554 |
| v_proj.U         | 9.684        | -          | -         | 0.060 |
| q_proj.U         | 36.052       | -          | -         | 0.059 |
| o_proj.V         | 423.395      | -          | -         | 1.764 |
| o_proj.U         | 1.890        | -          | -         | 0.629 |
| up_proj.V        | 34502.910    | -          | -         | 1.789 |
| gate_proj.V      | 34376.254    | -          | -         | 1.086 |
| up_proj.U        | 82.431       | -          | -         | 0.606 |
| gate_proj.U      | 82.907       | -          | -         | 0.091 |
| down_proj.V      | 1067.048     | -          | -         | 5.826 |
| down_proj.U      | 2.357        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.028        | -          | -         | 1.793 |
| v_proj.V         | 0.025        | -          | -         | 1.079 |
| q_proj.V         | 0.027        | -          | -         | 1.080 |
| k_proj.U         | 0.001        | -          | -         | 0.470 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.001        | -          | -         | 1.736 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.026        | -          | -         | 1.762 |
| gate_proj.V      | 0.027        | -          | -         | 1.078 |
| up_proj.U        | 0.002        | -          | -         | 0.495 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 5.776 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 589.957      | -          | -         | 1.788 |
| v_proj.V         | 592.643      | -          | -         | 1.079 |
| q_proj.V         | 593.363      | -          | -         | 1.079 |
| k_proj.U         | 1.638        | -          | -         | 0.484 |
| v_proj.U         | 0.771        | -          | -         | 0.010 |
| q_proj.U         | 4.461        | -          | -         | 0.010 |
| o_proj.V         | 13.693       | -          | -         | 1.737 |
| o_proj.U         | 0.155        | -          | -         | 0.556 |
| up_proj.V        | 615.788      | -          | -         | 1.766 |
| gate_proj.V      | 624.827      | -          | -         | 1.079 |
| up_proj.U        | 3.829        | -          | -         | 0.501 |
| gate_proj.U      | 4.159        | -          | -         | 0.010 |
| down_proj.V      | 22.564       | -          | -         | 5.775 |
| down_proj.U      | 0.087        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 24202.785    | -          | -         | 1.818 |
| v_proj.V         | 24010.957    | -          | -         | 1.085 |
| q_proj.V         | 24199.049    | -          | -         | 1.088 |
| k_proj.U         | 10.115       | -          | -         | 0.555 |
| v_proj.U         | 8.412        | -          | -         | 0.059 |
| q_proj.U         | 36.916       | -          | -         | 0.059 |
| o_proj.V         | 438.362      | -          | -         | 1.766 |
| o_proj.U         | 1.727        | -          | -         | 0.628 |
| up_proj.V        | 38003.059    | -          | -         | 1.792 |
| gate_proj.V      | 37876.930    | -          | -         | 1.089 |
| up_proj.U        | 87.474       | -          | -         | 0.606 |
| gate_proj.U      | 88.295       | -          | -         | 0.091 |
| down_proj.V      | 1281.472     | -          | -         | 5.826 |
| down_proj.U      | 2.813        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.029        | -          | -         | 1.785 |
| v_proj.V         | 0.029        | -          | -         | 1.074 |
| q_proj.V         | 0.029        | -          | -         | 1.077 |
| k_proj.U         | 0.001        | -          | -         | 0.471 |
| v_proj.U         | 0.001        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.002        | -          | -         | 1.734 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.028        | -          | -         | 1.765 |
| gate_proj.V      | 0.029        | -          | -         | 1.076 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.001        | -          | -         | 0.002 |
| down_proj.V      | 0.001        | -          | -         | 5.773 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 668.563      | -          | -         | 1.791 |
| v_proj.V         | 677.410      | -          | -         | 1.079 |
| q_proj.V         | 674.241      | -          | -         | 1.082 |
| k_proj.U         | 1.718        | -          | -         | 0.485 |
| v_proj.U         | 0.793        | -          | -         | 0.010 |
| q_proj.U         | 5.270        | -          | -         | 0.010 |
| o_proj.V         | 18.505       | -          | -         | 1.742 |
| o_proj.U         | 0.281        | -          | -         | 0.557 |
| up_proj.V        | 665.706      | -          | -         | 1.762 |
| gate_proj.V      | 675.832      | -          | -         | 1.079 |
| up_proj.U        | 4.415        | -          | -         | 0.502 |
| gate_proj.U      | 4.796        | -          | -         | 0.010 |
| down_proj.V      | 26.667       | -          | -         | 5.780 |
| down_proj.U      | 0.142        | -          | -         | 0.561 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 27430.609    | -          | -         | 1.817 |
| v_proj.V         | 27473.205    | -          | -         | 1.086 |
| q_proj.V         | 27593.074    | -          | -         | 1.086 |
| k_proj.U         | 11.369       | -          | -         | 0.553 |
| v_proj.U         | 9.640        | -          | -         | 0.059 |
| q_proj.U         | 37.953       | -          | -         | 0.059 |
| o_proj.V         | 605.373      | -          | -         | 1.764 |
| o_proj.U         | 2.808        | -          | -         | 0.629 |
| up_proj.V        | 40617.133    | -          | -         | 1.791 |
| gate_proj.V      | 40605.836    | -          | -         | 1.084 |
| up_proj.U        | 90.914       | -          | -         | 0.606 |
| gate_proj.U      | 91.135       | -          | -         | 0.091 |
| down_proj.V      | 1488.104     | -          | -         | 5.821 |
| down_proj.U      | 3.395        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.027        | -          | -         | 1.787 |
| v_proj.V         | 0.024        | -          | -         | 1.072 |
| q_proj.V         | 0.028        | -          | -         | 1.075 |
| k_proj.U         | 0.000        | -          | -         | 0.471 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.002        | -          | -         | 1.734 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.029        | -          | -         | 1.763 |
| gate_proj.V      | 0.030        | -          | -         | 1.074 |
| up_proj.U        | 0.001        | -          | -         | 0.494 |
| gate_proj.U      | 0.007        | -          | -         | 0.002 |
| down_proj.V      | 0.002        | -          | -         | 5.755 |
| down_proj.U      | 0.000        | -          | -         | 0.553 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 630.593      | -          | -         | 1.791 |
| v_proj.V         | 650.666      | -          | -         | 1.077 |
| q_proj.V         | 640.465      | -          | -         | 1.079 |
| k_proj.U         | 1.570        | -          | -         | 0.484 |
| v_proj.U         | 0.597        | -          | -         | 0.010 |
| q_proj.U         | 4.168        | -          | -         | 0.010 |
| o_proj.V         | 28.210       | -          | -         | 1.735 |
| o_proj.U         | 0.304        | -          | -         | 0.557 |
| up_proj.V        | 688.857      | -          | -         | 1.766 |
| gate_proj.V      | 695.810      | -          | -         | 1.077 |
| up_proj.U        | 5.022        | -          | -         | 0.502 |
| gate_proj.U      | 5.462        | -          | -         | 0.010 |
| down_proj.V      | 35.445       | -          | -         | 5.772 |
| down_proj.U      | 0.234        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 24956.957    | -          | -         | 1.811 |
| v_proj.V         | 24863.277    | -          | -         | 1.083 |
| q_proj.V         | 25116.234    | -          | -         | 1.084 |
| k_proj.U         | 9.772        | -          | -         | 0.554 |
| v_proj.U         | 8.480        | -          | -         | 0.059 |
| q_proj.U         | 34.375       | -          | -         | 0.059 |
| o_proj.V         | 1144.989     | -          | -         | 1.759 |
| o_proj.U         | 5.062        | -          | -         | 0.629 |
| up_proj.V        | 41044.969    | -          | -         | 1.794 |
| gate_proj.V      | 40995.898    | -          | -         | 1.087 |
| up_proj.U        | 96.434       | -          | -         | 0.606 |
| gate_proj.U      | 97.090       | -          | -         | 0.091 |
| down_proj.V      | 1920.359     | -          | -         | 5.824 |
| down_proj.U      | 4.466        | -          | -         | 0.666 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.027        | -          | -         | 1.789 |
| v_proj.V         | 0.024        | -          | -         | 1.075 |
| q_proj.V         | 0.026        | -          | -         | 1.077 |
| k_proj.U         | 0.001        | -          | -         | 0.471 |
| v_proj.U         | 0.001        | -          | -         | 0.002 |
| q_proj.U         | 0.004        | -          | -         | 0.002 |
| o_proj.V         | 0.002        | -          | -         | 1.734 |
| o_proj.U         | 0.000        | -          | -         | 0.548 |
| up_proj.V        | 0.027        | -          | -         | 1.764 |
| gate_proj.V      | 0.027        | -          | -         | 1.075 |
| up_proj.U        | 0.002        | -          | -         | 0.494 |
| gate_proj.U      | 0.009        | -          | -         | 0.002 |
| down_proj.V      | 0.002        | -          | -         | 5.760 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 619.335      | -          | -         | 1.791 |
| v_proj.V         | 645.712      | -          | -         | 1.076 |
| q_proj.V         | 639.272      | -          | -         | 1.077 |
| k_proj.U         | 1.923        | -          | -         | 0.484 |
| v_proj.U         | 0.804        | -          | -         | 0.010 |
| q_proj.U         | 5.892        | -          | -         | 0.011 |
| o_proj.V         | 36.223       | -          | -         | 1.737 |
| o_proj.U         | 0.637        | -          | -         | 0.556 |
| up_proj.V        | 645.597      | -          | -         | 1.768 |
| gate_proj.V      | 648.386      | -          | -         | 1.078 |
| up_proj.U        | 5.956        | -          | -         | 0.502 |
| gate_proj.U      | 6.604        | -          | -         | 0.010 |
| down_proj.V      | 42.547       | -          | -         | 5.776 |
| down_proj.U      | 0.414        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 25037.959    | -          | -         | 1.811 |
| v_proj.V         | 25286.014    | -          | -         | 1.080 |
| q_proj.V         | 25132.473    | -          | -         | 1.082 |
| k_proj.U         | 10.063       | -          | -         | 0.553 |
| v_proj.U         | 8.009        | -          | -         | 0.059 |
| q_proj.U         | 36.509       | -          | -         | 0.059 |
| o_proj.V         | 1502.182     | -          | -         | 1.757 |
| o_proj.U         | 8.975        | -          | -         | 0.627 |
| up_proj.V        | 37998.180    | -          | -         | 1.789 |
| gate_proj.V      | 38141.031    | -          | -         | 1.084 |
| up_proj.U        | 100.110      | -          | -         | 0.606 |
| gate_proj.U      | 104.029      | -          | -         | 0.091 |
| down_proj.V      | 2251.486     | -          | -         | 5.803 |
| down_proj.U      | 5.610        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.022        | -          | -         | 1.794 |
| v_proj.V         | 0.021        | -          | -         | 1.076 |
| q_proj.V         | 0.022        | -          | -         | 1.074 |
| k_proj.U         | 0.001        | -          | -         | 0.470 |
| v_proj.U         | 0.000        | -          | -         | 0.002 |
| q_proj.U         | 0.002        | -          | -         | 0.002 |
| o_proj.V         | 0.004        | -          | -         | 1.737 |
| o_proj.U         | 0.001        | -          | -         | 0.547 |
| up_proj.V        | 0.024        | -          | -         | 1.764 |
| gate_proj.V      | 0.024        | -          | -         | 1.075 |
| up_proj.U        | 0.001        | -          | -         | 0.493 |
| gate_proj.U      | 0.007        | -          | -         | 0.002 |
| down_proj.V      | 0.004        | -          | -         | 5.769 |
| down_proj.U      | 0.001        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 511.776      | -          | -         | 1.801 |
| v_proj.V         | 535.086      | -          | -         | 1.079 |
| q_proj.V         | 516.255      | -          | -         | 1.092 |
| k_proj.U         | 1.781        | -          | -         | 0.485 |
| v_proj.U         | 0.777        | -          | -         | 0.010 |
| q_proj.U         | 4.881        | -          | -         | 0.010 |
| o_proj.V         | 66.417       | -          | -         | 1.740 |
| o_proj.U         | 1.566        | -          | -         | 0.558 |
| up_proj.V        | 581.297      | -          | -         | 1.766 |
| gate_proj.V      | 584.447      | -          | -         | 1.077 |
| up_proj.U        | 7.149        | -          | -         | 0.502 |
| gate_proj.U      | 8.066        | -          | -         | 0.010 |
| down_proj.V      | 69.522       | -          | -         | 5.773 |
| down_proj.U      | 1.145        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 19694.449    | -          | -         | 1.813 |
| v_proj.V         | 19833.656    | -          | -         | 1.080 |
| q_proj.V         | 19834.359    | -          | -         | 1.081 |
| k_proj.U         | 9.296        | -          | -         | 0.553 |
| v_proj.U         | 7.537        | -          | -         | 0.059 |
| q_proj.U         | 33.247       | -          | -         | 0.059 |
| o_proj.V         | 3158.192     | -          | -         | 1.762 |
| o_proj.U         | 18.480       | -          | -         | 0.627 |
| up_proj.V        | 32271.688    | -          | -         | 1.791 |
| gate_proj.V      | 32187.738    | -          | -         | 1.086 |
| up_proj.U        | 85.335       | -          | -         | 0.605 |
| gate_proj.U      | 89.495       | -          | -         | 0.091 |
| down_proj.V      | 2928.192     | -          | -         | 5.811 |
| down_proj.U      | 8.901        | -          | -         | 0.665 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.012        | -          | -         | 1.791 |
| v_proj.V         | 0.011        | -          | -         | 1.073 |
| q_proj.V         | 0.012        | -          | -         | 1.075 |
| k_proj.U         | 0.003        | -          | -         | 0.471 |
| v_proj.U         | 0.001        | -          | -         | 0.002 |
| q_proj.U         | 0.012        | -          | -         | 0.002 |
| o_proj.V         | 0.004        | -          | -         | 1.735 |
| o_proj.U         | 0.001        | -          | -         | 0.548 |
| up_proj.V        | 0.014        | -          | -         | 1.763 |
| gate_proj.V      | 0.014        | -          | -         | 1.082 |
| up_proj.U        | 0.002        | -          | -         | 0.494 |
| gate_proj.U      | 0.003        | -          | -         | 0.002 |
| down_proj.V      | 0.009        | -          | -         | 5.770 |
| down_proj.U      | 0.000        | -          | -         | 0.554 |
+------------------+--------------+------------+-----------+-------+


Quantizing 3bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 304.581      | -          | -         | 1.787 |
| v_proj.V         | 306.729      | -          | -         | 1.077 |
| q_proj.V         | 303.173      | -          | -         | 1.081 |
| k_proj.U         | 1.277        | -          | -         | 0.485 |
| v_proj.U         | 0.498        | -          | -         | 0.010 |
| q_proj.U         | 3.789        | -          | -         | 0.010 |
| o_proj.V         | 14.392       | -          | -         | 1.736 |
| o_proj.U         | 0.834        | -          | -         | 0.557 |
| up_proj.V        | 366.094      | -          | -         | 1.761 |
| gate_proj.V      | 365.233      | -          | -         | 1.077 |
| up_proj.U        | 5.196        | -          | -         | 0.502 |
| gate_proj.U      | 5.920        | -          | -         | 0.010 |
| down_proj.V      | 82.069       | -          | -         | 5.781 |
| down_proj.U      | 1.954        | -          | -         | 0.560 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 12559.769    | -          | -         | 1.817 |
| v_proj.V         | 12791.178    | -          | -         | 1.086 |
| q_proj.V         | 12749.983    | -          | -         | 1.085 |
| k_proj.U         | 5.331        | -          | -         | 0.553 |
| v_proj.U         | 4.873        | -          | -         | 0.059 |
| q_proj.U         | 20.986       | -          | -         | 0.059 |
| o_proj.V         | 499.126      | -          | -         | 1.761 |
| o_proj.U         | 5.680        | -          | -         | 0.627 |
| up_proj.V        | 22547.500    | -          | -         | 1.792 |
| gate_proj.V      | 22587.646    | -          | -         | 1.087 |
| up_proj.U        | 69.882       | -          | -         | 0.605 |
| gate_proj.U      | 75.339       | -          | -         | 0.091 |
| down_proj.V      | 3324.444     | -          | -         | 5.819 |
| down_proj.U      | 9.671        | -          | -         | 0.663 |
+------------------+--------------+------------+-----------+-------+


8386.60077548027
