factor 8.0
Unrecognized keys in `rope_scaling` for 'rope_type'='linear': {'type'}
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 14.26it/s]
Token indices sequence length is longer than the specified maximum sequence length for this model (3259 > 2048). Running this sequence through the model will result in indexing errors
Starting ...
Ready.
Quantizing 8bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.498 |
| v_proj.V         | 0.005        | -          | -         | 1.115 |
| q_proj.V         | 0.007        | -          | -         | 1.125 |
| k_proj.U         | 0.000        | -          | -         | 0.154 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.330 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.005        | -          | -         | 1.355 |
| gate_proj.V      | 0.005        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.170 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.937 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6188.034     | -          | -         | 1.431 |
| v_proj.V         | 3666.121     | -          | -         | 1.121 |
| q_proj.V         | 7127.865     | -          | -         | 1.121 |
| k_proj.U         | 0.021        | -          | -         | 0.479 |
| v_proj.U         | 0.021        | -          | -         | 0.265 |
| q_proj.U         | 0.107        | -          | -         | 0.266 |
| o_proj.V         | 24.969       | -          | -         | 1.399 |
| o_proj.U         | 0.002        | -          | -         | 0.505 |
| up_proj.V        | 3073.927     | -          | -         | 1.424 |
| gate_proj.V      | 3072.510     | -          | -         | 1.121 |
| up_proj.U        | 0.523        | -          | -         | 0.618 |
| gate_proj.U      | 0.516        | -          | -         | 0.385 |
| down_proj.V      | 18.845       | -          | -         | 5.046 |
| down_proj.U      | 0.004        | -          | -         | 0.637 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.377 |
| v_proj.V         | 0.004        | -          | -         | 1.118 |
| q_proj.V         | 0.004        | -          | -         | 1.117 |
| k_proj.U         | 0.000        | -          | -         | 0.154 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.187 |
| up_proj.V        | 0.007        | -          | -         | 1.365 |
| gate_proj.V      | 0.007        | -          | -         | 1.118 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.965 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4272.097     | -          | -         | 1.449 |
| v_proj.V         | 2882.887     | -          | -         | 1.118 |
| q_proj.V         | 3778.786     | -          | -         | 1.138 |
| k_proj.U         | 0.031        | -          | -         | 0.474 |
| v_proj.U         | 0.044        | -          | -         | 0.267 |
| q_proj.U         | 0.165        | -          | -         | 0.268 |
| o_proj.V         | 24.556       | -          | -         | 1.399 |
| o_proj.U         | 0.003        | -          | -         | 0.508 |
| up_proj.V        | 5474.607     | -          | -         | 1.436 |
| gate_proj.V      | 5468.912     | -          | -         | 1.119 |
| up_proj.U        | 0.907        | -          | -         | 0.623 |
| gate_proj.U      | 0.901        | -          | -         | 0.386 |
| down_proj.V      | 27.739       | -          | -         | 5.071 |
| down_proj.U      | 0.003        | -          | -         | 0.644 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.031        | -          | -         | 1.389 |
| v_proj.V         | 0.031        | -          | -         | 1.112 |
| q_proj.V         | 0.033        | -          | -         | 1.115 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.008        | -          | -         | 1.365 |
| gate_proj.V      | 0.008        | -          | -         | 1.116 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.974 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11737.740    | -          | -         | 1.432 |
| v_proj.V         | 11067.079    | -          | -         | 1.115 |
| q_proj.V         | 11292.494    | -          | -         | 1.114 |
| k_proj.U         | 0.172        | -          | -         | 0.474 |
| v_proj.U         | 0.161        | -          | -         | 0.267 |
| q_proj.U         | 0.695        | -          | -         | 0.268 |
| o_proj.V         | 205.621      | -          | -         | 1.403 |
| o_proj.U         | 0.024        | -          | -         | 0.509 |
| up_proj.V        | 3468.541     | -          | -         | 1.436 |
| gate_proj.V      | 3465.073     | -          | -         | 1.128 |
| up_proj.U        | 0.548        | -          | -         | 0.621 |
| gate_proj.U      | 0.563        | -          | -         | 0.386 |
| down_proj.V      | 13.204       | -          | -         | 5.060 |
| down_proj.U      | 0.003        | -          | -         | 0.638 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.016        | -          | -         | 1.375 |
| v_proj.V         | 0.015        | -          | -         | 1.108 |
| q_proj.V         | 0.016        | -          | -         | 1.117 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.340 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.005        | -          | -         | 1.366 |
| gate_proj.V      | 0.005        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.974 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6349.066     | -          | -         | 1.438 |
| v_proj.V         | 6376.286     | -          | -         | 1.131 |
| q_proj.V         | 6336.108     | -          | -         | 1.124 |
| k_proj.U         | 0.101        | -          | -         | 0.475 |
| v_proj.U         | 0.099        | -          | -         | 0.266 |
| q_proj.U         | 0.396        | -          | -         | 0.268 |
| o_proj.V         | 247.564      | -          | -         | 1.402 |
| o_proj.U         | 0.037        | -          | -         | 0.509 |
| up_proj.V        | 2752.564     | -          | -         | 1.430 |
| gate_proj.V      | 2742.518     | -          | -         | 1.120 |
| up_proj.U        | 0.442        | -          | -         | 0.624 |
| gate_proj.U      | 0.444        | -          | -         | 0.389 |
| down_proj.V      | 13.266       | -          | -         | 5.039 |
| down_proj.U      | 0.004        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.008        | -          | -         | 1.377 |
| v_proj.V         | 0.007        | -          | -         | 1.110 |
| q_proj.V         | 0.007        | -          | -         | 1.119 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.336 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.004        | -          | -         | 1.361 |
| gate_proj.V      | 0.004        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.961 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3553.645     | -          | -         | 1.456 |
| v_proj.V         | 3552.753     | -          | -         | 1.121 |
| q_proj.V         | 3538.002     | -          | -         | 1.138 |
| k_proj.U         | 0.057        | -          | -         | 0.477 |
| v_proj.U         | 0.054        | -          | -         | 0.269 |
| q_proj.U         | 0.223        | -          | -         | 0.269 |
| o_proj.V         | 245.630      | -          | -         | 1.398 |
| o_proj.U         | 0.032        | -          | -         | 0.510 |
| up_proj.V        | 2324.598     | -          | -         | 1.425 |
| gate_proj.V      | 2312.335     | -          | -         | 1.113 |
| up_proj.U        | 0.380        | -          | -         | 0.622 |
| gate_proj.U      | 0.387        | -          | -         | 0.387 |
| down_proj.V      | 16.813       | -          | -         | 5.065 |
| down_proj.U      | 0.005        | -          | -         | 0.637 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.008        | -          | -         | 1.380 |
| v_proj.V         | 0.007        | -          | -         | 1.117 |
| q_proj.V         | 0.008        | -          | -         | 1.119 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.003        | -          | -         | 1.364 |
| gate_proj.V      | 0.003        | -          | -         | 1.126 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.981 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3747.695     | -          | -         | 1.437 |
| v_proj.V         | 3770.092     | -          | -         | 1.121 |
| q_proj.V         | 3754.990     | -          | -         | 1.123 |
| k_proj.U         | 0.074        | -          | -         | 0.476 |
| v_proj.U         | 0.071        | -          | -         | 0.269 |
| q_proj.U         | 0.269        | -          | -         | 0.268 |
| o_proj.V         | 227.845      | -          | -         | 1.401 |
| o_proj.U         | 0.038        | -          | -         | 0.510 |
| up_proj.V        | 2211.120     | -          | -         | 1.440 |
| gate_proj.V      | 2199.591     | -          | -         | 1.128 |
| up_proj.U        | 0.363        | -          | -         | 0.623 |
| gate_proj.U      | 0.364        | -          | -         | 0.384 |
| down_proj.V      | 19.256       | -          | -         | 5.043 |
| down_proj.U      | 0.006        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.380 |
| v_proj.V         | 0.006        | -          | -         | 1.112 |
| q_proj.V         | 0.006        | -          | -         | 1.118 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.337 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.003        | -          | -         | 1.363 |
| gate_proj.V      | 0.003        | -          | -         | 1.113 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.960 |
| down_proj.U      | 0.000        | -          | -         | 0.189 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2936.017     | -          | -         | 1.436 |
| v_proj.V         | 2976.716     | -          | -         | 1.116 |
| q_proj.V         | 2942.328     | -          | -         | 1.123 |
| k_proj.U         | 0.059        | -          | -         | 0.475 |
| v_proj.U         | 0.053        | -          | -         | 0.268 |
| q_proj.U         | 0.224        | -          | -         | 0.269 |
| o_proj.V         | 257.646      | -          | -         | 1.404 |
| o_proj.U         | 0.050        | -          | -         | 0.509 |
| up_proj.V        | 2065.410     | -          | -         | 1.430 |
| gate_proj.V      | 2055.071     | -          | -         | 1.123 |
| up_proj.U        | 0.354        | -          | -         | 0.623 |
| gate_proj.U      | 0.366        | -          | -         | 0.390 |
| down_proj.V      | 20.746       | -          | -         | 5.059 |
| down_proj.U      | 0.007        | -          | -         | 0.638 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.376 |
| v_proj.V         | 0.005        | -          | -         | 1.116 |
| q_proj.V         | 0.005        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.342 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.003        | -          | -         | 1.363 |
| gate_proj.V      | 0.003        | -          | -         | 1.116 |
| up_proj.U        | 0.000        | -          | -         | 0.172 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.956 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2714.506     | -          | -         | 1.437 |
| v_proj.V         | 2757.660     | -          | -         | 1.120 |
| q_proj.V         | 2723.721     | -          | -         | 1.124 |
| k_proj.U         | 0.056        | -          | -         | 0.474 |
| v_proj.U         | 0.046        | -          | -         | 0.280 |
| q_proj.U         | 0.203        | -          | -         | 0.269 |
| o_proj.V         | 248.367      | -          | -         | 1.400 |
| o_proj.U         | 0.048        | -          | -         | 0.509 |
| up_proj.V        | 1847.768     | -          | -         | 1.430 |
| gate_proj.V      | 1832.984     | -          | -         | 1.125 |
| up_proj.U        | 0.321        | -          | -         | 0.622 |
| gate_proj.U      | 0.336        | -          | -         | 0.388 |
| down_proj.V      | 20.388       | -          | -         | 5.063 |
| down_proj.U      | 0.007        | -          | -         | 0.649 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.378 |
| v_proj.V         | 0.006        | -          | -         | 1.117 |
| q_proj.V         | 0.006        | -          | -         | 1.118 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.336 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.002        | -          | -         | 1.360 |
| gate_proj.V      | 0.002        | -          | -         | 1.128 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.953 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2906.950     | -          | -         | 1.434 |
| v_proj.V         | 2930.408     | -          | -         | 1.116 |
| q_proj.V         | 2915.271     | -          | -         | 1.119 |
| k_proj.U         | 0.068        | -          | -         | 0.475 |
| v_proj.U         | 0.055        | -          | -         | 0.267 |
| q_proj.U         | 0.244        | -          | -         | 0.267 |
| o_proj.V         | 261.858      | -          | -         | 1.397 |
| o_proj.U         | 0.048        | -          | -         | 0.510 |
| up_proj.V        | 1746.337     | -          | -         | 1.429 |
| gate_proj.V      | 1742.017     | -          | -         | 1.122 |
| up_proj.U        | 0.292        | -          | -         | 0.621 |
| gate_proj.U      | 0.315        | -          | -         | 0.386 |
| down_proj.V      | 17.449       | -          | -         | 5.058 |
| down_proj.U      | 0.007        | -          | -         | 0.637 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.379 |
| v_proj.V         | 0.005        | -          | -         | 1.125 |
| q_proj.V         | 0.005        | -          | -         | 1.115 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.002        | -          | -         | 1.365 |
| gate_proj.V      | 0.002        | -          | -         | 1.117 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.946 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2741.345     | -          | -         | 1.442 |
| v_proj.V         | 2765.404     | -          | -         | 1.123 |
| q_proj.V         | 2756.744     | -          | -         | 1.121 |
| k_proj.U         | 0.064        | -          | -         | 0.477 |
| v_proj.U         | 0.048        | -          | -         | 0.269 |
| q_proj.U         | 0.216        | -          | -         | 0.279 |
| o_proj.V         | 278.056      | -          | -         | 1.400 |
| o_proj.U         | 0.056        | -          | -         | 0.509 |
| up_proj.V        | 1617.496     | -          | -         | 1.427 |
| gate_proj.V      | 1610.901     | -          | -         | 1.121 |
| up_proj.U        | 0.269        | -          | -         | 0.622 |
| gate_proj.U      | 0.286        | -          | -         | 0.386 |
| down_proj.V      | 18.035       | -          | -         | 5.055 |
| down_proj.U      | 0.006        | -          | -         | 0.639 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.378 |
| v_proj.V         | 0.006        | -          | -         | 1.112 |
| q_proj.V         | 0.006        | -          | -         | 1.112 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.002        | -          | -         | 1.360 |
| gate_proj.V      | 0.002        | -          | -         | 1.115 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.951 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3187.737     | -          | -         | 1.439 |
| v_proj.V         | 3212.862     | -          | -         | 1.119 |
| q_proj.V         | 3192.837     | -          | -         | 1.126 |
| k_proj.U         | 0.076        | -          | -         | 0.475 |
| v_proj.U         | 0.054        | -          | -         | 0.267 |
| q_proj.U         | 0.262        | -          | -         | 0.269 |
| o_proj.V         | 286.756      | -          | -         | 1.403 |
| o_proj.U         | 0.050        | -          | -         | 0.510 |
| up_proj.V        | 1580.659     | -          | -         | 1.430 |
| gate_proj.V      | 1578.281     | -          | -         | 1.119 |
| up_proj.U        | 0.266        | -          | -         | 0.622 |
| gate_proj.U      | 0.278        | -          | -         | 0.385 |
| down_proj.V      | 14.778       | -          | -         | 5.051 |
| down_proj.U      | 0.005        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.381 |
| v_proj.V         | 0.006        | -          | -         | 1.115 |
| q_proj.V         | 0.006        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.002        | -          | -         | 1.362 |
| gate_proj.V      | 0.002        | -          | -         | 1.110 |
| up_proj.U        | 0.000        | -          | -         | 0.172 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.972 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2959.831     | -          | -         | 1.441 |
| v_proj.V         | 2997.627     | -          | -         | 1.118 |
| q_proj.V         | 2980.135     | -          | -         | 1.120 |
| k_proj.U         | 0.068        | -          | -         | 0.475 |
| v_proj.U         | 0.047        | -          | -         | 0.267 |
| q_proj.U         | 0.243        | -          | -         | 0.268 |
| o_proj.V         | 249.780      | -          | -         | 1.404 |
| o_proj.U         | 0.055        | -          | -         | 0.509 |
| up_proj.V        | 1555.962     | -          | -         | 1.429 |
| gate_proj.V      | 1548.100     | -          | -         | 1.122 |
| up_proj.U        | 0.257        | -          | -         | 0.621 |
| gate_proj.U      | 0.275        | -          | -         | 0.388 |
| down_proj.V      | 15.965       | -          | -         | 5.044 |
| down_proj.U      | 0.006        | -          | -         | 0.638 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.379 |
| v_proj.V         | 0.005        | -          | -         | 1.113 |
| q_proj.V         | 0.005        | -          | -         | 1.111 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.000        | -          | -         | 1.337 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.002        | -          | -         | 1.372 |
| gate_proj.V      | 0.002        | -          | -         | 1.114 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.963 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2395.124     | -          | -         | 1.443 |
| v_proj.V         | 2447.515     | -          | -         | 1.121 |
| q_proj.V         | 2416.388     | -          | -         | 1.118 |
| k_proj.U         | 0.055        | -          | -         | 0.474 |
| v_proj.U         | 0.039        | -          | -         | 0.268 |
| q_proj.U         | 0.183        | -          | -         | 0.269 |
| o_proj.V         | 301.851      | -          | -         | 1.400 |
| o_proj.U         | 0.081        | -          | -         | 0.510 |
| up_proj.V        | 1518.346     | -          | -         | 1.432 |
| gate_proj.V      | 1507.585     | -          | -         | 1.124 |
| up_proj.U        | 0.278        | -          | -         | 0.622 |
| gate_proj.U      | 0.305        | -          | -         | 0.389 |
| down_proj.V      | 20.009       | -          | -         | 5.052 |
| down_proj.U      | 0.006        | -          | -         | 0.639 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.007        | -          | -         | 1.375 |
| v_proj.V         | 0.006        | -          | -         | 1.108 |
| q_proj.V         | 0.006        | -          | -         | 1.114 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.002        | -          | -         | 1.365 |
| gate_proj.V      | 0.002        | -          | -         | 1.118 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.957 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3245.989     | -          | -         | 1.455 |
| v_proj.V         | 3297.087     | -          | -         | 1.122 |
| q_proj.V         | 3285.038     | -          | -         | 1.125 |
| k_proj.U         | 0.094        | -          | -         | 0.477 |
| v_proj.U         | 0.062        | -          | -         | 0.269 |
| q_proj.U         | 0.323        | -          | -         | 0.269 |
| o_proj.V         | 335.987      | -          | -         | 1.399 |
| o_proj.U         | 0.086        | -          | -         | 0.508 |
| up_proj.V        | 1647.722     | -          | -         | 1.432 |
| gate_proj.V      | 1639.586     | -          | -         | 1.123 |
| up_proj.U        | 0.334        | -          | -         | 0.620 |
| gate_proj.U      | 0.355        | -          | -         | 0.387 |
| down_proj.V      | 27.085       | -          | -         | 5.058 |
| down_proj.U      | 0.010        | -          | -         | 0.637 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.007        | -          | -         | 1.378 |
| v_proj.V         | 0.007        | -          | -         | 1.115 |
| q_proj.V         | 0.007        | -          | -         | 1.117 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.343 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.003        | -          | -         | 1.366 |
| gate_proj.V      | 0.003        | -          | -         | 1.115 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.973 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3838.294     | -          | -         | 1.447 |
| v_proj.V         | 3892.341     | -          | -         | 1.128 |
| q_proj.V         | 3875.025     | -          | -         | 1.123 |
| k_proj.U         | 0.102        | -          | -         | 0.477 |
| v_proj.U         | 0.074        | -          | -         | 0.268 |
| q_proj.U         | 0.344        | -          | -         | 0.269 |
| o_proj.V         | 522.349      | -          | -         | 1.411 |
| o_proj.U         | 0.115        | -          | -         | 0.510 |
| up_proj.V        | 2202.113     | -          | -         | 1.435 |
| gate_proj.V      | 2191.297     | -          | -         | 1.128 |
| up_proj.U        | 0.452        | -          | -         | 0.624 |
| gate_proj.U      | 0.489        | -          | -         | 0.388 |
| down_proj.V      | 78.625       | -          | -         | 5.066 |
| down_proj.U      | 0.022        | -          | -         | 0.641 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.008        | -          | -         | 1.383 |
| v_proj.V         | 0.008        | -          | -         | 1.115 |
| q_proj.V         | 0.008        | -          | -         | 1.119 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.335 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.003        | -          | -         | 1.367 |
| gate_proj.V      | 0.003        | -          | -         | 1.116 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.980 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4225.410     | -          | -         | 1.441 |
| v_proj.V         | 4275.237     | -          | -         | 1.124 |
| q_proj.V         | 4255.266     | -          | -         | 1.123 |
| k_proj.U         | 0.133        | -          | -         | 0.476 |
| v_proj.U         | 0.096        | -          | -         | 0.269 |
| q_proj.U         | 0.444        | -          | -         | 0.280 |
| o_proj.V         | 398.228      | -          | -         | 1.418 |
| o_proj.U         | 0.079        | -          | -         | 0.509 |
| up_proj.V        | 2947.192     | -          | -         | 1.434 |
| gate_proj.V      | 2936.867     | -          | -         | 1.122 |
| up_proj.U        | 0.607        | -          | -         | 0.623 |
| gate_proj.U      | 0.650        | -          | -         | 0.387 |
| down_proj.V      | 67.522       | -          | -         | 5.055 |
| down_proj.U      | 0.024        | -          | -         | 0.637 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.009        | -          | -         | 1.378 |
| v_proj.V         | 0.009        | -          | -         | 1.112 |
| q_proj.V         | 0.009        | -          | -         | 1.116 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.340 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.004        | -          | -         | 1.361 |
| gate_proj.V      | 0.004        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.977 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4657.963     | -          | -         | 1.444 |
| v_proj.V         | 4711.588     | -          | -         | 1.121 |
| q_proj.V         | 4665.465     | -          | -         | 1.138 |
| k_proj.U         | 0.133        | -          | -         | 0.476 |
| v_proj.U         | 0.102        | -          | -         | 0.267 |
| q_proj.U         | 0.471        | -          | -         | 0.267 |
| o_proj.V         | 424.848      | -          | -         | 1.399 |
| o_proj.U         | 0.103        | -          | -         | 0.511 |
| up_proj.V        | 3411.166     | -          | -         | 1.430 |
| gate_proj.V      | 3397.831     | -          | -         | 1.129 |
| up_proj.U        | 0.668        | -          | -         | 0.622 |
| gate_proj.U      | 0.702        | -          | -         | 0.387 |
| down_proj.V      | 77.868       | -          | -         | 5.054 |
| down_proj.U      | 0.026        | -          | -         | 0.639 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.009        | -          | -         | 1.378 |
| v_proj.V         | 0.009        | -          | -         | 1.111 |
| q_proj.V         | 0.009        | -          | -         | 1.113 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.346 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.005        | -          | -         | 1.365 |
| gate_proj.V      | 0.005        | -          | -         | 1.117 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.969 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4871.638     | -          | -         | 1.440 |
| v_proj.V         | 4917.808     | -          | -         | 1.126 |
| q_proj.V         | 4893.435     | -          | -         | 1.129 |
| k_proj.U         | 0.148        | -          | -         | 0.476 |
| v_proj.U         | 0.103        | -          | -         | 0.268 |
| q_proj.U         | 0.488        | -          | -         | 0.268 |
| o_proj.V         | 332.968      | -          | -         | 1.408 |
| o_proj.U         | 0.068        | -          | -         | 0.510 |
| up_proj.V        | 3780.338     | -          | -         | 1.434 |
| gate_proj.V      | 3773.954     | -          | -         | 1.124 |
| up_proj.U        | 0.721        | -          | -         | 0.625 |
| gate_proj.U      | 0.774        | -          | -         | 0.386 |
| down_proj.V      | 91.445       | -          | -         | 5.052 |
| down_proj.U      | 0.031        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.010        | -          | -         | 1.378 |
| v_proj.V         | 0.010        | -          | -         | 1.114 |
| q_proj.V         | 0.010        | -          | -         | 1.117 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.353 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.005        | -          | -         | 1.366 |
| gate_proj.V      | 0.005        | -          | -         | 1.119 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.975 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5059.244     | -          | -         | 1.442 |
| v_proj.V         | 5148.065     | -          | -         | 1.123 |
| q_proj.V         | 5081.328     | -          | -         | 1.127 |
| k_proj.U         | 0.160        | -          | -         | 0.476 |
| v_proj.U         | 0.110        | -          | -         | 0.270 |
| q_proj.U         | 0.498        | -          | -         | 0.269 |
| o_proj.V         | 402.311      | -          | -         | 1.418 |
| o_proj.U         | 0.078        | -          | -         | 0.518 |
| up_proj.V        | 4084.565     | -          | -         | 1.443 |
| gate_proj.V      | 4087.177     | -          | -         | 1.126 |
| up_proj.U        | 0.770        | -          | -         | 0.625 |
| gate_proj.U      | 0.818        | -          | -         | 0.387 |
| down_proj.V      | 84.079       | -          | -         | 5.071 |
| down_proj.U      | 0.029        | -          | -         | 0.639 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.009        | -          | -         | 1.381 |
| v_proj.V         | 0.009        | -          | -         | 1.136 |
| q_proj.V         | 0.009        | -          | -         | 1.121 |
| k_proj.U         | 0.000        | -          | -         | 0.156 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.005        | -          | -         | 1.366 |
| gate_proj.V      | 0.005        | -          | -         | 1.116 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.978 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4965.955     | -          | -         | 1.443 |
| v_proj.V         | 5043.463     | -          | -         | 1.121 |
| q_proj.V         | 4989.066     | -          | -         | 1.119 |
| k_proj.U         | 0.148        | -          | -         | 0.476 |
| v_proj.U         | 0.107        | -          | -         | 0.268 |
| q_proj.U         | 0.488        | -          | -         | 0.270 |
| o_proj.V         | 491.137      | -          | -         | 1.402 |
| o_proj.U         | 0.075        | -          | -         | 0.511 |
| up_proj.V        | 4385.553     | -          | -         | 1.440 |
| gate_proj.V      | 4365.603     | -          | -         | 1.122 |
| up_proj.U        | 0.785        | -          | -         | 0.623 |
| gate_proj.U      | 0.838        | -          | -         | 0.388 |
| down_proj.V      | 108.854      | -          | -         | 5.055 |
| down_proj.U      | 0.031        | -          | -         | 0.639 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.010        | -          | -         | 1.381 |
| v_proj.V         | 0.010        | -          | -         | 1.114 |
| q_proj.V         | 0.010        | -          | -         | 1.115 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.341 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.006        | -          | -         | 1.370 |
| gate_proj.V      | 0.006        | -          | -         | 1.118 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.962 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5427.047     | -          | -         | 1.444 |
| v_proj.V         | 5456.712     | -          | -         | 1.124 |
| q_proj.V         | 5401.961     | -          | -         | 1.122 |
| k_proj.U         | 0.165        | -          | -         | 0.476 |
| v_proj.U         | 0.115        | -          | -         | 0.270 |
| q_proj.U         | 0.551        | -          | -         | 0.271 |
| o_proj.V         | 486.998      | -          | -         | 1.404 |
| o_proj.U         | 0.078        | -          | -         | 0.511 |
| up_proj.V        | 4549.170     | -          | -         | 1.431 |
| gate_proj.V      | 4528.333     | -          | -         | 1.121 |
| up_proj.U        | 0.857        | -          | -         | 0.624 |
| gate_proj.U      | 0.910        | -          | -         | 0.387 |
| down_proj.V      | 92.125       | -          | -         | 5.076 |
| down_proj.U      | 0.028        | -          | -         | 0.652 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.010        | -          | -         | 1.381 |
| v_proj.V         | 0.010        | -          | -         | 1.117 |
| q_proj.V         | 0.010        | -          | -         | 1.131 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.006        | -          | -         | 1.369 |
| gate_proj.V      | 0.006        | -          | -         | 1.116 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.986 |
| down_proj.U      | 0.000        | -          | -         | 0.191 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5451.833     | -          | -         | 1.440 |
| v_proj.V         | 5452.167     | -          | -         | 1.127 |
| q_proj.V         | 5425.879     | -          | -         | 1.124 |
| k_proj.U         | 0.167        | -          | -         | 0.475 |
| v_proj.U         | 0.114        | -          | -         | 0.270 |
| q_proj.U         | 0.545        | -          | -         | 0.271 |
| o_proj.V         | 571.113      | -          | -         | 1.410 |
| o_proj.U         | 0.116        | -          | -         | 0.511 |
| up_proj.V        | 4749.321     | -          | -         | 1.435 |
| gate_proj.V      | 4722.080     | -          | -         | 1.134 |
| up_proj.U        | 0.898        | -          | -         | 0.635 |
| gate_proj.U      | 0.934        | -          | -         | 0.391 |
| down_proj.V      | 115.908      | -          | -         | 5.069 |
| down_proj.U      | 0.033        | -          | -         | 0.642 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.012        | -          | -         | 1.381 |
| v_proj.V         | 0.012        | -          | -         | 1.125 |
| q_proj.V         | 0.012        | -          | -         | 1.118 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.001        | -          | -         | 1.340 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.006        | -          | -         | 1.363 |
| gate_proj.V      | 0.006        | -          | -         | 1.119 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.967 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6201.947     | -          | -         | 1.441 |
| v_proj.V         | 6229.002     | -          | -         | 1.126 |
| q_proj.V         | 6198.996     | -          | -         | 1.126 |
| k_proj.U         | 0.196        | -          | -         | 0.476 |
| v_proj.U         | 0.138        | -          | -         | 0.270 |
| q_proj.U         | 0.635        | -          | -         | 0.269 |
| o_proj.V         | 657.005      | -          | -         | 1.409 |
| o_proj.U         | 0.135        | -          | -         | 0.512 |
| up_proj.V        | 5254.753     | -          | -         | 1.432 |
| gate_proj.V      | 5220.564     | -          | -         | 1.125 |
| up_proj.U        | 1.031        | -          | -         | 0.624 |
| gate_proj.U      | 1.093        | -          | -         | 0.388 |
| down_proj.V      | 98.481       | -          | -         | 5.061 |
| down_proj.U      | 0.030        | -          | -         | 0.639 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.012        | -          | -         | 1.385 |
| v_proj.V         | 0.012        | -          | -         | 1.128 |
| q_proj.V         | 0.012        | -          | -         | 1.124 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.002        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.007        | -          | -         | 1.368 |
| gate_proj.V      | 0.007        | -          | -         | 1.117 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.989 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6297.094     | -          | -         | 1.443 |
| v_proj.V         | 6329.323     | -          | -         | 1.124 |
| q_proj.V         | 6342.293     | -          | -         | 1.127 |
| k_proj.U         | 0.186        | -          | -         | 0.487 |
| v_proj.U         | 0.152        | -          | -         | 0.266 |
| q_proj.U         | 0.649        | -          | -         | 0.266 |
| o_proj.V         | 968.349      | -          | -         | 1.409 |
| o_proj.U         | 0.172        | -          | -         | 0.508 |
| up_proj.V        | 5791.739     | -          | -         | 1.435 |
| gate_proj.V      | 5776.984     | -          | -         | 1.116 |
| up_proj.U        | 1.125        | -          | -         | 0.624 |
| gate_proj.U      | 1.173        | -          | -         | 0.388 |
| down_proj.V      | 120.550      | -          | -         | 5.049 |
| down_proj.U      | 0.035        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.013        | -          | -         | 1.378 |
| v_proj.V         | 0.013        | -          | -         | 1.117 |
| q_proj.V         | 0.013        | -          | -         | 1.118 |
| k_proj.U         | 0.000        | -          | -         | 0.156 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.002        | -          | -         | 1.342 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.008        | -          | -         | 1.366 |
| gate_proj.V      | 0.008        | -          | -         | 1.115 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.975 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6891.582     | -          | -         | 1.446 |
| v_proj.V         | 6831.252     | -          | -         | 1.123 |
| q_proj.V         | 6885.511     | -          | -         | 1.128 |
| k_proj.U         | 0.203        | -          | -         | 0.478 |
| v_proj.U         | 0.155        | -          | -         | 0.268 |
| q_proj.U         | 0.696        | -          | -         | 0.270 |
| o_proj.V         | 1284.483     | -          | -         | 1.404 |
| o_proj.U         | 0.217        | -          | -         | 0.509 |
| up_proj.V        | 6315.728     | -          | -         | 1.436 |
| gate_proj.V      | 6284.592     | -          | -         | 1.130 |
| up_proj.U        | 1.196        | -          | -         | 0.625 |
| gate_proj.U      | 1.253        | -          | -         | 0.389 |
| down_proj.V      | 129.210      | -          | -         | 5.067 |
| down_proj.U      | 0.039        | -          | -         | 0.641 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.013        | -          | -         | 1.383 |
| v_proj.V         | 0.013        | -          | -         | 1.117 |
| q_proj.V         | 0.013        | -          | -         | 1.119 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.003        | -          | -         | 1.342 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.008        | -          | -         | 1.366 |
| gate_proj.V      | 0.008        | -          | -         | 1.113 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.980 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 7085.168     | -          | -         | 1.442 |
| v_proj.V         | 7093.222     | -          | -         | 1.123 |
| q_proj.V         | 7097.977     | -          | -         | 1.127 |
| k_proj.U         | 0.187        | -          | -         | 0.478 |
| v_proj.U         | 0.156        | -          | -         | 0.269 |
| q_proj.U         | 0.658        | -          | -         | 0.270 |
| o_proj.V         | 1359.941     | -          | -         | 1.408 |
| o_proj.U         | 0.226        | -          | -         | 0.510 |
| up_proj.V        | 6378.685     | -          | -         | 1.442 |
| gate_proj.V      | 6351.762     | -          | -         | 1.140 |
| up_proj.U        | 1.206        | -          | -         | 0.624 |
| gate_proj.U      | 1.267        | -          | -         | 0.390 |
| down_proj.V      | 177.790      | -          | -         | 5.088 |
| down_proj.U      | 0.043        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.013        | -          | -         | 1.379 |
| v_proj.V         | 0.013        | -          | -         | 1.117 |
| q_proj.V         | 0.013        | -          | -         | 1.123 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.008 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.002        | -          | -         | 1.334 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.008        | -          | -         | 1.364 |
| gate_proj.V      | 0.008        | -          | -         | 1.111 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.975 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6714.950     | -          | -         | 1.441 |
| v_proj.V         | 6704.279     | -          | -         | 1.127 |
| q_proj.V         | 6717.277     | -          | -         | 1.126 |
| k_proj.U         | 0.169        | -          | -         | 0.477 |
| v_proj.U         | 0.133        | -          | -         | 0.269 |
| q_proj.U         | 0.609        | -          | -         | 0.271 |
| o_proj.V         | 1117.686     | -          | -         | 1.409 |
| o_proj.U         | 0.185        | -          | -         | 0.510 |
| up_proj.V        | 6803.727     | -          | -         | 1.451 |
| gate_proj.V      | 6765.659     | -          | -         | 1.127 |
| up_proj.U        | 1.254        | -          | -         | 0.622 |
| gate_proj.U      | 1.340        | -          | -         | 0.385 |
| down_proj.V      | 189.803      | -          | -         | 5.079 |
| down_proj.U      | 0.050        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.015        | -          | -         | 1.383 |
| v_proj.V         | 0.014        | -          | -         | 1.121 |
| q_proj.V         | 0.015        | -          | -         | 1.130 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.003        | -          | -         | 1.338 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.009        | -          | -         | 1.369 |
| gate_proj.V      | 0.009        | -          | -         | 1.117 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.989 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 7349.373     | -          | -         | 1.452 |
| v_proj.V         | 7330.908     | -          | -         | 1.123 |
| q_proj.V         | 7334.079     | -          | -         | 1.127 |
| k_proj.U         | 0.187        | -          | -         | 0.477 |
| v_proj.U         | 0.142        | -          | -         | 0.267 |
| q_proj.U         | 0.603        | -          | -         | 0.271 |
| o_proj.V         | 1544.903     | -          | -         | 1.410 |
| o_proj.U         | 0.240        | -          | -         | 0.513 |
| up_proj.V        | 7025.568     | -          | -         | 1.435 |
| gate_proj.V      | 6993.888     | -          | -         | 1.138 |
| up_proj.U        | 1.268        | -          | -         | 0.625 |
| gate_proj.U      | 1.359        | -          | -         | 0.389 |
| down_proj.V      | 227.918      | -          | -         | 5.062 |
| down_proj.U      | 0.059        | -          | -         | 0.642 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.014        | -          | -         | 1.383 |
| v_proj.V         | 0.014        | -          | -         | 1.128 |
| q_proj.V         | 0.014        | -          | -         | 1.119 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.003        | -          | -         | 1.340 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.010        | -          | -         | 1.368 |
| gate_proj.V      | 0.009        | -          | -         | 1.118 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.000        | -          | -         | 4.989 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6946.806     | -          | -         | 1.443 |
| v_proj.V         | 6957.870     | -          | -         | 1.126 |
| q_proj.V         | 6946.798     | -          | -         | 1.126 |
| k_proj.U         | 0.161        | -          | -         | 0.477 |
| v_proj.U         | 0.136        | -          | -         | 0.269 |
| q_proj.U         | 0.561        | -          | -         | 0.273 |
| o_proj.V         | 1466.981     | -          | -         | 1.414 |
| o_proj.U         | 0.272        | -          | -         | 0.513 |
| up_proj.V        | 7783.274     | -          | -         | 1.433 |
| gate_proj.V      | 7718.160     | -          | -         | 1.121 |
| up_proj.U        | 1.399        | -          | -         | 0.624 |
| gate_proj.U      | 1.516        | -          | -         | 0.387 |
| down_proj.V      | 278.395      | -          | -         | 5.074 |
| down_proj.U      | 0.085        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.014        | -          | -         | 1.393 |
| v_proj.V         | 0.014        | -          | -         | 1.116 |
| q_proj.V         | 0.014        | -          | -         | 1.117 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.005        | -          | -         | 1.341 |
| o_proj.U         | 0.000        | -          | -         | 0.188 |
| up_proj.V        | 0.010        | -          | -         | 1.370 |
| gate_proj.V      | 0.010        | -          | -         | 1.130 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.001        | -          | -         | 4.980 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6932.123     | -          | -         | 1.445 |
| v_proj.V         | 7058.011     | -          | -         | 1.127 |
| q_proj.V         | 7000.202     | -          | -         | 1.124 |
| k_proj.U         | 0.170        | -          | -         | 0.476 |
| v_proj.U         | 0.124        | -          | -         | 0.270 |
| q_proj.U         | 0.574        | -          | -         | 0.269 |
| o_proj.V         | 2491.150     | -          | -         | 1.408 |
| o_proj.U         | 0.317        | -          | -         | 0.511 |
| up_proj.V        | 8313.954     | -          | -         | 1.438 |
| gate_proj.V      | 8290.964     | -          | -         | 1.137 |
| up_proj.U        | 1.499        | -          | -         | 0.624 |
| gate_proj.U      | 1.632        | -          | -         | 0.392 |
| down_proj.V      | 501.853      | -          | -         | 5.061 |
| down_proj.U      | 0.126        | -          | -         | 0.642 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.015        | -          | -         | 1.382 |
| v_proj.V         | 0.015        | -          | -         | 1.130 |
| q_proj.V         | 0.015        | -          | -         | 1.128 |
| k_proj.U         | 0.000        | -          | -         | 0.156 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.004        | -          | -         | 1.336 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.011        | -          | -         | 1.369 |
| gate_proj.V      | 0.011        | -          | -         | 1.118 |
| up_proj.U        | 0.000        | -          | -         | 0.174 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.002        | -          | -         | 4.965 |
| down_proj.U      | 0.000        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 7014.930     | -          | -         | 1.441 |
| v_proj.V         | 7102.252     | -          | -         | 1.127 |
| q_proj.V         | 7109.186     | -          | -         | 1.130 |
| k_proj.U         | 0.191        | -          | -         | 0.477 |
| v_proj.U         | 0.148        | -          | -         | 0.269 |
| q_proj.U         | 0.623        | -          | -         | 0.270 |
| o_proj.V         | 2340.436     | -          | -         | 1.408 |
| o_proj.U         | 0.388        | -          | -         | 0.511 |
| up_proj.V        | 8231.801     | -          | -         | 1.429 |
| gate_proj.V      | 8188.530     | -          | -         | 1.121 |
| up_proj.U        | 1.446        | -          | -         | 0.622 |
| gate_proj.U      | 1.636        | -          | -         | 0.388 |
| down_proj.V      | 932.757      | -          | -         | 5.070 |
| down_proj.U      | 0.178        | -          | -         | 0.641 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.009        | -          | -         | 1.387 |
| v_proj.V         | 0.009        | -          | -         | 1.117 |
| q_proj.V         | 0.009        | -          | -         | 1.123 |
| k_proj.U         | 0.000        | -          | -         | 0.155 |
| v_proj.U         | 0.000        | -          | -         | 0.006 |
| q_proj.U         | 0.000        | -          | -         | 0.006 |
| o_proj.V         | 0.007        | -          | -         | 1.339 |
| o_proj.U         | 0.000        | -          | -         | 0.189 |
| up_proj.V        | 0.009        | -          | -         | 1.370 |
| gate_proj.V      | 0.009        | -          | -         | 1.120 |
| up_proj.U        | 0.000        | -          | -         | 0.173 |
| gate_proj.U      | 0.000        | -          | -         | 0.006 |
| down_proj.V      | 0.016        | -          | -         | 4.972 |
| down_proj.U      | 0.001        | -          | -         | 0.190 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3838.401     | -          | -         | 1.439 |
| v_proj.V         | 3888.635     | -          | -         | 1.125 |
| q_proj.V         | 3893.885     | -          | -         | 1.131 |
| k_proj.U         | 0.087        | -          | -         | 0.478 |
| v_proj.U         | 0.069        | -          | -         | 0.269 |
| q_proj.U         | 0.341        | -          | -         | 0.270 |
| o_proj.V         | 1290.637     | -          | -         | 1.412 |
| o_proj.U         | 0.160        | -          | -         | 0.509 |
| up_proj.V        | 5904.308     | -          | -         | 1.436 |
| gate_proj.V      | 5922.871     | -          | -         | 1.125 |
| up_proj.U        | 0.979        | -          | -         | 0.626 |
| gate_proj.U      | 1.040        | -          | -         | 0.388 |
| down_proj.V      | 1000.789     | -          | -         | 5.068 |
| down_proj.U      | 0.130        | -          | -         | 0.640 |
+------------------+--------------+------------+-----------+-------+


2990.286766767502
usage: load_delta.py [-h] [--use_svd] [--merge] [--dim DIM]
                     [--delta_path DELTA_PATH]
                     [--compressed_delta_path COMPRESSED_DELTA_PATH]
                     [--save_path SAVE_PATH] [--fintuned_model FINTUNED_MODEL]
                     [--base_model BASE_MODEL]
load_delta.py: error: argument --base_model: expected one argument
