factor 8.0
Unrecognized keys in `rope_scaling` for 'rope_type'='linear': {'type'}
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 14.28it/s]
Starting ...
Ready.
Quantizing 8bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.499 |
| v_proj.V         | 0.002        | -          | -         | 1.102 |
| q_proj.V         | 0.004        | -          | -         | 1.102 |
| k_proj.U         | 0.000        | -          | -         | 0.169 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.000        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.329 |
| o_proj.U         | 0.000        | -          | -         | 0.183 |
| up_proj.V        | 0.003        | -          | -         | 1.349 |
| gate_proj.V      | 0.003        | -          | -         | 1.102 |
| up_proj.U        | 0.000        | -          | -         | 0.167 |
| gate_proj.U      | 0.000        | -          | -         | 0.003 |
| down_proj.V      | 0.000        | -          | -         | 4.895 |
| down_proj.U      | 0.000        | -          | -         | 0.186 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 1/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 7262.053     | -          | -         | 1.430 |
| v_proj.V         | 3930.648     | -          | -         | 1.111 |
| q_proj.V         | 6630.202     | -          | -         | 1.114 |
| k_proj.U         | 0.222        | -          | -         | 0.478 |
| v_proj.U         | 0.201        | -          | -         | 0.266 |
| q_proj.U         | 1.018        | -          | -         | 0.267 |
| o_proj.V         | 23.988       | -          | -         | 1.395 |
| o_proj.U         | 0.018        | -          | -         | 0.513 |
| up_proj.V        | 3254.512     | -          | -         | 1.419 |
| gate_proj.V      | 3252.451     | -          | -         | 1.109 |
| up_proj.U        | 3.611        | -          | -         | 0.638 |
| gate_proj.U      | 3.586        | -          | -         | 0.402 |
| down_proj.V      | 19.367       | -          | -         | 4.997 |
| down_proj.U      | 0.031        | -          | -         | 0.654 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.002        | -          | -         | 1.369 |
| v_proj.V         | 0.002        | -          | -         | 1.101 |
| q_proj.V         | 0.002        | -          | -         | 1.101 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.000        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.323 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.004        | -          | -         | 1.349 |
| gate_proj.V      | 0.005        | -          | -         | 1.102 |
| up_proj.U        | 0.000        | -          | -         | 0.168 |
| gate_proj.U      | 0.000        | -          | -         | 0.003 |
| down_proj.V      | 0.000        | -          | -         | 4.897 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 2/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4775.394     | -          | -         | 1.431 |
| v_proj.V         | 2912.335     | -          | -         | 1.109 |
| q_proj.V         | 3324.787     | -          | -         | 1.109 |
| k_proj.U         | 0.354        | -          | -         | 0.482 |
| v_proj.U         | 0.355        | -          | -         | 0.266 |
| q_proj.U         | 2.091        | -          | -         | 0.268 |
| o_proj.V         | 21.858       | -          | -         | 1.399 |
| o_proj.U         | 0.022        | -          | -         | 0.515 |
| up_proj.V        | 5861.714     | -          | -         | 1.424 |
| gate_proj.V      | 5868.234     | -          | -         | 1.113 |
| up_proj.U        | 6.857        | -          | -         | 0.637 |
| gate_proj.U      | 6.867        | -          | -         | 0.401 |
| down_proj.V      | 22.910       | -          | -         | 5.032 |
| down_proj.U      | 0.023        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.016        | -          | -         | 1.367 |
| v_proj.V         | 0.017        | -          | -         | 1.102 |
| q_proj.V         | 0.016        | -          | -         | 1.105 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.324 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.005        | -          | -         | 1.353 |
| gate_proj.V      | 0.005        | -          | -         | 1.104 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.003 |
| down_proj.V      | 0.000        | -          | -         | 4.913 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 3/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 11560.191    | -          | -         | 1.429 |
| v_proj.V         | 10917.888    | -          | -         | 1.112 |
| q_proj.V         | 11088.107    | -          | -         | 1.107 |
| k_proj.U         | 1.431        | -          | -         | 0.482 |
| v_proj.U         | 1.363        | -          | -         | 0.269 |
| q_proj.U         | 5.405        | -          | -         | 0.268 |
| o_proj.V         | 199.907      | -          | -         | 1.399 |
| o_proj.U         | 0.228        | -          | -         | 0.517 |
| up_proj.V        | 3716.475     | -          | -         | 1.425 |
| gate_proj.V      | 3719.954     | -          | -         | 1.110 |
| up_proj.U        | 4.265        | -          | -         | 0.647 |
| gate_proj.U      | 4.259        | -          | -         | 0.402 |
| down_proj.V      | 14.085       | -          | -         | 5.025 |
| down_proj.U      | 0.024        | -          | -         | 0.657 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.008        | -          | -         | 1.369 |
| v_proj.V         | 0.008        | -          | -         | 1.102 |
| q_proj.V         | 0.008        | -          | -         | 1.105 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.325 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.003        | -          | -         | 1.352 |
| gate_proj.V      | 0.003        | -          | -         | 1.102 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.003 |
| down_proj.V      | 0.000        | -          | -         | 4.916 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 4/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 6402.876     | -          | -         | 1.426 |
| v_proj.V         | 6418.026     | -          | -         | 1.104 |
| q_proj.V         | 6391.315     | -          | -         | 1.099 |
| k_proj.U         | 1.215        | -          | -         | 0.481 |
| v_proj.U         | 0.824        | -          | -         | 0.266 |
| q_proj.U         | 3.323        | -          | -         | 0.265 |
| o_proj.V         | 233.543      | -          | -         | 1.395 |
| o_proj.U         | 0.334        | -          | -         | 0.515 |
| up_proj.V        | 2830.834     | -          | -         | 1.427 |
| gate_proj.V      | 2829.097     | -          | -         | 1.112 |
| up_proj.U        | 3.475        | -          | -         | 0.637 |
| gate_proj.U      | 3.468        | -          | -         | 0.403 |
| down_proj.V      | 13.314       | -          | -         | 5.004 |
| down_proj.U      | 0.025        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.367 |
| v_proj.V         | 0.003        | -          | -         | 1.103 |
| q_proj.V         | 0.004        | -          | -         | 1.102 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.000        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.323 |
| o_proj.U         | 0.000        | -          | -         | 0.187 |
| up_proj.V        | 0.002        | -          | -         | 1.353 |
| gate_proj.V      | 0.002        | -          | -         | 1.100 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.003 |
| down_proj.V      | 0.000        | -          | -         | 4.905 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 5/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3495.625     | -          | -         | 1.430 |
| v_proj.V         | 3478.467     | -          | -         | 1.111 |
| q_proj.V         | 3481.967     | -          | -         | 1.111 |
| k_proj.U         | 0.916        | -          | -         | 0.482 |
| v_proj.U         | 0.446        | -          | -         | 0.268 |
| q_proj.U         | 1.848        | -          | -         | 0.268 |
| o_proj.V         | 228.584      | -          | -         | 1.401 |
| o_proj.U         | 0.322        | -          | -         | 0.516 |
| up_proj.V        | 2382.717     | -          | -         | 1.424 |
| gate_proj.V      | 2382.578     | -          | -         | 1.115 |
| up_proj.U        | 3.067        | -          | -         | 0.637 |
| gate_proj.U      | 3.108        | -          | -         | 0.402 |
| down_proj.V      | 17.147       | -          | -         | 5.029 |
| down_proj.U      | 0.039        | -          | -         | 0.654 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.367 |
| v_proj.V         | 0.004        | -          | -         | 1.101 |
| q_proj.V         | 0.004        | -          | -         | 1.102 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.325 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.350 |
| gate_proj.V      | 0.002        | -          | -         | 1.103 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.920 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 6/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3678.076     | -          | -         | 1.432 |
| v_proj.V         | 3670.739     | -          | -         | 1.109 |
| q_proj.V         | 3677.374     | -          | -         | 1.110 |
| k_proj.U         | 0.659        | -          | -         | 0.482 |
| v_proj.U         | 0.537        | -          | -         | 0.267 |
| q_proj.U         | 2.188        | -          | -         | 0.269 |
| o_proj.V         | 208.882      | -          | -         | 1.402 |
| o_proj.U         | 0.361        | -          | -         | 0.516 |
| up_proj.V        | 2260.648     | -          | -         | 1.422 |
| gate_proj.V      | 2254.013     | -          | -         | 1.111 |
| up_proj.U        | 2.978        | -          | -         | 0.636 |
| gate_proj.U      | 3.003        | -          | -         | 0.401 |
| down_proj.V      | 19.359       | -          | -         | 5.006 |
| down_proj.U      | 0.047        | -          | -         | 0.655 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.365 |
| v_proj.V         | 0.003        | -          | -         | 1.105 |
| q_proj.V         | 0.003        | -          | -         | 1.105 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.002        | -          | -         | 0.004 |
| o_proj.V         | 0.000        | -          | -         | 1.324 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.350 |
| gate_proj.V      | 0.002        | -          | -         | 1.101 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.914 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 7/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2891.380     | -          | -         | 1.433 |
| v_proj.V         | 2902.164     | -          | -         | 1.104 |
| q_proj.V         | 2908.739     | -          | -         | 1.115 |
| k_proj.U         | 0.944        | -          | -         | 0.483 |
| v_proj.U         | 0.423        | -          | -         | 0.268 |
| q_proj.U         | 1.842        | -          | -         | 0.267 |
| o_proj.V         | 246.585      | -          | -         | 1.402 |
| o_proj.U         | 0.445        | -          | -         | 0.518 |
| up_proj.V        | 2100.365     | -          | -         | 1.421 |
| gate_proj.V      | 2100.815     | -          | -         | 1.118 |
| up_proj.U        | 2.919        | -          | -         | 0.638 |
| gate_proj.U      | 3.030        | -          | -         | 0.404 |
| down_proj.V      | 21.069       | -          | -         | 5.024 |
| down_proj.U      | 0.058        | -          | -         | 0.657 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.369 |
| v_proj.V         | 0.003        | -          | -         | 1.105 |
| q_proj.V         | 0.003        | -          | -         | 1.106 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.325 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.354 |
| gate_proj.V      | 0.001        | -          | -         | 1.106 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.918 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 8/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2651.105     | -          | -         | 1.428 |
| v_proj.V         | 2666.443     | -          | -         | 1.115 |
| q_proj.V         | 2659.368     | -          | -         | 1.120 |
| k_proj.U         | 0.462        | -          | -         | 0.482 |
| v_proj.U         | 0.378        | -          | -         | 0.267 |
| q_proj.U         | 1.638        | -          | -         | 0.268 |
| o_proj.V         | 244.286      | -          | -         | 1.407 |
| o_proj.U         | 0.487        | -          | -         | 0.516 |
| up_proj.V        | 1850.691     | -          | -         | 1.428 |
| gate_proj.V      | 1852.096     | -          | -         | 1.112 |
| up_proj.U        | 2.555        | -          | -         | 0.638 |
| gate_proj.U      | 2.703        | -          | -         | 0.403 |
| down_proj.V      | 19.512       | -          | -         | 5.026 |
| down_proj.U      | 0.058        | -          | -         | 0.655 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.367 |
| v_proj.V         | 0.003        | -          | -         | 1.101 |
| q_proj.V         | 0.003        | -          | -         | 1.104 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.000        | -          | -         | 0.004 |
| o_proj.V         | 0.000        | -          | -         | 1.325 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.349 |
| gate_proj.V      | 0.001        | -          | -         | 1.104 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.911 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 9/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2803.698     | -          | -         | 1.431 |
| v_proj.V         | 2803.427     | -          | -         | 1.107 |
| q_proj.V         | 2810.035     | -          | -         | 1.107 |
| k_proj.U         | 0.630        | -          | -         | 0.481 |
| v_proj.U         | 0.437        | -          | -         | 0.267 |
| q_proj.U         | 1.994        | -          | -         | 0.268 |
| o_proj.V         | 243.575      | -          | -         | 1.395 |
| o_proj.U         | 0.468        | -          | -         | 0.518 |
| up_proj.V        | 1732.972     | -          | -         | 1.427 |
| gate_proj.V      | 1729.881     | -          | -         | 1.114 |
| up_proj.U        | 2.228        | -          | -         | 0.637 |
| gate_proj.U      | 2.346        | -          | -         | 0.404 |
| down_proj.V      | 16.027       | -          | -         | 5.030 |
| down_proj.U      | 0.050        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.367 |
| v_proj.V         | 0.003        | -          | -         | 1.104 |
| q_proj.V         | 0.003        | -          | -         | 1.102 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.000        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.326 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.353 |
| gate_proj.V      | 0.001        | -          | -         | 1.104 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.917 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 10/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2642.603     | -          | -         | 1.432 |
| v_proj.V         | 2629.596     | -          | -         | 1.109 |
| q_proj.V         | 2642.451     | -          | -         | 1.113 |
| k_proj.U         | 0.810        | -          | -         | 0.481 |
| v_proj.U         | 0.377        | -          | -         | 0.268 |
| q_proj.U         | 1.749        | -          | -         | 0.267 |
| o_proj.V         | 263.568      | -          | -         | 1.395 |
| o_proj.U         | 0.504        | -          | -         | 0.517 |
| up_proj.V        | 1602.174     | -          | -         | 1.423 |
| gate_proj.V      | 1602.873     | -          | -         | 1.107 |
| up_proj.U        | 2.072        | -          | -         | 0.636 |
| gate_proj.U      | 2.163        | -          | -         | 0.402 |
| down_proj.V      | 16.782       | -          | -         | 5.010 |
| down_proj.U      | 0.050        | -          | -         | 0.655 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.365 |
| v_proj.V         | 0.003        | -          | -         | 1.104 |
| q_proj.V         | 0.003        | -          | -         | 1.103 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.324 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.354 |
| gate_proj.V      | 0.001        | -          | -         | 1.105 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.908 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 11/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3021.618     | -          | -         | 1.434 |
| v_proj.V         | 3024.350     | -          | -         | 1.109 |
| q_proj.V         | 3026.133     | -          | -         | 1.116 |
| k_proj.U         | 1.114        | -          | -         | 0.484 |
| v_proj.U         | 0.408        | -          | -         | 0.269 |
| q_proj.U         | 1.892        | -          | -         | 0.270 |
| o_proj.V         | 267.023      | -          | -         | 1.404 |
| o_proj.U         | 0.497        | -          | -         | 0.517 |
| up_proj.V        | 1569.257     | -          | -         | 1.427 |
| gate_proj.V      | 1568.514     | -          | -         | 1.112 |
| up_proj.U        | 2.032        | -          | -         | 0.639 |
| gate_proj.U      | 2.182        | -          | -         | 0.402 |
| down_proj.V      | 13.650       | -          | -         | 5.024 |
| down_proj.U      | 0.041        | -          | -         | 0.657 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.368 |
| v_proj.V         | 0.003        | -          | -         | 1.107 |
| q_proj.V         | 0.003        | -          | -         | 1.107 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.002        | -          | -         | 0.004 |
| o_proj.V         | 0.000        | -          | -         | 1.328 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.357 |
| gate_proj.V      | 0.001        | -          | -         | 1.108 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.926 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 12/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2857.703     | -          | -         | 1.433 |
| v_proj.V         | 2867.430     | -          | -         | 1.116 |
| q_proj.V         | 2863.005     | -          | -         | 1.115 |
| k_proj.U         | 0.532        | -          | -         | 0.483 |
| v_proj.U         | 0.373        | -          | -         | 0.269 |
| q_proj.U         | 1.604        | -          | -         | 0.268 |
| o_proj.V         | 241.253      | -          | -         | 1.403 |
| o_proj.U         | 0.511        | -          | -         | 0.518 |
| up_proj.V        | 1536.151     | -          | -         | 1.426 |
| gate_proj.V      | 1535.386     | -          | -         | 1.114 |
| up_proj.U        | 1.954        | -          | -         | 0.638 |
| gate_proj.U      | 2.095        | -          | -         | 0.402 |
| down_proj.V      | 14.211       | -          | -         | 5.015 |
| down_proj.U      | 0.042        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.002        | -          | -         | 1.370 |
| v_proj.V         | 0.002        | -          | -         | 1.105 |
| q_proj.V         | 0.002        | -          | -         | 1.105 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.327 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.355 |
| gate_proj.V      | 0.001        | -          | -         | 1.106 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.923 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 13/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2335.830     | -          | -         | 1.431 |
| v_proj.V         | 2357.476     | -          | -         | 1.112 |
| q_proj.V         | 2343.666     | -          | -         | 1.109 |
| k_proj.U         | 0.779        | -          | -         | 0.482 |
| v_proj.U         | 0.298        | -          | -         | 0.269 |
| q_proj.U         | 1.463        | -          | -         | 0.269 |
| o_proj.V         | 295.852      | -          | -         | 1.399 |
| o_proj.U         | 0.757        | -          | -         | 0.517 |
| up_proj.V        | 1463.561     | -          | -         | 1.427 |
| gate_proj.V      | 1459.768     | -          | -         | 1.115 |
| up_proj.U        | 2.257        | -          | -         | 0.638 |
| gate_proj.U      | 2.409        | -          | -         | 0.405 |
| down_proj.V      | 17.167       | -          | -         | 5.032 |
| down_proj.U      | 0.042        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.368 |
| v_proj.V         | 0.003        | -          | -         | 1.104 |
| q_proj.V         | 0.003        | -          | -         | 1.102 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.000        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.326 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.001        | -          | -         | 1.349 |
| gate_proj.V      | 0.001        | -          | -         | 1.103 |
| up_proj.U        | 0.000        | -          | -         | 0.169 |
| gate_proj.U      | 0.000        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.916 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 14/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3108.787     | -          | -         | 1.436 |
| v_proj.V         | 3119.034     | -          | -         | 1.111 |
| q_proj.V         | 3114.527     | -          | -         | 1.114 |
| k_proj.U         | 0.833        | -          | -         | 0.483 |
| v_proj.U         | 0.522        | -          | -         | 0.266 |
| q_proj.U         | 2.548        | -          | -         | 0.269 |
| o_proj.V         | 334.951      | -          | -         | 1.401 |
| o_proj.U         | 0.822        | -          | -         | 0.517 |
| up_proj.V        | 1564.684     | -          | -         | 1.428 |
| gate_proj.V      | 1566.998     | -          | -         | 1.109 |
| up_proj.U        | 2.583        | -          | -         | 0.638 |
| gate_proj.U      | 2.708        | -          | -         | 0.403 |
| down_proj.V      | 21.739       | -          | -         | 5.024 |
| down_proj.U      | 0.061        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.369 |
| v_proj.V         | 0.003        | -          | -         | 1.104 |
| q_proj.V         | 0.003        | -          | -         | 1.106 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.329 |
| o_proj.U         | 0.000        | -          | -         | 0.185 |
| up_proj.V        | 0.001        | -          | -         | 1.356 |
| gate_proj.V      | 0.001        | -          | -         | 1.107 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.916 |
| down_proj.U      | 0.000        | -          | -         | 0.187 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 15/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3540.377     | -          | -         | 1.434 |
| v_proj.V         | 3552.540     | -          | -         | 1.117 |
| q_proj.V         | 3549.879     | -          | -         | 1.117 |
| k_proj.U         | 0.831        | -          | -         | 0.483 |
| v_proj.U         | 0.529        | -          | -         | 0.269 |
| q_proj.U         | 2.522        | -          | -         | 0.269 |
| o_proj.V         | 428.441      | -          | -         | 1.409 |
| o_proj.U         | 0.874        | -          | -         | 0.518 |
| up_proj.V        | 1982.100     | -          | -         | 1.427 |
| gate_proj.V      | 1983.736     | -          | -         | 1.113 |
| up_proj.U        | 3.190        | -          | -         | 0.639 |
| gate_proj.U      | 3.465        | -          | -         | 0.402 |
| down_proj.V      | 51.565       | -          | -         | 5.050 |
| down_proj.U      | 0.121        | -          | -         | 0.657 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.369 |
| v_proj.V         | 0.003        | -          | -         | 1.103 |
| q_proj.V         | 0.003        | -          | -         | 1.102 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.002        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.323 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.355 |
| gate_proj.V      | 0.001        | -          | -         | 1.106 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.924 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 16/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3613.386     | -          | -         | 1.436 |
| v_proj.V         | 3642.441     | -          | -         | 1.112 |
| q_proj.V         | 3644.055     | -          | -         | 1.114 |
| k_proj.U         | 1.200        | -          | -         | 0.483 |
| v_proj.U         | 0.625        | -          | -         | 0.267 |
| q_proj.U         | 2.979        | -          | -         | 0.270 |
| o_proj.V         | 339.975      | -          | -         | 1.401 |
| o_proj.U         | 0.601        | -          | -         | 0.517 |
| up_proj.V        | 2461.454     | -          | -         | 1.430 |
| gate_proj.V      | 2467.816     | -          | -         | 1.112 |
| up_proj.U        | 3.864        | -          | -         | 0.639 |
| gate_proj.U      | 4.201        | -          | -         | 0.403 |
| down_proj.V      | 45.986       | -          | -         | 5.019 |
| down_proj.U      | 0.129        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.372 |
| v_proj.V         | 0.004        | -          | -         | 1.103 |
| q_proj.V         | 0.004        | -          | -         | 1.106 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.004 |
| o_proj.V         | 0.000        | -          | -         | 1.329 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.355 |
| gate_proj.V      | 0.002        | -          | -         | 1.107 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.928 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 17/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3956.085     | -          | -         | 1.437 |
| v_proj.V         | 3948.592     | -          | -         | 1.115 |
| q_proj.V         | 3957.739     | -          | -         | 1.111 |
| k_proj.U         | 1.120        | -          | -         | 0.483 |
| v_proj.U         | 0.647        | -          | -         | 0.269 |
| q_proj.U         | 3.136        | -          | -         | 0.269 |
| o_proj.V         | 388.792      | -          | -         | 1.405 |
| o_proj.U         | 0.861        | -          | -         | 0.518 |
| up_proj.V        | 2759.348     | -          | -         | 1.427 |
| gate_proj.V      | 2757.387     | -          | -         | 1.117 |
| up_proj.U        | 4.124        | -          | -         | 0.638 |
| gate_proj.U      | 4.425        | -          | -         | 0.404 |
| down_proj.V      | 49.571       | -          | -         | 5.044 |
| down_proj.U      | 0.133        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.370 |
| v_proj.V         | 0.004        | -          | -         | 1.107 |
| q_proj.V         | 0.004        | -          | -         | 1.109 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.328 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.353 |
| gate_proj.V      | 0.002        | -          | -         | 1.108 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.925 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 18/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 3997.283     | -          | -         | 1.436 |
| v_proj.V         | 4001.718     | -          | -         | 1.116 |
| q_proj.V         | 3985.416     | -          | -         | 1.116 |
| k_proj.U         | 1.043        | -          | -         | 0.485 |
| v_proj.U         | 0.686        | -          | -         | 0.269 |
| q_proj.U         | 3.293        | -          | -         | 0.270 |
| o_proj.V         | 312.877      | -          | -         | 1.408 |
| o_proj.U         | 0.563        | -          | -         | 0.519 |
| up_proj.V        | 3004.481     | -          | -         | 1.422 |
| gate_proj.V      | 3009.130     | -          | -         | 1.106 |
| up_proj.U        | 4.446        | -          | -         | 0.635 |
| gate_proj.U      | 4.794        | -          | -         | 0.400 |
| down_proj.V      | 58.685       | -          | -         | 5.026 |
| down_proj.U      | 0.157        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.368 |
| v_proj.V         | 0.004        | -          | -         | 1.105 |
| q_proj.V         | 0.004        | -          | -         | 1.108 |
| k_proj.U         | 0.000        | -          | -         | 0.154 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.000        | -          | -         | 1.330 |
| o_proj.U         | 0.000        | -          | -         | 0.187 |
| up_proj.V        | 0.002        | -          | -         | 1.353 |
| gate_proj.V      | 0.002        | -          | -         | 1.103 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.929 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 19/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4188.492     | -          | -         | 1.436 |
| v_proj.V         | 4196.216     | -          | -         | 1.112 |
| q_proj.V         | 4192.414     | -          | -         | 1.115 |
| k_proj.U         | 1.138        | -          | -         | 0.482 |
| v_proj.U         | 0.758        | -          | -         | 0.269 |
| q_proj.U         | 3.360        | -          | -         | 0.269 |
| o_proj.V         | 346.567      | -          | -         | 1.398 |
| o_proj.U         | 0.583        | -          | -         | 0.519 |
| up_proj.V        | 3320.643     | -          | -         | 1.435 |
| gate_proj.V      | 3319.695     | -          | -         | 1.112 |
| up_proj.U        | 4.788        | -          | -         | 0.638 |
| gate_proj.U      | 5.144        | -          | -         | 0.404 |
| down_proj.V      | 61.209       | -          | -         | 5.050 |
| down_proj.U      | 0.164        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.378 |
| v_proj.V         | 0.004        | -          | -         | 1.105 |
| q_proj.V         | 0.004        | -          | -         | 1.106 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.001        | -          | -         | 1.331 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.354 |
| gate_proj.V      | 0.002        | -          | -         | 1.108 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.956 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 20/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4174.632     | -          | -         | 1.436 |
| v_proj.V         | 4174.313     | -          | -         | 1.121 |
| q_proj.V         | 4160.391     | -          | -         | 1.116 |
| k_proj.U         | 1.205        | -          | -         | 0.483 |
| v_proj.U         | 0.784        | -          | -         | 0.270 |
| q_proj.U         | 3.630        | -          | -         | 0.270 |
| o_proj.V         | 425.692      | -          | -         | 1.408 |
| o_proj.U         | 0.622        | -          | -         | 0.518 |
| up_proj.V        | 3601.458     | -          | -         | 1.423 |
| gate_proj.V      | 3595.391     | -          | -         | 1.113 |
| up_proj.U        | 5.049        | -          | -         | 0.639 |
| gate_proj.U      | 5.301        | -          | -         | 0.406 |
| down_proj.V      | 78.386       | -          | -         | 5.043 |
| down_proj.U      | 0.180        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.372 |
| v_proj.V         | 0.004        | -          | -         | 1.106 |
| q_proj.V         | 0.004        | -          | -         | 1.105 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.001        | -          | -         | 1.326 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.357 |
| gate_proj.V      | 0.002        | -          | -         | 1.109 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.001        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.931 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 21/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4529.332     | -          | -         | 1.437 |
| v_proj.V         | 4533.412     | -          | -         | 1.109 |
| q_proj.V         | 4537.208     | -          | -         | 1.107 |
| k_proj.U         | 1.223        | -          | -         | 0.482 |
| v_proj.U         | 0.813        | -          | -         | 0.269 |
| q_proj.U         | 3.824        | -          | -         | 0.269 |
| o_proj.V         | 414.566      | -          | -         | 1.401 |
| o_proj.U         | 0.600        | -          | -         | 0.518 |
| up_proj.V        | 3711.550     | -          | -         | 1.434 |
| gate_proj.V      | 3712.151     | -          | -         | 1.116 |
| up_proj.U        | 5.390        | -          | -         | 0.639 |
| gate_proj.U      | 5.747        | -          | -         | 0.404 |
| down_proj.V      | 65.683       | -          | -         | 5.021 |
| down_proj.U      | 0.162        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.004        | -          | -         | 1.367 |
| v_proj.V         | 0.004        | -          | -         | 1.107 |
| q_proj.V         | 0.004        | -          | -         | 1.108 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.001        | -          | -         | 1.329 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.356 |
| gate_proj.V      | 0.002        | -          | -         | 1.106 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.926 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 22/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 4510.831     | -          | -         | 1.437 |
| v_proj.V         | 4515.819     | -          | -         | 1.113 |
| q_proj.V         | 4522.290     | -          | -         | 1.113 |
| k_proj.U         | 1.273        | -          | -         | 0.484 |
| v_proj.U         | 0.809        | -          | -         | 0.270 |
| q_proj.U         | 3.779        | -          | -         | 0.270 |
| o_proj.V         | 512.534      | -          | -         | 1.402 |
| o_proj.U         | 0.973        | -          | -         | 0.518 |
| up_proj.V        | 3794.051     | -          | -         | 1.430 |
| gate_proj.V      | 3800.402     | -          | -         | 1.114 |
| up_proj.U        | 5.710        | -          | -         | 0.639 |
| gate_proj.U      | 6.020        | -          | -         | 0.404 |
| down_proj.V      | 80.717       | -          | -         | 5.044 |
| down_proj.U      | 0.186        | -          | -         | 0.656 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.372 |
| v_proj.V         | 0.005        | -          | -         | 1.108 |
| q_proj.V         | 0.005        | -          | -         | 1.126 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.004 |
| o_proj.V         | 0.001        | -          | -         | 1.329 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.002        | -          | -         | 1.354 |
| gate_proj.V      | 0.003        | -          | -         | 1.105 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.933 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 23/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5183.763     | -          | -         | 1.435 |
| v_proj.V         | 5160.578     | -          | -         | 1.116 |
| q_proj.V         | 5170.664     | -          | -         | 1.116 |
| k_proj.U         | 1.482        | -          | -         | 0.483 |
| v_proj.U         | 1.019        | -          | -         | 0.268 |
| q_proj.U         | 4.343        | -          | -         | 0.270 |
| o_proj.V         | 535.111      | -          | -         | 1.402 |
| o_proj.U         | 1.021        | -          | -         | 0.518 |
| up_proj.V        | 4199.073     | -          | -         | 1.433 |
| gate_proj.V      | 4170.055     | -          | -         | 1.116 |
| up_proj.U        | 6.496        | -          | -         | 0.639 |
| gate_proj.U      | 6.778        | -          | -         | 0.404 |
| down_proj.V      | 71.056       | -          | -         | 5.021 |
| down_proj.U      | 0.173        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.370 |
| v_proj.V         | 0.005        | -          | -         | 1.107 |
| q_proj.V         | 0.005        | -          | -         | 1.105 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.001        | -          | -         | 1.331 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.003        | -          | -         | 1.354 |
| gate_proj.V      | 0.003        | -          | -         | 1.107 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.952 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 24/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5171.435     | -          | -         | 1.435 |
| v_proj.V         | 5151.308     | -          | -         | 1.112 |
| q_proj.V         | 5153.324     | -          | -         | 1.114 |
| k_proj.U         | 1.344        | -          | -         | 0.485 |
| v_proj.U         | 1.032        | -          | -         | 0.269 |
| q_proj.U         | 4.287        | -          | -         | 0.268 |
| o_proj.V         | 712.377      | -          | -         | 1.409 |
| o_proj.U         | 1.103        | -          | -         | 0.518 |
| up_proj.V        | 4507.618     | -          | -         | 1.428 |
| gate_proj.V      | 4497.966     | -          | -         | 1.119 |
| up_proj.U        | 6.825        | -          | -         | 0.641 |
| gate_proj.U      | 7.113        | -          | -         | 0.404 |
| down_proj.V      | 88.495       | -          | -         | 5.044 |
| down_proj.U      | 0.209        | -          | -         | 0.659 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.372 |
| v_proj.V         | 0.005        | -          | -         | 1.109 |
| q_proj.V         | 0.005        | -          | -         | 1.107 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.002        | -          | -         | 0.003 |
| o_proj.V         | 0.001        | -          | -         | 1.328 |
| o_proj.U         | 0.000        | -          | -         | 0.187 |
| up_proj.V        | 0.003        | -          | -         | 1.360 |
| gate_proj.V      | 0.003        | -          | -         | 1.110 |
| up_proj.U        | 0.001        | -          | -         | 0.169 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.927 |
| down_proj.U      | 0.000        | -          | -         | 0.189 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 25/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5494.309     | -          | -         | 1.432 |
| v_proj.V         | 5462.489     | -          | -         | 1.117 |
| q_proj.V         | 5484.474     | -          | -         | 1.138 |
| k_proj.U         | 1.456        | -          | -         | 0.496 |
| v_proj.U         | 1.006        | -          | -         | 0.268 |
| q_proj.U         | 4.298        | -          | -         | 0.270 |
| o_proj.V         | 977.039      | -          | -         | 1.403 |
| o_proj.U         | 1.411        | -          | -         | 0.518 |
| up_proj.V        | 4740.783     | -          | -         | 1.433 |
| gate_proj.V      | 4747.133     | -          | -         | 1.117 |
| up_proj.U        | 6.836        | -          | -         | 0.641 |
| gate_proj.U      | 7.095        | -          | -         | 0.405 |
| down_proj.V      | 91.285       | -          | -         | 5.043 |
| down_proj.U      | 0.219        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.370 |
| v_proj.V         | 0.005        | -          | -         | 1.103 |
| q_proj.V         | 0.005        | -          | -         | 1.107 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.001        | -          | -         | 1.326 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.003        | -          | -         | 1.352 |
| gate_proj.V      | 0.003        | -          | -         | 1.109 |
| up_proj.U        | 0.001        | -          | -         | 0.170 |
| gate_proj.U      | 0.002        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.923 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 26/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5561.206     | -          | -         | 1.435 |
| v_proj.V         | 5566.676     | -          | -         | 1.113 |
| q_proj.V         | 5573.531     | -          | -         | 1.110 |
| k_proj.U         | 1.334        | -          | -         | 0.483 |
| v_proj.U         | 1.046        | -          | -         | 0.267 |
| q_proj.U         | 4.164        | -          | -         | 0.269 |
| o_proj.V         | 1095.682     | -          | -         | 1.401 |
| o_proj.U         | 1.663        | -          | -         | 0.518 |
| up_proj.V        | 4795.858     | -          | -         | 1.430 |
| gate_proj.V      | 4784.691     | -          | -         | 1.116 |
| up_proj.U        | 6.922        | -          | -         | 0.638 |
| gate_proj.U      | 7.063        | -          | -         | 0.404 |
| down_proj.V      | 127.432      | -          | -         | 5.042 |
| down_proj.U      | 0.248        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.370 |
| v_proj.V         | 0.005        | -          | -         | 1.107 |
| q_proj.V         | 0.005        | -          | -         | 1.103 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.002        | -          | -         | 0.003 |
| o_proj.V         | 0.002        | -          | -         | 1.330 |
| o_proj.U         | 0.000        | -          | -         | 0.187 |
| up_proj.V        | 0.003        | -          | -         | 1.351 |
| gate_proj.V      | 0.003        | -          | -         | 1.109 |
| up_proj.U        | 0.002        | -          | -         | 0.169 |
| gate_proj.U      | 0.003        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.927 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 27/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5294.639     | -          | -         | 1.434 |
| v_proj.V         | 5275.446     | -          | -         | 1.110 |
| q_proj.V         | 5269.188     | -          | -         | 1.115 |
| k_proj.U         | 1.478        | -          | -         | 0.483 |
| v_proj.U         | 0.932        | -          | -         | 0.268 |
| q_proj.U         | 4.125        | -          | -         | 0.270 |
| o_proj.V         | 968.621      | -          | -         | 1.406 |
| o_proj.U         | 1.509        | -          | -         | 0.517 |
| up_proj.V        | 5079.370     | -          | -         | 1.428 |
| gate_proj.V      | 5072.116     | -          | -         | 1.110 |
| up_proj.U        | 6.911        | -          | -         | 0.638 |
| gate_proj.U      | 7.094        | -          | -         | 0.401 |
| down_proj.V      | 132.183      | -          | -         | 5.024 |
| down_proj.U      | 0.281        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.373 |
| v_proj.V         | 0.006        | -          | -         | 1.105 |
| q_proj.V         | 0.006        | -          | -         | 1.109 |
| k_proj.U         | 0.001        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.003 |
| q_proj.U         | 0.001        | -          | -         | 0.003 |
| o_proj.V         | 0.002        | -          | -         | 1.332 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.003        | -          | -         | 1.360 |
| gate_proj.V      | 0.003        | -          | -         | 1.107 |
| up_proj.U        | 0.003        | -          | -         | 0.169 |
| gate_proj.U      | 0.004        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.925 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 28/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5715.429     | -          | -         | 1.435 |
| v_proj.V         | 5705.815     | -          | -         | 1.113 |
| q_proj.V         | 5716.425     | -          | -         | 1.116 |
| k_proj.U         | 1.428        | -          | -         | 0.485 |
| v_proj.U         | 0.978        | -          | -         | 0.268 |
| q_proj.U         | 4.199        | -          | -         | 0.269 |
| o_proj.V         | 1244.063     | -          | -         | 1.403 |
| o_proj.U         | 1.920        | -          | -         | 0.519 |
| up_proj.V        | 5213.918     | -          | -         | 1.427 |
| gate_proj.V      | 5196.497     | -          | -         | 1.115 |
| up_proj.U        | 6.965        | -          | -         | 0.641 |
| gate_proj.U      | 7.188        | -          | -         | 0.403 |
| down_proj.V      | 159.561      | -          | -         | 5.044 |
| down_proj.U      | 0.337        | -          | -         | 0.659 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.372 |
| v_proj.V         | 0.005        | -          | -         | 1.104 |
| q_proj.V         | 0.006        | -          | -         | 1.110 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.002        | -          | -         | 0.003 |
| o_proj.V         | 0.002        | -          | -         | 1.328 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.004        | -          | -         | 1.355 |
| gate_proj.V      | 0.004        | -          | -         | 1.111 |
| up_proj.U        | 0.004        | -          | -         | 0.169 |
| gate_proj.U      | 0.005        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.934 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 29/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5419.936     | -          | -         | 1.437 |
| v_proj.V         | 5439.347     | -          | -         | 1.115 |
| q_proj.V         | 5439.148     | -          | -         | 1.112 |
| k_proj.U         | 1.312        | -          | -         | 0.484 |
| v_proj.U         | 0.828        | -          | -         | 0.268 |
| q_proj.U         | 3.759        | -          | -         | 0.270 |
| o_proj.V         | 1176.401     | -          | -         | 1.402 |
| o_proj.U         | 1.792        | -          | -         | 0.519 |
| up_proj.V        | 5826.509     | -          | -         | 1.430 |
| gate_proj.V      | 5825.389     | -          | -         | 1.116 |
| up_proj.U        | 7.798        | -          | -         | 0.640 |
| gate_proj.U      | 8.060        | -          | -         | 0.405 |
| down_proj.V      | 203.041      | -          | -         | 5.040 |
| down_proj.U      | 0.512        | -          | -         | 0.660 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.005        | -          | -         | 1.370 |
| v_proj.V         | 0.005        | -          | -         | 1.107 |
| q_proj.V         | 0.005        | -          | -         | 1.108 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.001        | -          | -         | 0.003 |
| q_proj.U         | 0.005        | -          | -         | 0.004 |
| o_proj.V         | 0.003        | -          | -         | 1.330 |
| o_proj.U         | 0.000        | -          | -         | 0.187 |
| up_proj.V        | 0.004        | -          | -         | 1.355 |
| gate_proj.V      | 0.004        | -          | -         | 1.106 |
| up_proj.U        | 0.005        | -          | -         | 0.169 |
| gate_proj.U      | 0.006        | -          | -         | 0.004 |
| down_proj.V      | 0.000        | -          | -         | 4.928 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 30/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5370.584     | -          | -         | 1.432 |
| v_proj.V         | 5397.923     | -          | -         | 1.112 |
| q_proj.V         | 5390.362     | -          | -         | 1.111 |
| k_proj.U         | 1.373        | -          | -         | 0.482 |
| v_proj.U         | 0.896        | -          | -         | 0.270 |
| q_proj.U         | 4.149        | -          | -         | 0.267 |
| o_proj.V         | 2196.218     | -          | -         | 1.404 |
| o_proj.U         | 2.612        | -          | -         | 0.518 |
| up_proj.V        | 6239.186     | -          | -         | 1.433 |
| gate_proj.V      | 6230.910     | -          | -         | 1.118 |
| up_proj.U        | 8.676        | -          | -         | 0.639 |
| gate_proj.U      | 8.987        | -          | -         | 0.404 |
| down_proj.V      | 383.159      | -          | -         | 5.042 |
| down_proj.U      | 0.752        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.006        | -          | -         | 1.372 |
| v_proj.V         | 0.006        | -          | -         | 1.106 |
| q_proj.V         | 0.006        | -          | -         | 1.103 |
| k_proj.U         | 0.000        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.004        | -          | -         | 0.004 |
| o_proj.V         | 0.002        | -          | -         | 1.329 |
| o_proj.U         | 0.000        | -          | -         | 0.186 |
| up_proj.V        | 0.004        | -          | -         | 1.358 |
| gate_proj.V      | 0.004        | -          | -         | 1.108 |
| up_proj.U        | 0.004        | -          | -         | 0.169 |
| gate_proj.U      | 0.005        | -          | -         | 0.004 |
| down_proj.V      | 0.001        | -          | -         | 4.930 |
| down_proj.U      | 0.000        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 31/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 5637.177     | -          | -         | 1.441 |
| v_proj.V         | 5681.985     | -          | -         | 1.116 |
| q_proj.V         | 5652.230     | -          | -         | 1.117 |
| k_proj.U         | 1.496        | -          | -         | 0.484 |
| v_proj.U         | 1.010        | -          | -         | 0.269 |
| q_proj.U         | 4.271        | -          | -         | 0.269 |
| o_proj.V         | 1984.172     | -          | -         | 1.405 |
| o_proj.U         | 2.872        | -          | -         | 0.518 |
| up_proj.V        | 6291.285     | -          | -         | 1.432 |
| gate_proj.V      | 6303.011     | -          | -         | 1.116 |
| up_proj.U        | 8.858        | -          | -         | 0.639 |
| gate_proj.U      | 9.340        | -          | -         | 0.404 |
| down_proj.V      | 727.483      | -          | -         | 5.035 |
| down_proj.U      | 1.169        | -          | -         | 0.657 |
+------------------+--------------+------------+-----------+-------+


Quantizing 8bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 0.003        | -          | -         | 1.372 |
| v_proj.V         | 0.003        | -          | -         | 1.104 |
| q_proj.V         | 0.003        | -          | -         | 1.111 |
| k_proj.U         | 0.001        | -          | -         | 0.153 |
| v_proj.U         | 0.000        | -          | -         | 0.004 |
| q_proj.U         | 0.008        | -          | -         | 0.003 |
| o_proj.V         | 0.003        | -          | -         | 1.331 |
| o_proj.U         | 0.001        | -          | -         | 0.186 |
| up_proj.V        | 0.003        | -          | -         | 1.355 |
| gate_proj.V      | 0.003        | -          | -         | 1.108 |
| up_proj.U        | 0.005        | -          | -         | 0.169 |
| gate_proj.U      | 0.009        | -          | -         | 0.004 |
| down_proj.V      | 0.007        | -          | -         | 4.932 |
| down_proj.U      | 0.003        | -          | -         | 0.188 |
+------------------+--------------+------------+-----------+-------+


Quantizing 2bit 32/32..
+------------------+--------------+------------+-----------+-------+
|       name       | weight_error | fp_inp_SNR | q_inp_SNR | time  |
+==================+==============+============+===========+=======+
| k_proj.V         | 2945.612     | -          | -         | 1.434 |
| v_proj.V         | 2985.145     | -          | -         | 1.108 |
| q_proj.V         | 2979.638     | -          | -         | 1.115 |
| k_proj.U         | 0.658        | -          | -         | 0.485 |
| v_proj.U         | 0.487        | -          | -         | 0.269 |
| q_proj.U         | 2.251        | -          | -         | 0.269 |
| o_proj.V         | 1256.279     | -          | -         | 1.406 |
| o_proj.U         | 1.554        | -          | -         | 0.517 |
| up_proj.V        | 4595.001     | -          | -         | 1.433 |
| gate_proj.V      | 4606.920     | -          | -         | 1.117 |
| up_proj.U        | 5.912        | -          | -         | 0.639 |
| gate_proj.U      | 6.338        | -          | -         | 0.405 |
| down_proj.V      | 775.967      | -          | -         | 5.042 |
| down_proj.U      | 0.963        | -          | -         | 0.658 |
+------------------+--------------+------------+-----------+-------+


3091.066708087921
