KL Divergence Comparison for MedQA (averaged over 1 run):
+----------------+---------------------+-----------------------+
| Method         | Average KL (Prob)   | Average KL (Argmax)   |
+================+=====================+=======================+
| Original       | 6.38e-01 ± 0.00e+00 | 1.40e+00 ± 0.00e+00   |
+----------------+---------------------+-----------------------+
| Token Dropping | 3.23e-01 ± 0.00e+00 | 9.73e-01 ± 0.00e+00   |
+----------------+---------------------+-----------------------+