KL Divergence Comparison for BreakHis (averaged over 1 run):
+---------------------------+---------------------+-----------------------+
| Method                    | Average KL (Prob)   | Average KL (Argmax)   |
+===========================+=====================+=======================+
| Original                  | 1.00e-01 ± 0.00e+00 | 1.83e-01 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+
| Arch Mod                  | 9.92e-02 ± 0.00e+00 | 1.52e-01 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+
| Replace Mean              | 2.30e-01 ± 0.00e+00 | 4.30e-01 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+
| PatchCutout-trained Model | 1.78e-02 ± 0.00e+00 | 2.28e-02 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+
| Temperature Scaling       | 7.11e-02 ± 0.00e+00 | 1.83e-01 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+
| Platt Scaling             | 1.17e-01 ± 0.00e+00 | 1.49e-01 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+
| MCal_CE (Cross-Entropy)   | 7.67e-05 ± 0.00e+00 | 2.70e-03 ± 0.00e+00   |
+---------------------------+---------------------+-----------------------+