+-------------+--------+----------+-------------+----------+
|    Metric   | Random | Semantic | Tokens Only | Base LLM |
+-------------+--------+----------+-------------+----------+
|   Accuracy  | 50.0%  |  72.7%   |    70.9%    |  68.5%   |
|Balanced Acc.| 50.0%  |  68.9%   |    70.5%    |    -     |
|  Precision  | 40.0%  |  73.3%   |    62.5%    |    -     |
|    Recall   | 50.0%  |  50.0%   |    68.2%    |    -     |
|   F1 Score  | 44.4%  |  59.5%   |    65.2%    |    -     |
| Specificity | 50.0%  |  87.9%   |    72.7%    |    -     |
|     NPV     | 60.0%  |  72.5%   |    77.4%    |    -     |
+-------------+--------+----------+-------------+----------+
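
The rows above are the standard binary-classification metrics, all of which follow from the four confusion-matrix counts. The sketch below shows one way to compute them, assuming the "Balanced" row means balanced accuracy (the mean of recall and specificity). The function name binary_metrics and the example counts are hypothetical: the source does not report raw counts, and the values used here were picked only because they happen to round to the Semantic column.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the table's metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)        # true positive rate (sensitivity)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return {
        "Accuracy": (tp + tn) / (tp + fp + tn + fn),
        "Balanced accuracy": (recall + specificity) / 2,
        "Precision": precision,
        "Recall": recall,
        "F1 Score": 2 * precision * recall / (precision + recall),
        "Specificity": specificity,
        "NPV": npv,
    }


# Hypothetical counts (not taken from the source); they are merely
# consistent with the rounded percentages in the Semantic column.
metrics = binary_metrics(tp=11, fp=4, tn=29, fn=11)

for name, value in metrics.items():
    print(f"{name:>17}: {value:.1%}")
```

Balanced accuracy and F1 are derived from the other rows, so they can be used as a quick consistency check on any column: for Tokens Only, (68.2% + 72.7%) / 2 rounds to the reported 70.5%.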