+-------------+--------+----------+-------------+----------+----------------+
|   Metric    | Random | Semantic | Tokens Only | Base LLM | Fine-Tuned LLM |
+-------------+--------+----------+-------------+----------+----------------+
|   Accuracy  | 50.0%  |  69.0%   |    64.8%    |    -     |       -        |
|  Precision  | 37.9%  |  67.7%   |    53.1%    |    -     |       -        |
|    Recall   | 50.0%  |  35.1%   |    60.4%    |    -     |       -        |
|   F1 Score  | 43.1%  |  46.2%   |    56.5%    |    -     |       -        |
| Specificity | 50.0%  |  89.8%   |    67.5%    |    -     |       -        |
|     NPV     | 62.1%  |  69.4%   |    73.6%    |    -     |       -        |
+-------------+--------+----------+-------------+----------+----------------+
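
The table does not say how its rows were computed. As a minimal sketch, assuming they follow the standard binary-classification definitions, the six metrics can be derived from confusion-matrix counts as below. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the source.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the six table metrics from binary confusion-matrix counts.

    Assumes the standard definitions; the source does not confirm
    which formulas were actually used.
    """
    def ratio(num: float, den: float) -> float:
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    precision = ratio(tp, tp + fp)   # share of predicted positives that are correct
    recall = ratio(tp, tp + fn)      # share of actual positives that are found
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),  # true-negative rate
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical counts, for illustration only (not the study's data).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name:>11}: {value:.1%}")
```

Under these definitions F1 is the harmonic mean of precision and recall, which matches the reported columns: for example, the Semantic column's 67.7% precision and 35.1% recall give 2 * 0.677 * 0.351 / (0.677 + 0.351), or about 46.2%.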