Scores are reported as Accuracy / F1 ("-" marks benchmarks without a reported score).

| Model Grouping | Model Name | FPB | FiQA SA | TFNS | NWGI | NER | Headline |
|---|---|---|---|---|---|---|---|
| Financial Models | FinGPT† | 86.47 / 0.863 | 81.09 / 0.829 | 88.44 / 0.884 | 56.61 / 0.474 | - | 96.96 / 0.933 |
| | BloombergGPT‡ | 51.07 | 75.07 | - | - | 60.82 | 82.20 |
| Base Models | Llama 3.1 8B | 68.73 / 0.677 | 46.55 / 0.557 | 69.97 / 0.683 | 43.86 / 0.583 | 48.89 / 0.569 | 45.34 / 0.558 |
| | Llama 3.1 70B | 74.50 / 0.736 | 47.27 / 0.565 | 68.42 / 0.686 | 50.14 / 0.596 | 46.28 / 0.454 | 71.68 / 0.729 |
| | DeepSeek V3 | 78.76 / 0.764 | 60.43 / 0.686 | 84.38 / 0.846 | 7.44 / 0.097 | 40.82 / 0.360 | 76.06 / 0.779 |
| | GPT-4o | 81.13 / 0.818 | 72.34 / 0.773 | 73.32 / 0.740 | 66.61 / 0.656 | 52.11 / 0.523 | 80.53 / 0.814 |
| | Gemini 2.0 FL | 81.02 / 0.894 | 68.09 / 0.810 | 26.38 / 0.385 | 48.16 / 0.614 | 65.13 / 0.769 | 76.60 / 0.847 |
| Fine-tuned Models | Llama 3.1 8B LoRA | 85.64 / 0.922 | 81.28 / 0.884 | 88.02 / 0.932 | 54.16 / 0.690 | 98.05 / 0.981 | 84.66 / 0.852 |
| | Llama 3.1 8B QLoRA | 84.16 / 0.909 | 78.30 / 0.874 | 83.84 / 0.910 | 49.96 / 0.645 | 96.63 / 0.966 | 88.03 / 0.886 |
| | Llama 3.1 8B DoRA | 81.93 / 0.901 | 78.72 / 0.874 | 59.09 / 0.702 | 19.57 / 0.281 | 71.59 / 0.834 | 64.93 / 0.781 |
| | Llama 3.1 8B rsLoRA | 82.84 / 0.853 | 73.19 / 0.806 | 59.51 / 0.655 | 35.80 / 0.464 | 95.92 / 0.963 | 71.75 / 0.828 |
| | Gemini 2.0 FL | 87.62 / 0.878 | 88.09 / 0.879 | 89.49 / 0.896 | 62.59 / 0.581 | 97.29 / 0.973 | 97.32 / 0.973 |
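For reference, the Accuracy / F1 pairing in the table can be reproduced with the standard metric definitions. The exact F1 averaging convention used for each benchmark is not stated here, so this is only a minimal pure-Python sketch that assumes macro averaging over a toy three-class sentiment sample; weighted averaging is another common choice.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions matching the gold label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # Per-class F1, averaged with equal weight per class (macro averaging).
    labels = set(y_true) | set(y_pred)
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical gold/predicted sentiment labels, not drawn from any benchmark.
gold = ["pos", "neg", "neu", "pos", "neg"]
pred = ["pos", "neg", "pos", "pos", "neu"]
print(f"Accuracy: {accuracy(gold, pred):.2f}")  # → 0.60
print(f"Macro F1: {macro_f1(gold, pred):.2f}")  # → 0.49
```

Note that accuracy and macro F1 can diverge sharply on imbalanced label sets, which is one reason the table reports both.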