Keywords: AI, Metrics, Risk, Responsible AI, Finance
Abstract: As Generative Artificial Intelligence (GenAI) spreads across the financial services
industry, measuring model performance remains a significant barrier to adoption and usage.
Historical machine learning metrics often fail to generalize to GenAI
workloads and are commonly supplemented with Subject Matter Expert (SME) evaluation.
Even with this combination, many projects fail to account for the distinct risks
inherent in choosing specific metrics. Additionally, many widespread benchmarks
created by foundational research labs and educational institutions fail to generalize
to industrial use. This paper explains these challenges and provides a Risk
Assessment Framework to support better application of SME and machine learning
metrics.
Submission Number: 92