Automated Benchmarking of LLMs: Applying Regression to Estimate LLM Accuracy
Keywords: LLM Benchmarking, Precision Estimate, Regression
TL;DR: A novel framework for estimating an LLM's performance on subjective tasks by leveraging LLM-based evaluators and limited human annotations.
Confirmation Of Submission Requirements: I submit a previously published paper. It was published in an archival peer-reviewed venue on or after September 8th, 2024; I specify the DOI in the field below, and I submit the camera-ready version of the paper.
Submission Number: 242