Emerging Trends in LLM Benchmarking

Akshar Prabhu Desai, Ritu Prajapati, Tejasvi Ravi, Mohammad Luqman, Pranjul Yadav

Published: 2024, Last Modified: 18 May 2025. Venue: IEEE Big Data 2024. License: CC BY-SA 4.0

Abstract: Traditionally, machine learning models focused on specialized tasks, which made evaluation straightforward. However, the evolution of Large Language Models (LLMs) has made performance measurement more complex. Evaluating and benchmarking LLMs is a significant challenge because of their versatility and their improved capability to perform a wide range of tasks. This manuscript examines the existing literature on various benchmarks and provides a comprehensive overview of the emerging trends in benchmarking methodology.