From Parameters to Performance: Diving into LLM Development and Structure

ACL ARR 2025 February Submission2849 Authors

15 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Large language models (LLMs) have achieved remarkable success across various domains, driving significant technological advancements and innovations in applications. Despite the rapid growth in model scale and capability, systematic research on how structural configurations affect performance remains limited. To address this gap, we present a large-scale dataset encompassing different open-source LLM structures and their performance across multiple benchmarks. Furthermore, we provide a systematic analysis of the dataset from a data mining perspective. We begin by reviewing the historical development of LLMs and discussing potential future trends. We then investigate the impact of various structural configurations on performance across different benchmarks. Finally, we employ mechanistic interpretability techniques to validate the findings mined from the dataset. Our goal is to provide data-driven insights for optimizing LLMs and to offer valuable guidance for the targeted development and application of future models.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: Interpretability and Analysis of Models for NLP, Resources and Evaluation
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 2849