An Empirical Comparison of Pre-Trained Models of Source Code

Published: 2023, Last Modified: 11 Feb 2026, ICSE 2023, CC BY-SA 4.0
Abstract: While a large number of pre-trained models of source code have been successfully developed and applied to a variety of software engineering (SE) tasks in recent years, our understanding of these pre-trained models is arguably fairly limited. With the goal of advancing our understanding of these models, we perform the first systematic empirical comparison of 19 recently developed pre-trained models of source code on 13 SE tasks. To gain additional insights into these models, we adopt a recently developed 4-dimensional categorization of pre-trained models, and subsequently investigate whether there are correlations between different categories of pre-trained models and their performance on different SE tasks.