Classifying Code Comments via Pre-trained Programming Language Model

Published: 01 Jan 2023, Last Modified: 16 Sept 2023, Venue: NLBSE@ICSE 2023, Readers: Everyone
Abstract: Previous studies have categorized code comments for various programming languages, with the goal of producing high-quality comments that improve code readability and ease maintenance. However, identifying the main information contained in code comments still requires considerable effort. Pre-trained language models have shown promising results on software engineering tasks. In this paper, we propose a model for code comment classification built on CodeT5, a recent pre-trained language model specialized for code-specific tasks. We introduce expert-predefined features to enhance the model's classification performance. Our evaluation on the official dataset shows that the model outperforms the baseline, improving the precision (+65.9%), recall (+147.3%), and F1-score (+112.5%) of the classification.
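
The reported gains can be read more easily with the metric definitions at hand. The following is a minimal sketch, not the paper's evaluation code: it computes precision, recall, and F1-score from confusion counts, and a relative improvement over a baseline score. It assumes the percentages in the abstract are relative improvements, which the abstract does not state explicitly. The function names and all numbers in the usage lines are hypothetical and are not results from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def relative_improvement(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent (assumed reading of '+X%')."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical confusion counts for one comment category (illustrative only).
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

    # Hypothetical scores: a baseline recall of 0.30 rising to 0.60.
    print(f"relative recall gain: {relative_improvement(0.60, 0.30):.1f}%")
```

Under this reading, a relative gain above 100% means the score more than doubled with respect to the baseline, which is consistent with the recall and F1-score figures exceeding 100%.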