STACC: Code Comment Classification using SentenceTransformers

A. Al-Kaswan, M. Izadi, A. van Deursen

Published: 01 Jan 2023, Last Modified: 02 Apr 2026
Venue: 2023 IEEE/ACM 2nd International Workshop on Natural Language-Based Software Engineering (NLBSE)
License: CC BY-SA 4.0
Abstract: Code comments are a key resource for information about software artefacts. Depending on the use case, only some types of comments are useful. Thus, automatic approaches to classify these comments have been proposed. In this work, we address this need by proposing STACC, a set of SentenceTransformers-based binary classifiers. These lightweight classifiers are trained and tested on the NLBSE Code Comment Classification tool competition dataset and surpass the baseline by a significant margin, achieving an average F1 score of 0.74 against the baseline of 0.31, which is an improvement of 139%. A replication package, as well as the models themselves, is publicly available.
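The 139% figure is a relative improvement over the baseline, not an absolute difference in F1. As a quick check of that arithmetic, the sketch below recomputes it from the two average F1 scores quoted in the abstract. The function name `relative_improvement` is a hypothetical helper introduced here for illustration and is not part of the STACC replication package.

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Return the relative gain of `score` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (score - baseline) / baseline * 100.0


# Average F1 scores as reported in the abstract.
stacc_f1 = 0.74
baseline_f1 = 0.31

gain = relative_improvement(stacc_f1, baseline_f1)
print(f"Relative improvement: {gain:.0f}%")  # prints "Relative improvement: 139%"
```

The absolute gap is 0.43 F1 points; dividing by the baseline of 0.31 gives roughly 1.39, which matches the reported 139%.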