Word Embeddings for Comment Coherence

Alfonso Cimasa; Anna Corazza; Carmen Coviello; Giuseppe Scanniello

Published: 01 Jan 2019, Last Modified: 18 Apr 2024 | SEAA 2019 | CC BY-SA 4.0

Abstract: As software evolves, the information in comments can fall out of alignment with the associated source code, hampering software evolution and maintenance tasks. This kind of misalignment is known as lack of coherence, and it can arise for several reasons, e.g., programmers change the intent of source code during a maintenance task without updating its comment accordingly. We study the problem of detecting a lack of coherence between comments and source code by exploiting Word Embeddings (WEs). We present four WE-based models and test them with six different WE variants in an experiment conducted on a publicly available dataset. Results are compared against a baseline. The most important outcome is that the considered models and WE variants are more efficient in terms of execution time while maintaining performance very close to that of the baseline. The explanation for this improvement is that WEs are able to concentrate the important information in a more compact input representation.
