Abstract: This paper proposes a novel approach for identifying software bugs that builds on a meaningful combination of word embeddings, graph-based text representations and graph attention networks. Existing approaches aim to advance each of these components individually, without considering an integrative approach. As a result, they ignore information related either to the structure of a given text or to the individual words of the text. In contrast, our approach seamlessly incorporates both semantic and structural characteristics into a graph, which is then fed to a graph attention network in order to classify GitHub issues as bugs or features. Our experimental results demonstrate a significant improvement in terms of accuracy, precision and recall of the proposed approach compared to a range of classical and graph-based machine learning models. The dataset for the experiments reported in this paper has been retrieved from the kaggle.com platform and concerns GitHub issues with short-text attributes.
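To make the idea of a graph-based text representation concrete, the following is a minimal sketch of one common construction, a sliding-window co-occurrence graph over the words of an issue. The abstract does not state how the graph is built, so the function name `build_graph_of_words`, the `window` parameter, and the whitespace tokenization are illustrative assumptions rather than the method used in the paper. In a full pipeline, each node would additionally carry a word-embedding vector as its feature before the graph is passed to a graph attention network; that step is omitted here.

```python
from collections import defaultdict


def build_graph_of_words(text, window=2):
    """Build an undirected co-occurrence graph from a short text.

    Hypothetical helper for illustration only: nodes are unique lowercase
    tokens, and an edge links two distinct tokens that appear within
    `window` positions of each other. Edge weights count co-occurrences.
    """
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    nodes = sorted(set(tokens))
    edges = defaultdict(int)
    for i, source in enumerate(tokens):
        # Look ahead at most `window` tokens from position i.
        for target in tokens[i + 1 : i + 1 + window]:
            if source != target:
                # Sort the pair so (a, b) and (b, a) map to the same edge.
                edges[tuple(sorted((source, target)))] += 1
    return nodes, dict(edges)


nodes, edges = build_graph_of_words("app crashes when app starts")
print(nodes)
print(edges)
```

Structural information is captured by the edges (which words occur near each other), while semantic information would enter through the per-node embeddings, which is the combination the abstract describes.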