An Application of Program Slicing and CodeBERT to Distill Variables With Inappropriate Names

Published: 2024, Last Modified: 06 Mar 2025, SERA 2024, CC BY-SA 4.0
Abstract: Variables are essential for handling objects and data in a program, and their names can provide helpful clues for understanding the program. Well-chosen names enhance code readability, whereas ill-chosen names hinder comprehension of the program or cause misunderstanding. Although a variable's name is worthy of attention, it is challenging to judge automatically whether it is appropriate. This paper proposes a method that automates the assessment of variable names using program slicing and CodeBERT. Given a variable in a program, the proposed method extracts the program slice with respect to the variable and masks the variable's name. The method then tries to predict the masked part using CodeBERT and assesses the original name's adequacy by comparing the predicted names with the original name. A case study shows that the proposed method may detect ill-chosen names with high accuracy (higher than 0.9).
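The slice, mask, predict, and compare steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `naive_slice` is a crude stand-in for real program slicing (it only keeps lines that mention the variable), the `predict` callable is a hypothetical placeholder for a CodeBERT-based masked-token predictor, and the top-k membership rule used for the comparison is an assumption, since the abstract does not state how predicted and original names are compared.

```python
import re
from typing import Callable, List

MASK = "<mask>"  # mask token conventionally used by RoBERTa-style models


def naive_slice(source: str, var: str) -> List[str]:
    """Crude stand-in for program slicing (assumption): keep only the
    lines that mention `var` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(var)}\b")
    return [line for line in source.splitlines() if pattern.search(line)]


def mask_variable(lines: List[str], var: str) -> str:
    """Replace every whole-word occurrence of `var` with the mask token."""
    pattern = re.compile(rf"\b{re.escape(var)}\b")
    return "\n".join(pattern.sub(MASK, line) for line in lines)


def name_looks_adequate(
    source: str,
    var: str,
    predict: Callable[[str], List[str]],  # hypothetical model wrapper
    top_k: int = 5,
) -> bool:
    """Judge `var` adequate if it appears among the top-k predicted names.
    The top-k rule is an illustrative assumption, not the paper's criterion."""
    masked = mask_variable(naive_slice(source, var), var)
    candidates = predict(masked)[:top_k]
    return var in candidates


if __name__ == "__main__":
    code = "total = 0\nfor x in items:\n    total += x\nprint(total)"
    # Stub predictor standing in for a masked language model.
    stub = lambda masked_code: ["sum", "total", "count"]
    print(name_looks_adequate(code, "total", stub))
```

In a real setting, `predict` would wrap a masked language model and would need to handle slices that contain several mask positions; the stub above only keeps the sketch runnable without a model download.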