Function Names: Quantifying the Relationship Between Identifiers and Their Functionality to Improve Them

Charis Charitsis, Chris Piech, John C. Mitchell

Published: 01 Jan 2022, Last Modified: 24 Jun 2023, L@S 2022, Readers: Everyone

Abstract: When students first learn to program, they often focus on functionality: does a program work? In an era in which software volume and complexity increase exponentially, it is equally important that they learn to write programs with style so that they are readable and extendable. Writing quality code starts with the building blocks of any program: its functions. A carefully chosen name is vital for program maintainability and manageability. The identifier is the most portable and concise way to summarize what the function does. What makes for the right choice? And can we automatically assess the quality of function names? Using natural language processing, we created a probabilistic model to evaluate their clarity. Using functionality encodings, we attempt to learn the relationship between functions in different programs to improve their names. We analyzed a total of 5,400 programs tackling five novice programming tasks submitted by over 1,000 students in CS1. We developed a software system to automate labor-intensive tasks, detect poor function names, and recommend replacements. Our findings suggest that fewer than 2.5% of name substitutions have an adverse outcome, and in most cases, more than 50% result in an improvement.
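
To make the idea of automatically assessing name clarity concrete, here is a minimal Python sketch. It is not the authors' probabilistic model, which the abstract does not specify. It swaps in a much simpler heuristic: split an identifier into words and report the fraction that appear in a known vocabulary. The names `KNOWN_WORDS`, `split_identifier`, and `clarity_score`, as well as the tiny word list, are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy vocabulary (an assumption for this sketch). A real system
# would rely on a much larger word list or a trained language model.
KNOWN_WORDS = {"draw", "square", "move", "to", "wall", "get", "total", "count"}


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    # Matches acronyms (HTTP), capitalized or lowercase words, and digit runs.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def clarity_score(name, vocabulary=KNOWN_WORDS):
    """Return the fraction of words in `name` found in `vocabulary`.

    This ratio is a stand-in heuristic, not the model from the paper.
    """
    words = split_identifier(name)
    if not words:
        return 0.0
    known = sum(1 for word in words if word in vocabulary)
    return known / len(words)


if __name__ == "__main__":
    for candidate in ["drawSquare", "move_to_wall", "doStuff2", "drawX"]:
        print(candidate, clarity_score(candidate))
```

Under this heuristic, a name such as `drawSquare` scores higher than `doStuff2` because more of its words are recognized. A realistic scorer would also need to account for whether the words actually describe what the function does, which is the part the abstract addresses with functionality encodings.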
