Detecting and Fixing Nonidiomatic Snippets in Python Source Code with Deep Learning

Balázs Szalontai, András Vadász, Zsolt Richárd Borsi, Teréz A. Várkonyi, Balázs Pintér, Tibor Gregorics

2021 (modified: 20 Sept 2021), IntelliSys (1) 2021

Abstract: We present an algorithm to find nonidiomatic snippets in Python source code and replace them with cleaner and more performant alternatives. The algorithm is divided into three subtasks: (i) the snippets are localized; then, for each snippet, (ii) the type of the nonidiomatic pattern and (iii) its key variables are determined. The subtasks of localizing patterns and extracting variables are solved as sequence tagging tasks with LSTM networks. Determining the nonidiomatic pattern type is a classification problem, which we solve by training a feedforward neural network. We evaluate the process on a dataset containing more than 13,000 programs written by students.
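
To make the task concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not name specific patterns, so the index-based loop below is an assumed example of a nonidiomatic snippet, and the list comprehension is its assumed idiomatic replacement. The line-level B/I/O tags and the helper `extract_spans` are likewise hypothetical; they only show how the output of a sequence tagger could mark where a snippet starts and ends.

```python
def squares_nonidiomatic(numbers):
    # Assumed example of a nonidiomatic pattern:
    # index-based loop with manual append.
    result = []
    for i in range(len(numbers)):
        result.append(numbers[i] ** 2)
    return result


def squares_idiomatic(numbers):
    # Assumed idiomatic replacement: a list comprehension.
    return [n ** 2 for n in numbers]


def extract_spans(tags):
    """Turn B/I/O tags into (start, end) index pairs, end exclusive.

    Hypothetical helper: "B" opens a snippet, "I" continues it,
    "O" marks a position outside any snippet.
    """
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hypothetical tagger output for a five-line program, one tag per line.
program_lines = [
    "numbers = [1, 2, 3]",
    "result = []",
    "for i in range(len(numbers)):",
    "    result.append(numbers[i] ** 2)",
    "print(result)",
]
line_tags = ["O", "B", "I", "I", "O"]
snippet_spans = extract_spans(line_tags)
```

In this sketch, `snippet_spans` identifies lines 1 to 3 (zero-based, end exclusive at 4) as the localized snippet. In the terms of the abstract, a classifier would then assign the pattern type, and a second tagging step would pick out key variables such as `numbers` and `result` so that the replacement can be generated.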
