A Robust Objective Focused Algorithm to Detect Source Code Plagiarism

Published: 2022, Last Modified: 27 Feb 2026, UEMCON 2022, CC BY-SA 4.0
Abstract: Detecting source code plagiarism helps instructors assess students’ honest effort and filter out suspiciously similar submissions. An automated source code plagiarism detection tool reduces the effort of manually checking each program and identifying suspicious subgroups. This study proposes a robust, efficient, and easily implementable approach to measuring the similarity between two source programs. The solution first converts each program into a simple data representation based on the structure of the written code. It then performs string matching over the two programs so as to maximize the amount of matched content. A weighted scoring mechanism expresses the degree of similarity found between the two programs. Experimental analysis shows that the solution captures several types of plagiarism, including function reordering and variable renaming. Overall, it achieves good precision (82%) and recall (76%) compared to prominent baseline algorithms, and its memory and time usage is satisfactory and does not create a bottleneck.
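The abstract does not spell out the representation, the matching procedure, or the weights, so the following is only a minimal illustrative sketch of the general pipeline it describes (structural normalization, matching, weighted score), not the authors' algorithm. Every name here (`KEYWORDS`, `normalize_line`, `structure`, `similarity`) is hypothetical, and the choices of line-level matching and token-count weighting are assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a tiny keyword set for a C-like language; a real tool would use a full lexer.
KEYWORDS = {"int", "float", "void", "return", "if", "else", "for", "while"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize_line(line):
    """Reduce one line to its structure: identifiers become ID, numbers become NUM."""
    tokens = []
    for tok in TOKEN_RE.findall(line):
        if tok in KEYWORDS:
            tokens.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")
        elif tok[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return " ".join(tokens)


def structure(source):
    """Multiset of normalized, non-empty lines; line order is deliberately ignored."""
    normalized = (normalize_line(line) for line in source.splitlines())
    return Counter(n for n in normalized if n)


def weight(lines):
    """Assumed weighting: each line counts in proportion to its number of tokens."""
    return sum(len(line.split()) * count for line, count in lines.items())


def similarity(source_a, source_b):
    """Weighted share of matched structure, in the range 0.0 to 1.0."""
    a, b = structure(source_a), structure(source_b)
    total = weight(a) + weight(b)
    return 2 * weight(a & b) / total if total else 0.0


original = "int add(int a, int b) {\n  return a + b;\n}\nint sq(int x) {\n  return x * x;\n}\n"
# Same functions, reordered and with every identifier renamed.
disguised = "int square(int n) {\n  return n * n;\n}\nint sum(int p, int q) {\n  return p + q;\n}\n"
unrelated = "while (x) { x = x - 1; }\n"

print(similarity(original, disguised))
print(similarity(original, unrelated))
```

Because identifiers are collapsed and lines are compared as a multiset, this sketch is insensitive to the two disguises the abstract names, renaming and function reordering. It will also overestimate similarity for short programs that share common boilerplate lines, which is one reason a practical detector needs a more careful matching and weighting scheme.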