Code Representation Based on Hybrid Graph Modelling

Published: 01 Jan 2021, Last Modified: 12 May 2023
ICONIP (5) 2021
Readers: Everyone
Abstract: Several sequence-based and abstract syntax tree (AST)-based models have been proposed for modelling the lexical-level and syntactic-level information of source code. However, an effective method for learning the semantic information of code is still lacking. We therefore propose HGCR, a novel code representation method based on hybrid graph modelling that serves as a code information extraction model. Specifically, in HGCR, two novel graphs, the Structure Graph (SG) and the Execution Data Flow Graph (EDFG), are first extracted from the AST to model the syntactic structural information and the semantic information of source code, respectively. Then, two improved graph neural networks are applied to these graphs to learn an effective code representation. We demonstrate the effectiveness of our model on two common code understanding tasks: code classification and code clone detection. Empirically, our model outperforms state-of-the-art models.
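The abstract does not define how SG and EDFG are constructed, so the following is only a rough illustration of the general idea of deriving two edge sets from one AST. It is a minimal sketch that assumes Python's standard `ast` module, uses parent-child edges as a stand-in for structural information, and uses definition-to-use edges between straight-line top-level statements as a stand-in for data flow. The function name `build_graphs` and both edge definitions are hypothetical and are not the paper's method.

```python
import ast

# Operator and context nodes are shared singleton objects in CPython's AST,
# so they are skipped to keep one graph node per syntactic occurrence.
_SHARED = (ast.expr_context, ast.operator, ast.boolop, ast.unaryop, ast.cmpop)


def build_graphs(source):
    """Return (labels, syntax_edges, flow_edges) for a piece of Python source.

    Hypothetical helper for illustration only; it handles straight-line
    top-level statements and ignores control flow entirely.
    """
    tree = ast.parse(source)
    nodes = [n for n in ast.walk(tree) if not isinstance(n, _SHARED)]
    index = {id(n): i for i, n in enumerate(nodes)}
    labels = [type(n).__name__ for n in nodes]

    # Structural stand-in: one edge per parent-child link in the AST.
    syntax_edges = [
        (index[id(parent)], index[id(child)])
        for parent in nodes
        for child in ast.iter_child_nodes(parent)
        if not isinstance(child, _SHARED)
    ]

    # Data-flow stand-in: link the statement that last assigned a variable
    # to each later statement that reads it.
    flow_edges = []
    last_def = {}
    for stmt in tree.body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        for name in names:  # reads happen before the statement's own writes
            if isinstance(name.ctx, ast.Load) and name.id in last_def:
                flow_edges.append((last_def[name.id], index[id(stmt)]))
        for name in names:
            if isinstance(name.ctx, ast.Store):
                last_def[name.id] = index[id(stmt)]
    return labels, syntax_edges, flow_edges


labels, syntax_edges, flow_edges = build_graphs("a = 1\nb = a + 2\nc = a * b\n")
```

In a setup like this, the node labels and the two edge lists could be passed to separate graph neural network encoders, which is the overall shape the abstract describes; the networks themselves are out of scope for this sketch.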