Abstract: This paper studies the use of code2vec vector embeddings of source code to improve automatic source code generation in Grammatical Evolution (GE). The research focuses on a single programming language, Java. Because the code2vec model represents each Java function as a continuous vector in a linear space, GE gains additional knowledge about similarities between the constructed functions in that space, in place of semantic similarities, which are harder to process. We propose several improvements to the standard GE algorithm, including a code2vec-based initialization of the evolutionary algorithm and a code2vec-based crossover operator. Computational experiments on a few typical benchmarks confirm the efficiency of the proposed approach.