Abstract: This paper presents Deepsy, a natural language-based synthesizer that assists source code analysis. It takes English descriptions of the code patterns to be found as input and automatically produces ASTMatcher expressions that LLVM/Clang can use directly to carry out the intended code analysis. The code analysis domain involves a large and complex set of data types and operations, which puts it beyond the reach of prior rule-based synthesizers. Machine learning-based solutions are not applicable either, because well-labeled examples are scarce. This paper shows how Deepsy addresses these challenges by leveraging deep Natural Language Processing (NLP) and introducing a new technique named dependency tree-based co-evolvement. Deepsy's design integrates natural language dependency analysis into code analysis and combines it with type-based narrowing and domain-specific guidance. Deepsy achieves over 70.0% expression-level accuracy and 85.1% individual API-level accuracy, significantly outperforming previous solutions.