Semantic Patch Embedding for Security Detection: A Fine-to-Coarse Grained Approach

Published: 19 Mar 2024, Last Modified: 19 Mar 2024
Venue: Tiny Papers @ ICLR 2024 Archive
Readers: Everyone
License: CC BY 4.0
Keywords: patch detection software PatchDB
Abstract: The growing use of open-source software increases the risk that concealed vulnerabilities propagate to downstream applications. The problem is compounded when vendors release security patches silently, without explicit notification: users remain unaware of the threat, which gives attackers a window of opportunity. Understanding patch semantics is therefore crucial for secure maintenance. We present MultiSEM, a multi-level semantic embedder for security patch detection. It combines fine-grained, word-centric vectors with coarse-grained code-line representations and integrates patch descriptions to form a comprehensive semantic view. MultiSEM outperforms state-of-the-art models in F1 score by 22.46% on PatchDB and 9.21% on SPI-DB.
Submission Number: 249
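
Illustrative sketch (not the authors' implementation): the abstract describes three views of a patch, namely word-level tokens, code lines, and the patch description. The toy Python below shows one way such views could be assembled into a single feature vector. Everything in it is an assumption made for illustration: the function names (`hashed_vector`, `tokenize`, `changed_lines`, `embed_patch`), the 16-dimensional width, and the use of feature hashing in place of learned embeddings. MultiSEM's actual architecture is not specified in the abstract.

```python
import hashlib
import re

DIM = 16  # toy embedding width, chosen arbitrarily for this sketch


def hashed_vector(tokens, dim=DIM):
    """Bag-of-tokens vector via feature hashing (a stand-in for a learned embedder)."""
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec


def tokenize(text):
    """Split text into identifiers, numbers, and single punctuation characters."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", text)


def changed_lines(diff_text):
    """Keep added/removed lines of a unified diff.

    Lines starting with '+++' or '---' are treated as file headers and skipped,
    which is a simplification that can also drop unusual content lines.
    """
    kept = []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            kept.append(line)
    return kept


def mean(vectors, dim=DIM):
    """Element-wise mean of equal-length vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def embed_patch(diff_text, description):
    """Concatenate fine-grained, coarse-grained, and description views."""
    lines = changed_lines(diff_text)
    # Fine-grained view: one vector over all word-level tokens in changed lines.
    fine = hashed_vector([t for ln in lines for t in tokenize(ln[1:])])
    # Coarse-grained view: one vector per changed line (the +/- sign is kept
    # as an extra token), averaged across lines.
    coarse = mean([hashed_vector([ln[0]] + tokenize(ln[1:])) for ln in lines])
    # Description view: tokens of the lower-cased patch description.
    desc = hashed_vector(tokenize(description.lower()))
    return fine + coarse + desc
```

Calling `embed_patch(diff_text, "Fix buffer overflow in copy routine")` on a unified diff returns a list of 48 floats (3 views × 16 dimensions) that a downstream classifier could consume. A real system would replace the hashing step with trained token and line encoders.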