Abstract: With the advancement of cyber technology, proactive security methods such as adversary emulation and the use of Cyber Threat Intelligence (CTI) have become increasingly essential. Some existing methods automatically map unstructured CTI text to attack techniques, which can facilitate proactive security. However, these methods do not consider the semantic relationships between CTI and attack techniques at different abstraction levels, which leads to poor classification performance. In this work, we propose a Hierarchy-aware method for Mapping of CTI to Attack Techniques (HMCAT). Specifically, HMCAT first extracts Indicator of Compromise (IOC) entities from the CTI in two steps, then projects the CTI with its IOC entities and the corresponding attack technique into a joint embedding space. Finally, HMCAT captures the semantic relationships among text descriptions, coarse-grained techniques, fine-grained techniques, and unrelated techniques through a hierarchy-aware mapping loss. We also propose a data augmentation technique based on in-context learning to address the long-tailed distribution in the Adversarial Tactics, Techniques and Common Knowledge (ATT&CK) datasets, which can further improve mapping performance. Experimental results demonstrate that HMCAT significantly outperforms previous machine learning (ML) and deep learning (DL) methods, improving precision, recall, and F-measure by 6.6%, 13.9%, and 9.9%, respectively.