IKEA: Unsupervised domain-specific keyword-expansion

Published: 01 Jan 2022, Last Modified: 04 Mar 2025, Venue: ASONAM 2022, License: CC BY-SA 4.0
Abstract: How can we expand an initial set of keywords with a target domain in mind? A possible application is to use the expanded set of words to search for specific information within the domain of interest. Here, we focus on online forums, and specifically on security forums. We propose IKEA, an iterative embedding-based approach to expand a set of keywords with a domain in mind. The novelty of our approach is three-fold: (a) we use two similarity expansions, one in the word-word space and one in the post-post space, (b) we use an iterative approach in each of these expansions, and (c) we provide a flexible ranking of the identified words to meet user needs. We evaluate our method with data from three security forums that span five years of activity and with the widely used Fire benchmark. IKEA outperforms previous solutions by identifying more relevant keywords: it achieves more than 0.82 MAP and 0.85 NDCG across a wide range of initial keyword sets. We see our approach as an essential building block in developing methods for harnessing the wealth of information available in online forums.
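To make the idea of an iterative expansion in the word-word space concrete, here is a minimal sketch. It is not the IKEA algorithm itself: the abstract does not specify the similarity measure, stopping rule, or ranking, so the cosine similarity, the `threshold` and `max_iters` parameters, the function name `expand_keywords`, and the toy two-dimensional embeddings in `TOY` are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_keywords(seeds, embeddings, threshold=0.9, max_iters=3):
    """Hypothetical iterative expansion (an assumption, not the paper's method).

    In each round, every word whose best cosine similarity to any current
    keyword is at least `threshold` joins the keyword set. The loop stops
    when a round adds nothing or after `max_iters` rounds. The result is
    ranked by the similarity score at which each word was admitted.
    """
    keywords = {w for w in seeds if w in embeddings}
    if not keywords:
        return []
    scores = {w: 1.0 for w in keywords}  # seeds get the top score
    for _ in range(max_iters):
        added = {}
        for cand, vec in embeddings.items():
            if cand in keywords:
                continue
            best = max(cosine(vec, embeddings[k]) for k in keywords)
            if best >= threshold:
                added[cand] = best
        if not added:
            break
        keywords.update(added)
        scores.update(added)
    return sorted(scores, key=lambda w: (-scores[w], w))


# Toy embeddings, invented purely for this example.
TOY = {
    "malware": (1.0, 0.0),
    "trojan": (0.9, 0.3),
    "botnet": (0.7, 0.6),
    "recipe": (0.0, 1.0),
}

print(expand_keywords(["malware"], TOY))
```

In this toy setup, "botnet" is not close enough to the seed "malware" to be admitted directly, but it is reached in a second round through "trojan", which is the kind of effect an iterative expansion is meant to capture. A real system would use embeddings trained on the forum data rather than hand-written vectors.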
