Detecting Complex Sensitive Information via Phrase Structure in Recursive Neural Networks

Published: 01 Jan 2018, Last Modified: 17 Dec 2024, Venue: PAKDD (3) 2018, License: CC BY-SA 4.0
Abstract: State-of-the-art sensitive information detection in unstructured data relies on how frequently keywords co-occur with sensitive seed words. In practice, however, this approach may fail to detect more complex patterns of sensitive information. In this work, we propose using recursive neural networks to learn phrase structures that separate sensitive from non-sensitive documents. Our evaluation on real data with human-labeled sensitive content shows that our new approach outperforms existing keyword-based strategies.
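
As a rough illustration of the general idea, and not the authors' actual model, the sketch below shows how a recursive neural network can compose word vectors bottom-up along a binary phrase-structure tree and score the root vector for sensitivity. Everything here is an assumption made for illustration: the class name `RecursiveNet`, the 4-dimensional vectors, the random untrained weights, the toy vocabulary, and the hand-written tree. A real system would obtain trees from a parser and learn the weights from labeled documents.

```python
import math
import random


class RecursiveNet:
    """Toy recursive neural network over binary phrase-structure trees.

    Hypothetical sketch: weights are random and untrained, so the scores
    carry no meaning; only the composition mechanism is shown.
    """

    def __init__(self, vocab, dim=4, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        # One random embedding per word (sorted for reproducibility).
        self.embed = {
            w: [rng.uniform(-0.5, 0.5) for _ in range(dim)]
            for w in sorted(vocab)
        }
        # Composition matrix W maps two concatenated child vectors to one parent.
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)]
                  for _ in range(dim)]
        self.b = [0.0] * dim
        # Weights of a logistic classifier applied to the root vector.
        self.u = [rng.uniform(-0.5, 0.5) for _ in range(dim)]

    def encode(self, tree):
        """Return the vector for a tree: a word (str) or a (left, right) pair."""
        if isinstance(tree, str):
            return self.embed[tree]  # raises KeyError for unknown words
        left, right = tree
        children = self.encode(left) + self.encode(right)  # concatenation
        return [
            math.tanh(sum(w * x for w, x in zip(row, children)) + bias)
            for row, bias in zip(self.W, self.b)
        ]

    def sensitivity_score(self, tree):
        """Squash the root vector to a score in (0, 1) with a sigmoid."""
        root = self.encode(tree)
        z = sum(u * x for u, x in zip(self.u, root))
        return 1.0 / (1.0 + math.exp(-z))


# Hand-written tree for "patient diagnosed with diabetes" (illustrative only).
vocab = ["patient", "diagnosed", "with", "diabetes"]
tree = (("patient", "diagnosed"), ("with", "diabetes"))

net = RecursiveNet(vocab)
root = net.encode(tree)
score = net.sensitivity_score(tree)
print(f"root vector has {len(root)} dimensions, score = {score:.3f}")
```

Because each parent vector depends on how its children are grouped, the same words arranged under a different tree generally yield a different root vector. That sensitivity to structure is what a bag-of-keywords co-occurrence count cannot capture.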