Osprey 🪶: A Reference Framework for Online Grooming Detection via Neural Models and Conversation Features

Published: 01 Jan 2024, Last Modified: 26 Jan 2025, Venue: CIKM 2024, License: CC BY-SA 4.0
Abstract: Online grooming is the process by which an adult initiates a sexual relationship with a minor through online conversation platforms. Although neural models have been developed to detect such incidents, their practical value in real-world settings remains uncertain because their evaluation methodologies are closed, irreproducible, and poorly suited to the sparse distribution of grooming conversations in training datasets, for example by undervaluing recall relative to precision. Furthermore, proposed models overlook characteristic features of grooming in online conversations, including the number of participants, message exchange patterns, and temporal signals such as the elapsed time between messages. In this paper, we first contribute Osprey, an open-source library that supports a standard pipeline with documented experimental details, incorporating canonical neural models and a variety of vector representation learning methods for conversations while accommodating new models and training datasets. We further incorporate conversation features into the models to improve recall while maintaining precision. Our experiments across neural baselines and vector representations of conversations demonstrate that recurrent neural models, particularly GRU, applied to the sequence of pretrained transformer-based embeddings of the messages in a conversation, together with conversation features, obtain state-of-the-art performance, achieving the best recall with competitive precision. Osprey is available at https://github.com/fani-lab/Osprey/tree/cikm24.
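
The conversation features named in the abstract (number of participants, message exchange patterns, elapsed time between messages) can be illustrated with a minimal sketch. This is not Osprey's code or API: the `Message` type, the function names, the choice of speaker-turn changes as a stand-in for "message exchange patterns", and the concatenation of features onto each message embedding are all assumptions made for illustration, since the abstract does not say how the features are computed or combined with the embeddings.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Message:
    author: str       # hypothetical participant identifier
    timestamp: float  # seconds since the start of the conversation


def conversation_features(messages):
    """Return [participant count, speaker-turn changes, mean gap in seconds]."""
    ordered = sorted(messages, key=lambda m: m.timestamp)
    pairs = list(zip(ordered, ordered[1:]))  # consecutive message pairs
    participants = len({m.author for m in ordered})
    # Assumed proxy for "message exchange patterns": how often the speaker changes.
    turn_changes = sum(1 for a, b in pairs if a.author != b.author)
    gaps = [b.timestamp - a.timestamp for a, b in pairs]
    mean_gap = mean(gaps) if gaps else 0.0
    return [float(participants), float(turn_changes), mean_gap]


def append_features(message_embeddings, features):
    """Concatenate conversation-level features onto each message embedding.

    Assumed fusion strategy; the resulting sequence could then be fed to a
    recurrent model such as a GRU.
    """
    return [list(vec) + list(features) for vec in message_embeddings]


chat = [Message("a", 0.0), Message("b", 30.0), Message("a", 90.0)]
feats = conversation_features(chat)
```

Here `feats` holds two participants, two speaker changes, and a mean gap of 45 seconds. In practice the message embeddings would come from a pretrained transformer encoder rather than hand-written lists.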