Abstract: Messaging platforms deliver billions of messages daily to a global audience. Many of these are fraudulent, including smishing, the oldest and most frequent phishing attack, which targets short-message services. Previous approaches have focused primarily on continuously adapting supervised on-device models, which requires intensive, ongoing fine-tuning on massive datasets. Little attention has been given to separating messages into campaigns, even though a significant number of messages overlap across campaigns. In this work, we propose Smish-Checker, a zero-training, end-to-end framework designed to accelerate threat detection on bulk messaging platforms. It addresses smishing detection in three steps, without supervised training: (a) grouping messages into campaigns, (b) labeling key campaigns using the in-context learning capabilities of large language models (LLMs), and (c) propagating these labels to categorize the remaining unlabeled samples. Our experiments use a real-world dataset sourced from a leading global messaging platform. The results highlight the effectiveness of the proposed solution and indicate that mapping campaign behavior can greatly improve real-time detection capabilities. In addition, we evaluate the strengths and limitations of LLMs in assisting forensic reporting, a critical facet of smishing detection investigations.
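The three steps (a) to (c) can be illustrated with a minimal sketch. The abstract does not specify the clustering method, the LLM prompt, or the propagation rule, so everything below is an assumption made for illustration: campaigns are formed by greedy Jaccard similarity over masked tokens, "key" campaigns are taken to be the largest ones, and `stub_labeler` is a hypothetical keyword rule standing in for an LLM in-context call.

```python
import re


def normalize(text):
    """Lowercase, mask URLs and digit runs, and return the set of tokens.

    Masking lets variants of one template (different links, different
    account numbers) compare as equal. This is an illustrative choice.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", "<url>", text)
    text = re.sub(r"\d+", "<num>", text)
    return frozenset(re.findall(r"<url>|<num>|[a-z']+", text))


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_into_campaigns(messages, threshold=0.6):
    """Step (a): greedy grouping against the first message of each campaign."""
    campaigns = []  # list of (seed_tokens, member_indices)
    for i, msg in enumerate(messages):
        tokens = normalize(msg)
        for seed, members in campaigns:
            if jaccard(seed, tokens) >= threshold:
                members.append(i)
                break
        else:
            campaigns.append((tokens, [i]))
    return [members for _, members in campaigns]


def stub_labeler(message):
    """Hypothetical placeholder for an LLM call; NOT the paper's method."""
    suspicious = ("verify", "suspended", "click", "prize")
    hit = any(word in message.lower() for word in suspicious)
    return "smishing" if hit else "benign"


def label_and_propagate(messages, campaigns, labeler=stub_labeler, top_k=None):
    """Steps (b) and (c): label one representative per key campaign,
    then copy that label to every member of the campaign.

    Campaigns outside the top_k largest keep the label None.
    """
    labels = [None] * len(messages)
    ranked = sorted(campaigns, key=len, reverse=True)
    for members in ranked[:top_k]:
        label = labeler(messages[members[0]])  # one labeler call per campaign
        for i in members:
            labels[i] = label
    return labels


messages = [
    "Your account 4821 is suspended, verify at http://a.example/x1",
    "Your account 9930 is suspended, verify at http://b.example/q7",
    "Lunch at noon tomorrow?",
    "Your account 17 is suspended, verify at http://c.example/zz",
]
campaigns = group_into_campaigns(messages)
labels = label_and_propagate(messages, campaigns)
```

In this sketch the labeler is called once per campaign rather than once per message, which is the kind of saving that campaign-level labeling is meant to provide when many messages share a template.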
External IDs:dblp:conf/wacv/SchwarzF025