Pontus: A Memory-Efficient and High-Accuracy Approach for Persistence-Based Item Lookup in High-Velocity Data Streams
Track: Web mining and content analysis
Keywords: Data stream processing, persistent item lookup, probabilistic data structure
Abstract: In today's web-scale, data-driven environments, real-time detection of persistent items that consistently recur over time is essential for maintaining system integrity, reliability, and security. Persistent items often signal critical anomalies, such as stealthy DDoS and botnet attacks in web infrastructures. Although various methods exist for identifying such items as well as for determining their frequency, they require recording every item for processing, which is impractical at very high data rates achieved by modern data streams. In this paper, we introduce Pontus, a novel approach that uses an approximate data structure (sketch) specifically designed for the efficient and accurate detection of persistent items. Our method not only achieves fast and precise lookup but is also flexible, allowing for minor modifications to accommodate other types of persistence-based item detection tasks, such as detecting persistent items with low frequency. We rigorously validate our approach through formal methods, offering detailed proofs of time/space complexity and error bounds to demonstrate its theoretical soundness. Our extensive trace-driven evaluations across various persistence-based tasks further demonstrate Pontus's effectiveness in significantly improving detection accuracy and enhancing processing speed compared to existing approaches. We implement Pontus in an experimental platform with industry-grade Intel Tofino switches and demonstrate the practical feasibility of our approach in a real-world memory-constrained environment.
Submission Number: 1916
Loading