QuantileFlow: A Unified and Accelerated Quantile Sketching Framework for Anomaly Detection in Streaming Log Data

Published: 28 Jan 2026, Last Modified: 30 Mar 2026. Venue: International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT). License: CC BY 4.0
Abstract: Quantile sketching enables scalable estimation of tail latencies, such as the 95th and 99th percentiles, without storing full streams, making it a practical foundation for anomaly detection in observability pipelines. We introduce QuantileFlow, a unified framework that standardizes ingestion, query, merge, and serialization across multiple quantile sketch families. Using the LogHub HDFS v1 dataset in a production-style streaming pipeline, we process 575,059 latency events end-to-end and benchmark accuracy, memory footprint, throughput, and runtime under identical workloads. We also microbenchmark insertion by adding 1,000,000 log-normal samples and attribute execution time to key internal routines. Across experiments, DDSketch provides the strongest throughput while preserving tail fidelity through relative-error guarantees. HDR Histogram maintains stable precision across wide dynamic ranges but is more sensitive to configuration and incurs higher overhead. MomentSketch is the most compact in memory and is efficient for smooth unimodal data, but its quantile accuracy degrades on heavy-tailed or multimodal streams. Finally, we show an optimized QuantileFlow DDSketch implementation that improves throughput by 44% over DataDog’s official implementation and achieves 2.67 times the throughput of a prior internal version.
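To make the relative-error guarantee mentioned in the abstract concrete, the following is a minimal sketch of a DDSketch-style structure, assuming positive inputs and logarithmically sized buckets. The class name `LogBucketSketch` and the parameter `alpha` are hypothetical and chosen for illustration only; this is neither the QuantileFlow implementation nor DataDog's library, and it omits the bucket-collapsing and serialization logic a production sketch would need.

```python
import math
from collections import Counter


class LogBucketSketch:
    """Illustrative DDSketch-style quantile sketch for positive values.

    Hypothetical teaching example; not the QuantileFlow or DataDog code.
    """

    def __init__(self, alpha=0.01):
        if not 0 < alpha < 1:
            raise ValueError("alpha must be in (0, 1)")
        self.alpha = alpha
        # Bucket i covers (gamma^(i-1), gamma^i].
        self.gamma = (1 + alpha) / (1 - alpha)
        self.log_gamma = math.log(self.gamma)
        self.buckets = Counter()  # bucket index -> number of samples
        self.count = 0

    def add(self, x):
        """Insert one positive sample."""
        if x <= 0:
            raise ValueError("this sketch handles positive values only")
        index = math.ceil(math.log(x) / self.log_gamma)
        self.buckets[index] += 1
        self.count += 1

    def quantile(self, q):
        """Estimate the sample at rank floor(q * (n - 1))."""
        if self.count == 0:
            raise ValueError("empty sketch")
        if not 0 <= q <= 1:
            raise ValueError("q must be in [0, 1]")
        rank = q * (self.count - 1)
        seen = 0
        for index in sorted(self.buckets):
            seen += self.buckets[index]
            if seen > rank:
                # Representative value whose relative distance to any
                # point in the bucket is at most alpha.
                return 2 * self.gamma ** index / (self.gamma + 1)

    def merge(self, other):
        """Fold another sketch built with the same alpha into this one."""
        if other.alpha != self.alpha:
            raise ValueError("cannot merge sketches with different alpha")
        self.buckets.update(other.buckets)  # Counter.update adds counts
        self.count += other.count
```

Because every bucket spans a fixed ratio `gamma`, the estimate stays within a factor of `1 ± alpha` of the true sample at the queried rank regardless of magnitude, which is why this family of sketches tends to preserve tail percentiles such as p99. Merging reduces to adding bucket counts, so sketches built on separate partitions of a stream can be combined without loss.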