Streaming Algorithms via Precision Sampling

2011 (modified: 04 Nov 2022), FOCS 2011, Readers: Everyone
Abstract: A technique introduced by Indyk and Woodruff (STOC 2005) has inspired several recent advances in data-stream algorithms. We show that a number of these results follow easily from the application of a single probabilistic method called Precision Sampling. Using this method, we obtain simple data-stream algorithms that maintain a randomized sketch of an input vector $x = (x_1, x_2, \ldots, x_n)$, which is useful for the following applications: 1) Estimating the $F_k$-moment of $x$, for $k > 2$. 2) Estimating the $\ell_p$-norm of $x$, for $p \in [1, 2]$, with small update time. 3) Estimating cascaded norms $\ell_p(\ell_q)$ for all $p, q > 0$. 4) $\ell_1$ sampling, where the goal is to produce an element $i$ with probability (approximately) $|x_i| / \|x\|_1$. It extends to similarly defined $\ell_p$-sampling, for $p \in [1, 2]$. For all these applications the algorithm is essentially the same: scale the vector $x$ entry-wise by a well-chosen random vector, and run a heavy-hitter estimation algorithm on the resulting vector. Our sketch is a linear function of $x$, thereby allowing general updates to the vector $x$.
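The "scale, then run a heavy-hitter algorithm" recipe and the linearity of the sketch can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the class `ScaledCountSketch`, its parameters, the uniform distribution for the scaling factors $u_i$, and the explicit per-coordinate hash tables (a real streaming implementation would use compact hash functions). It uses a CountSketch-style table as the heavy-hitter structure and is not claimed to match the paper's exact construction.

```python
import random
import statistics


class ScaledCountSketch:
    """Linear sketch of the scaled vector z_i = x_i / u_i (illustrative only)."""

    def __init__(self, n, rows=5, width=64, seed=0):
        rng = random.Random(seed)
        # Random scaling factors u_i in (0, 1]; the distribution is an assumption.
        self.u = [1.0 - rng.random() for _ in range(n)]
        # Explicit per-coordinate bucket and sign choices, stored for clarity.
        self.bucket = [[rng.randrange(width) for _ in range(n)] for _ in range(rows)]
        self.sign = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(rows)]
        self.table = [[0.0] * width for _ in range(rows)]

    def update(self, i, delta):
        """Apply the stream update x_i += delta (delta may be negative)."""
        z_delta = delta / self.u[i]
        for r, row in enumerate(self.table):
            row[self.bucket[r][i]] += self.sign[r][i] * z_delta

    def estimate_scaled(self, i):
        """Median-of-rows estimate of the scaled entry z_i = x_i / u_i."""
        return statistics.median(
            self.sign[r][i] * row[self.bucket[r][i]]
            for r, row in enumerate(self.table)
        )
```

Because every update only adds a multiple of `delta` to table cells, the table is a linear function of $x$: insertions and deletions can arrive in any order, and two sketches built with the same seed could be added cell by cell.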
Precision Sampling itself addresses the problem of estimating a sum $\sum_{i=1}^{n} a_i$ from weak estimates of each real $a_i \in [0, 1]$. More precisely, the estimator first chooses a desired precision $u_i \in (0, 1]$ for each $i \in [n]$, and then it receives an estimate of every $a_i$ within additive $u_i$. Its goal is to provide a good approximation to $\sum a_i$ while keeping a tab on the "approximation cost" $\sum_i (1/u_i)$. Here we refine previous work (Andoni, Krauthgamer, and Onak, FOCS 2010) which shows that as long as $\sum a_i = \Omega(1)$, a good multiplicative approximation can be achieved using total precision of only $O(n \log n)$.
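The interaction described above (choose precisions, receive weak estimates, pay $\sum_i 1/u_i$) can be simulated with a toy example. The function names, the uniform choice of $u_i$, the random oracle, and the threshold-count estimator are all assumptions made for illustration; this is not claimed to be the paper's estimator or to achieve its cost bound. For this toy estimator one can check directly that, with $u_i$ uniform on $(0, 1]$ and threshold 4, an estimate $\hat a_i \ge 4u_i$ implies $a_i \ge 3u_i$, and $a_i \ge 5u_i$ implies $\hat a_i \ge 4u_i$, so the expected output lies between $\tfrac{4}{5}\sum a_i$ and $\tfrac{4}{3}\sum a_i$.

```python
import random


def choose_precisions(n, rng):
    """Pick u_i uniformly from (0, 1]; an illustrative distribution."""
    return [1.0 - rng.random() for _ in range(n)]


def weak_estimates(a, u, rng):
    """Hypothetical oracle: returns some value within additive u_i of a_i."""
    return [ai + rng.uniform(-ui, ui) for ai, ui in zip(a, u)]


def approximation_cost(u):
    """Total precision requested: the sum of 1/u_i."""
    return sum(1.0 / ui for ui in u)


def threshold_count_estimate(a_hat, u, threshold=4.0):
    """Toy estimator: threshold times the number of i with a_hat_i >= threshold * u_i."""
    hits = sum(1 for est, ui in zip(a_hat, u) if est >= threshold * ui)
    return threshold * hits


if __name__ == "__main__":
    rng = random.Random(0)
    a = [rng.random() for _ in range(1000)]
    u = choose_precisions(len(a), rng)
    a_hat = weak_estimates(a, u, rng)
    print("true sum:", sum(a))
    print("estimate:", threshold_count_estimate(a_hat, u))
    print("cost:", approximation_cost(u))
```

A single run of this toy estimator has high variance, and nothing here controls the cost; averaging several independent repetitions would reduce the variance at a proportional increase in cost.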