TL;DR: Design and analysis of cardinality sketches that are robust to adaptive inputs
Abstract: Cardinality sketches are compact data structures that efficiently estimate the number of distinct elements across multiple queries while minimizing storage, communication, and computational costs. However, recent research has shown that these sketches can fail under \emph{adaptively chosen queries}, breaking down after $\tilde{O}(k^2)$ queries, where $k$ is the sketch size.
In this work, we overcome this \emph{quadratic barrier} by designing robust estimators with fine-grained guarantees. Specifically, our constructions can handle an \emph{exponential number of adaptive queries}, provided that each element participates in at most $\tilde{O}(k^2)$ queries. This effectively shifts the quadratic barrier from the total number of queries to the number of queries \emph{sharing the same element}, which can be significantly smaller. Beyond cardinality sketches, our approach expands the toolkit for robust algorithm design.
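For readers unfamiliar with the object being studied, here is a minimal sketch of one standard cardinality sketch, the bottom-$k$ (k-minimum-values) sketch, with its textbook non-robust estimator. The abstract does not say which sketch family the paper uses, so the choice of bottom-$k$, the SHA-256-based hash, and the names `hash01`, `kmv_sketch`, and `estimate_cardinality` are all assumptions made for illustration; this is not the robust estimator the paper proposes.

```python
import hashlib

def hash01(item, seed=0):
    """Map an item to a pseudo-random float in [0, 1) (hypothetical helper)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def kmv_sketch(items, k, seed=0):
    """Bottom-k sketch: the k smallest distinct hash values of the input."""
    return sorted({hash01(x, seed) for x in items})[:k]

def estimate_cardinality(sketch, k):
    """Textbook (non-robust) estimate: (k - 1) / (k-th smallest hash)."""
    if len(sketch) < k:
        return float(len(sketch))  # fewer than k distinct items: count is exact
    return (k - 1) / sketch[k - 1]

k = 256
sketch = kmv_sketch(range(10_000), k)
print(estimate_cardinality(sketch, k))  # an estimate of the 10,000 distinct items
```

The sketch stores only $k$ hash values regardless of input size, and the sketch of a union can be computed from the sketches of its parts, which is what makes such structures cheap to store and communicate. The estimate depends on the same fixed hash randomness across all queries, which is the kind of reuse that adaptively chosen queries can exploit.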
Lay Summary: Many applications, from monitoring web traffic to counting unique users, rely on compact data summaries called *cardinality sketches* to estimate how many distinct items are present. These sketches save space and time, but recent work has shown that they can fail when queries are chosen based on previous answers, a situation common in adaptive systems such as feedback loops or real-time monitoring.
Current methods break down after about \( k^2 \) adaptive queries, where \( k \) is the sketch size, creating a fundamental barrier.
We develop new *robust estimators* that overcome this limitation by shifting the focus from the total number of queries to how often individual items are queried. As long as each item appears in at most \( \tilde{O}(k^2) \) queries, our estimators remain accurate, even when the total number of queries is much larger.
This means the sketches can safely support far more adaptive queries in practice, especially when most items are rarely repeated.
Our approach provides theoretical guarantees and performs well in simulations, improving real-world robustness by up to 100×.
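To make the per-item condition concrete, the following minimal sketch contrasts the total number of queries with the largest number of queries that any single item takes part in. It assumes each query can be modeled as a set of items; the function name `max_participation` and the toy data are hypothetical and not taken from the paper.

```python
from collections import Counter

def max_participation(queries):
    """Largest number of queries that any single item appears in."""
    counts = Counter()
    for q in queries:
        counts.update(set(q))  # count each item at most once per query
    return max(counts.values(), default=0)

# Toy data (hypothetical): four queries, each a set of items.
queries = [{"a", "b"}, {"b", "c"}, {"d"}, {"e", "b"}]

total_queries = len(queries)                  # 4
worst_item_load = max_participation(queries)  # 3, because "b" is in three queries
print(total_queries, worst_item_load)
```

Under the guarantee described above, it is the second quantity that must stay within \( \tilde{O}(k^2) \), not the first, so a long query stream remains within the guarantee as long as no single item recurs too often.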
Primary Area: Social Aspects->Robustness
Keywords: Sketching, cardinality sketches, robustness to adaptive inputs, adaptive data analysis, differential privacy
Submission Number: 4151