Conformal Frequency Estimation with Sketched DataDownload PDF

Published: 31 Oct 2022, Last Modified: 12 Oct 2022NeurIPS 2022 AcceptReaders: Everyone
Keywords: Sketching, conformal inference, memory constraints, privacy, frequency estimation
TL;DR: A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in a very large data set, based on the information contained in a much smaller sketch of those data.
Abstract: A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in very large data sets, based on a much smaller sketch of those data. The approach is data-adaptive and requires no knowledge of the data distribution or of the details of the sketching algorithm; instead, it constructs provably valid frequentist confidence intervals under the sole assumption of data exchangeability. Although our solution is broadly applicable, this paper focuses on applications involving the count-min sketch algorithm and a non-linear variation thereof. The performance is compared to that of frequentist and Bayesian alternatives through simulations and experiments with data sets of SARS-CoV-2 DNA sequences and classic English literature.
Supplementary Material: pdf
16 Replies

Loading