Estimation of Entropy in Constant Space with Improved Sample Complexity

Maryam Aliakbarpour; Andrew McGregor; Jelani Nelson; Erik Waingarten

Published: 31 Oct 2022, Last Modified: 11 Jan 2023, NeurIPS 2022 Accept, Readers: Everyone

Keywords: Sample complexity, Data streams, Shannon Entropy

TL;DR: Estimation of Entropy in Constant Space with Improved Sample Complexity

Abstract: Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal D$ over an alphabet of size $k$ up to $\pm\epsilon$ additive error by streaming over $(k/\epsilon^3) \cdot \text{polylog}(1/\epsilon)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant-memory scheme that reduces the sample complexity to $(k/\epsilon^2)\cdot \text{polylog}(1/\epsilon)$. We conjecture that this is optimal up to $\text{polylog}(1/\epsilon)$ factors.
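
To make the problem setting concrete, the following is a minimal Python sketch of a naive constant-memory estimator. It is not the scheme proposed in this paper, and it makes no claim to match the sample complexity bounds stated in the abstract. It relies only on the identity $H(\mathcal D) = \mathbb{E}_{x \sim \mathcal D}[\log(1/p_x)]$: draw one symbol, estimate its probability from the next few samples, and average the resulting log terms. The function name `naive_streaming_entropy`, the parameters `rounds` and `window`, and the add-one smoothing are all assumptions introduced for illustration.

```python
import math
import random
from typing import Callable, Hashable


def naive_streaming_entropy(
    sample: Callable[[], Hashable], rounds: int, window: int
) -> float:
    """Naive entropy estimate in nats, keeping only a few words of state.

    `sample` is a hypothetical callable that returns one i.i.d. draw from
    the unknown distribution each time it is called. The estimator
    consumes rounds * (window + 1) samples in a single pass.
    """
    total = 0.0  # running sum of log(1 / p_hat)
    for _ in range(rounds):
        x = sample()  # reference symbol for this round
        hits = 0  # occurrences of x among the next `window` samples
        for _ in range(window):
            if sample() == x:
                hits += 1
        # Add-one smoothing (an illustrative choice) keeps the logarithm
        # finite when x never reappears in the window.
        p_hat = (hits + 1) / (window + 1)
        total += math.log(1.0 / p_hat)
    return total / rounds
```

For example, with `rng = random.Random(0)`, calling `naive_streaming_entropy(lambda: rng.randrange(4), rounds=2000, window=200)` estimates the entropy of a uniform distribution over four symbols, whose true value is $\ln 4$ nats. At any moment the state consists of the reference symbol, one counter, one running sum, and the loop indices, which is the sense of "$O(1)$ words of memory" being illustrated here.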

Supplementary Material: pdf
