Active Distribution Learning from Indirect SamplesDownload PDFOpen Website

2018 (modified: 17 May 2023)Allerton 2018Readers: Everyone
Abstract: This paper studies the problem of learning the probability distribution Px of a discrete random variable X using indirect and sequential samples. At each time step, we choose one of the possible K functions, g <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sub> , ... , g <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">K</sub> and observe the corresponding sample g <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">i</sub> (X). The goal is to estimate the probability distribution of X by using a minimum number of such sequential samples. This problem has several real-world applications including inference under non-precise information and privacy-preserving statistical estimation. We establish necessary and sufficient conditions on the functions g <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sub> , ... , g <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">K</sub> under which asymptotically consistent estimation is possible. We also derive lower bounds on the estimation error as a function of total samples and show that it is order-wise achievable. Leveraging these results, we propose an iterative algorithm that i) chooses the function to observe at each step based on past observations; and ii) combines the obtained samples to estimate p <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">X</sub> . The performance of this algorithm is investigated numerically under various scenarios, and shown to outperform baseline approaches.
0 Replies

Loading