Distributed Nonparametric Estimation: from Sparse to Dense Samples per Terminal

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Consider the communication-constrained problem of nonparametric function estimation, in which each distributed terminal holds multiple i.i.d. samples. Under certain regularity assumptions, we characterize the minimax optimal rates for all regimes, and identify phase transitions of the optimal rates as the number of samples per terminal varies from sparse to dense. This fully solves the problem left open by previous works, whose scopes are limited to regimes with either dense samples or a single sample per terminal. To achieve the optimal rates, we design a layered estimation protocol that exploits protocols for the parametric density estimation problem. We show the optimality of the protocol using information-theoretic methods and strong data processing inequalities, incorporating the classic balls-and-bins model. The optimal rates follow immediately for various special cases, such as density estimation and the Gaussian, binary, Poisson, and heteroskedastic regression models.
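The abstract invokes the classic balls-and-bins model as an ingredient of the lower-bound analysis. As an illustrative aside only (this is not the paper's proof, and the function below is our own hypothetical helper), a minimal Monte Carlo sketch shows the model's signature phenomenon: throwing n balls into n bins uniformly at random yields a maximum bin load that concentrates around Theta(log n / log log n), well above the average load of 1.

```python
import random
from collections import Counter

def empirical_max_load(n_balls: int, n_bins: int, trials: int = 200) -> float:
    """Estimate the expected maximum bin load when n_balls are
    thrown uniformly at random into n_bins, by Monte Carlo."""
    total = 0
    for _ in range(trials):
        # Assign each ball to a uniformly random bin and count loads.
        loads = Counter(random.randrange(n_bins) for _ in range(n_balls))
        total += max(loads.values())
    return total / trials

# With n balls into n bins, the average load is 1, but the maximum
# load grows like log n / log log n (a classic concentration fact).
for n in (100, 1000, 10000):
    print(f"n = {n:5d}: expected max load ~ {empirical_max_load(n, n):.2f}")
```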
Lay Summary: In many distributed learning systems, samples are collected by different devices and must be transmitted to a central processor for training a model. The communication bandwidth available for this transmission is often limited, which directly affects the accuracy of the model. The goal of our work is to mathematically characterize the trade-off between accuracy and bandwidth. We focus on a unified distributed nonparametric estimation framework, where accuracy is quantified by the optimal convergence rate. Unlike previous works, we allow the number of samples at each device to vary freely, without imposing any restrictions. We determine the optimal rates in this more realistic and complex setting, hence unifying and extending prior fragmented results. An interesting finding is that the optimal estimation rate undergoes sharp phase transitions as the number of samples per device varies from sparse to dense. On the technical side, we develop a novel layered estimation protocol that achieves the optimal rate, and show its optimality by incorporating tools from information theory and classic probabilistic models. The understanding of this trade-off can provide guidance for the development of distributed systems and algorithms.
Primary Area: Theory->Learning Theory
Keywords: Distributed statistical estimation, nonparametric estimation, communication constraints, distributed algorithms, optimal rate of convergence.
Submission Number: 553