Optimal Communication Bounds for Classic Functions in the Coordinator Model and Beyond

Hossein Esfandiari; Praneeth Kacham; Vahab Mirrokni; David P. Woodruff; Peilin Zhong

Published: 01 Jan 2024, Last Modified: 30 Sept 2024. Venue: STOC 2024. License: CC BY-SA 4.0

Abstract: In the coordinator model of communication with $s$ servers, given an arbitrary non-negative function $f$, we study the problem of approximating the sum $\sum_{i \in [n]} f(x_i)$ up to a $1 \pm \varepsilon$ factor. Here the vector $x \in \mathbb{R}^n$ is defined to be $x = x(1) + \cdots + x(s)$, where $x(j) \ge 0$ denotes the non-negative vector held by the $j$-th server. A special case of the problem is when $f(x) = x^k$, which corresponds to the well-studied problem of $F_k$ moment estimation in the distributed communication model. We introduce a new parameter $c_f[s]$ which captures the communication complexity of approximating $\sum_{i \in [n]} f(x_i)$. For a broad class of functions $f$, which includes $f(x) = x^k$ for $k \ge 2$ and other robust functions such as the Huber loss function, we give a two-round protocol that uses total communication $c_f[s]/\varepsilon^2$ bits, up to polylogarithmic factors. For this broad class of functions, our result improves upon the communication bounds achieved by Kannan, Vempala, and Woodruff (COLT 2014) and Woodruff and Zhang (STOC 2012), obtaining the optimal communication up to polylogarithmic factors in the minimum number of rounds. We show that our protocol can also be used for approximating higher-order correlations. Our results are part of a broad framework for optimally sampling from a joint distribution in terms of the marginal distributions held on individual servers. Apart from the coordinator model, algorithms for other graph topologies in which each node is a server have been extensively studied. We argue that directly lifting protocols from the coordinator model to other graph topologies will require some nodes in the graph to send a lot of communication. Hence, a natural question is which problems can be solved efficiently in general graph topologies. We address this question by giving communication-efficient protocols in the so-called personalized CONGEST model for solving linear regression and low-rank approximation by designing composable sketches.
Our sketch construction may be of independent interest and can implement any importance sampling procedure that has a monotonicity property.
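
To make the problem statement concrete, the following minimal sketch computes the target quantity exactly in the trivial way: every server ships its whole vector to the coordinator, which sums them and applies f coordinate-wise. This is only a baseline illustrating the definition, not the two-round protocol of the paper. The function names, the example vectors, and the particular Huber parameterization (threshold tau) are assumptions made for illustration.

```python
from typing import Callable, List


def exact_sum(server_vectors: List[List[float]],
              f: Callable[[float], float]) -> float:
    """Naive baseline: coordinator receives every x(j) in full.

    Computes sum_i f(x_i) where x = x(1) + ... + x(s).
    """
    n = len(server_vectors[0])
    # Coordinate-wise sum of the non-negative vectors held by the servers.
    x = [sum(v[i] for v in server_vectors) for i in range(n)]
    return sum(f(xi) for xi in x)


def huber(t: float, tau: float = 1.0) -> float:
    """One common form of the Huber loss (tau is a hypothetical threshold)."""
    if abs(t) <= tau:
        return t * t / (2 * tau)
    return abs(t) - tau / 2


# Two servers, n = 3 coordinates (hypothetical data); here x = (1, 3, 3).
servers = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
f2_moment = exact_sum(servers, lambda t: t ** 2)   # F_2 moment of x
huber_sum = exact_sum(servers, huber)
```

This baseline communicates every coordinate of every server's vector, which is the cost the abstract's c_f[s]/ε² bound is meant to improve on when only a 1 ± ε approximation is needed.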