Optimal Communication and Key Rate Region for Hierarchical Secure Aggregation With User Collusion

Xiang Zhang, Kai Wan, Hua Sun, Shiqiang Wang, Mingyue Ji, Giuseppe Caire

Published: 2026, Last Modified: 24 Mar 2026. Venue: IEEE Trans. Inf. Theory 2026. License: CC BY-SA 4.0
Abstract: Secure aggregation is concerned with the task of securely computing the sum of the inputs from multiple users by an aggregation server without letting the server know the inputs beyond their summation. It finds broad applications in distributed machine learning paradigms such as federated learning (FL) where numerous clients, each holding a proprietary dataset, periodically upload their locally trained models (abstracted as inputs) to a parameter server. The server then generates an aggregate model, typically through averaging, which is shared back with clients as the starting point for a new round of local training. To protect data security, secure aggregation protocols leverage cryptographic techniques to ensure the server gains no additional information beyond the input sum, even if it colludes with a subset of users. While the simple star client-server architecture provides insights into the fundamental utility-security trade-off in secure aggregation, it falls short of capturing the impact of network topology in practical systems. Motivated by hierarchical federated learning, we investigate the secure aggregation problem in a three-layer hierarchical network, where clustered users communicate with an aggregation server via an intermediate layer of relays. In addition to conventional server security which ensures the server learns only the input sum, we also impose relay security, requiring that the relays remain oblivious to users’ inputs. For such a hierarchical secure aggregation (HSA) problem, we characterize the optimal multifaceted trade-off between communication efficiency (measured by user-to-relay and relay-to-server communication rates) and key generation efficiency (including individual and source key rates). A core contribution of this work is the derivation of the optimal source key rate as a function of the number of relays, cluster size, and collusion level. 
We propose an optimal communication scheme alongside a key generation scheme utilizing a novel matrix structure called extended Vandermonde matrix that guarantees both input sum recovery and security. Moreover, we derive a tight information-theoretic converse proof to establish the optimal rate region for the HSA problem.
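
To make the three-layer data flow concrete, here is a minimal sketch of a toy zero-sum masking scheme over a prime field. It is not the scheme proposed in the paper: it does not use the extended Vandermonde construction, offers no protection against user collusion, and makes no claim about key rate optimality. The modulus `P`, the function names, and the assumption that a trusted party distributes the keys are all illustrative choices that do not come from the abstract.

```python
import secrets

# Illustrative prime modulus (2^31 - 1); the abstract does not specify a field.
P = 2_147_483_647


def zero_sum_keys(num_relays, cluster_size, p=P):
    """Draw one key per user, uniform over F_p, constrained to sum to 0 mod p.

    Assumes a trusted key dealer, which is a simplification for this sketch.
    """
    n = num_relays * cluster_size
    flat = [secrets.randbelow(p) for _ in range(n - 1)]
    flat.append((-sum(flat)) % p)  # last key cancels all the others
    return [flat[u * cluster_size:(u + 1) * cluster_size]
            for u in range(num_relays)]


def user_message(w, z, p=P):
    """User-to-relay message: the input masked by the user's key."""
    return (w + z) % p


def relay_message(cluster_msgs, p=P):
    """Relay-to-server message: sum of the masked inputs in one cluster."""
    return sum(cluster_msgs) % p


def server_aggregate(relay_msgs, p=P):
    """Server output: the keys cancel, leaving the sum of all inputs mod p."""
    return sum(relay_msgs) % p


def run_toy_hsa(inputs, p=P):
    """inputs[u][v] is the input of user v in the cluster attached to relay u."""
    keys = zero_sum_keys(len(inputs), len(inputs[0]), p)
    masked = [[user_message(w, z, p) for w, z in zip(ws, zs)]
              for ws, zs in zip(inputs, keys)]
    per_relay = [relay_message(row, p) for row in masked]
    return server_aggregate(per_relay, p)
```

For example, with three relays and two users per cluster, `run_toy_hsa([[1, 2], [3, 4], [5, 6]])` returns the input sum modulo `P`. Each relay sees only masked inputs, and the server sees only per-cluster sums that are themselves masked by the cluster key sums.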