Differentially Private Boxplots

Published: 01 May 2025, Last Modified: 18 Jun 2025, Venue: ICML 2025 poster, License: CC BY 4.0
TL;DR: Differentially Private Boxplots
Abstract: Despite the potential of differentially private data visualization to harmonize data analysis and privacy, research in this area remains underdeveloped. Boxplots are a widely used visualization for summarizing a dataset and for comparing multiple datasets. Consequently, we introduce a differentially private boxplot. We evaluate its effectiveness for displaying the location, scale, skewness and tails of a given empirical distribution. In our theoretical exposition, we show that the location and scale of the boxplot are estimated with optimal sample complexity, and that the skewness and tails are estimated consistently, which is not always the case for a boxplot naively constructed from a single existing differentially private quantile algorithm. As a byproduct of this exposition, we introduce several new results concerning private quantile estimation. In simulations, we show that this boxplot performs similarly to a non-private boxplot and outperforms the naive boxplot. Additionally, we conduct a real data analysis of Airbnb listings, which shows that a comparable analysis can be achieved through differentially private boxplot visualization.
Lay Summary: Boxplots are a cornerstone of data exploration, but standard versions show statistics extracted directly from raw data, which poses privacy risks in sensitive domains. We present a “differentially private boxplot” that injects minimal, controlled noise to protect individual records while still displaying the core characteristics of a dataset: its median, variability, asymmetry, and tails. Built on private quantile-estimation techniques, our method estimates the median and interquartile range with optimal sample complexity and consistently reflects skewness and tail behavior. In both simulations and a real-world Airbnb price study, our private boxplots perform similarly to traditional plots and outperform naive approaches, making it easy to share insightful visual summaries without compromising confidentiality.
Link To Code: https://github.com/jairoadiazr/DPBoxplot
Primary Area: Social Aspects->Privacy
Keywords: differential privacy, data visualization, boxplots, exploratory data analysis
Submission Number: 6957
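Illustrative sketch (not from the paper or its repository): the abstract compares the proposed method against a boxplot "naively constructed from a single existing differentially private quantile algorithm". The Python below shows what such a naive baseline might look like, assuming the exponential mechanism for quantiles as the private quantile algorithm, publicly known data bounds `lower` and `upper`, and an even split of the privacy budget across the three quartiles. The function names `dp_quantile` and `naive_dp_boxplot` are hypothetical, and this is not the authors' proposed boxplot.

```python
import numpy as np


def dp_quantile(x, q, eps, lower, upper, rng):
    """eps-DP estimate of the q-th quantile via the exponential mechanism.

    Assumes `lower` and `upper` are public bounds; data are clipped to them.
    """
    x = np.sort(np.clip(np.asarray(x, dtype=float), lower, upper))
    n = x.size
    # n + 1 candidate intervals between consecutive (clipped) order statistics.
    edges = np.concatenate(([lower], x, [upper]))
    widths = np.diff(edges)
    # Utility of interval i: minus the distance between its rank and the
    # target rank q * n. Its sensitivity is 1, hence the eps / 2 factor.
    utility = -np.abs(np.arange(n + 1) - q * n)
    with np.errstate(divide="ignore"):  # zero-width intervals get weight 0
        log_w = np.log(widths) + 0.5 * eps * utility
    probs = np.exp(log_w - log_w.max())
    probs /= probs.sum()
    i = rng.choice(n + 1, p=probs)
    # Release a uniform draw from the selected interval.
    return rng.uniform(edges[i], edges[i + 1])


def naive_dp_boxplot(x, eps, lower, upper, seed=None):
    """Naive private boxplot statistics: three quartiles at eps / 3 each."""
    rng = np.random.default_rng(seed)
    # Sorting the private outputs is post-processing and costs no privacy.
    q1, med, q3 = sorted(
        dp_quantile(x, q, eps / 3, lower, upper, rng) for q in (0.25, 0.5, 0.75)
    )
    iqr = q3 - q1
    # Fences are computed only from private quartiles and public bounds.
    return {
        "q1": q1,
        "med": med,
        "q3": q3,
        "lower_fence": max(lower, q1 - 1.5 * iqr),
        "upper_fence": min(upper, q3 + 1.5 * iqr),
    }
```

A call such as `naive_dp_boxplot(prices, eps=1.0, lower=0.0, upper=1000.0)` (with `prices` a hypothetical array) returns the quartiles and fences needed to draw a box. The fences here stand in for whiskers; a boxplot that also shows tail behavior and outliers privately needs more care, which is the gap the abstract says the proposed method addresses.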