Querying Kernel Methods Suffices for Reconstructing their Training Data

TMLR Paper 7021 Authors

14 Jan 2026 (modified: 19 Jan 2026) · Under review for TMLR · CC BY 4.0
Abstract: Over-parameterized models have raised concerns about their potential to memorize training data, even when achieving strong generalization. The privacy implications of such memorization are generally unclear, particularly in scenarios where only model outputs are accessible. We study this question in the context of kernel methods and demonstrate, both empirically and theoretically, that querying kernel models at various points suffices to reconstruct their training data, even without access to model parameters. Our results hold for a range of kernel methods, including kernel regression, support vector machines, and kernel density estimation. We hope this work sheds light on potential privacy concerns associated with such models.
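As a toy illustration of the query-access setting for kernel density estimation (the simplest of the model classes named in the abstract), the sketch below fits a KDE to a small "private" dataset and then, using only black-box queries to the fitted model, recovers candidate training points as local maxima of the queried density. The dataset, bandwidth, and peak-finding heuristic are illustrative assumptions, not the paper's actual reconstruction algorithm.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical "private" 1-D training data (assumption for illustration).
train = np.array([-2.0, 0.5, 3.0])

# The model owner fits a KDE; the attacker never sees `train`,
# only the ability to evaluate the fitted density at chosen points.
kde = gaussian_kde(train, bw_method=0.1)

# Attacker side: query the model on a dense grid and mark local maxima
# of the returned densities. With a small bandwidth, these maxima sit
# at (or very near) the original training points.
grid = np.linspace(-5.0, 5.0, 5001)
density = kde(grid)
is_peak = (density[1:-1] > density[:-2]) & (density[1:-1] > density[2:])
candidates = grid[1:-1][is_peak]

print("candidate training points:", candidates)  # close to -2.0, 0.5, 3.0
```

In this simple setting the peak-picking heuristic succeeds because each training point contributes an isolated Gaussian bump; the paper's contribution is showing that query access suffices for reconstruction in far less forgiving settings, including kernel regression and support vector machines.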
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Joonas_Jälkö1
Submission Number: 7021