Keywords: privacy, interpretability, logistic regression, tensor trains, clinical prediction
TL;DR: Logistic regression models in clinical prediction are vulnerable to privacy attacks, as shown on the public LORIS model. We propose tensor train models, which preserve accuracy and interpretability while improving privacy protection.
Abstract: Machine learning in clinical settings must balance predictive accuracy, interpretability, and privacy. While models like logistic regression (LR) are valued for transparency, they remain vulnerable to privacy attacks that expose training data. We empirically assess these risks by designing attacks that identify which public datasets were used to train a model under varying levels of adversarial access, applying them to LORIS, a publicly available LR model for immunotherapy response prediction. Our findings show that LORIS leaks significant training-set information, especially under white-box access, and that common practices such as cross-validation exacerbate these risks. Even black-box access via the public web interface allows training data identification. To mitigate these vulnerabilities, we propose a quantum-inspired defense using tensor train (TT) models. Tensorizing LR obfuscates parameters while preserving accuracy, reducing white-box attacks to random guessing and degrading black-box attacks comparably to Differential Privacy. TT models retain LR interpretability and extend it through efficient computation of marginal and conditional distributions. Although demonstrated on LORIS, our approach generalizes broadly, positioning TT models as a practical foundation for private, interpretable, and effective clinical prediction.
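As a rough illustration of the tensorized defense sketched in the abstract, the snippet below shows how a logistic-regression-style logit can be evaluated from tensor train (TT) cores by contracting them with per-feature inputs. This is a minimal sketch, not the authors' implementation or the LORIS model: the core shapes, TT ranks, and the one-hot feature encoding are illustrative assumptions.

```python
import numpy as np

def tt_logit(cores, x_factors, bias=0.0):
    """Contract TT cores G_k of shape (r_{k-1}, n_k, r_k) with feature
    factors x_factors[k] (length n_k) to obtain a scalar logit."""
    v = np.ones((1,))                      # boundary rank r_0 = 1
    for G, xk in zip(cores, x_factors):
        # contract the feature mode, carry the TT-rank dimension forward
        v = v @ np.einsum('inj,n->ij', G, xk)
    return float(v.squeeze()) + bias       # boundary rank r_d = 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy example: 3 TT cores with mode size 2 and TT-rank 2 (all hypothetical).
rng = np.random.default_rng(0)
cores = [rng.normal(size=(1, 2, 2)),
         rng.normal(size=(2, 2, 2)),
         rng.normal(size=(2, 2, 1))]
# One-hot factors standing in for (binarized) clinical features.
x = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 0.0])]
print(sigmoid(tt_logit(cores, x)))         # predicted probability
```

One intuition for the obfuscation claimed in the abstract: an adversary with white-box access sees the TT cores rather than a flat weight vector, and TT factorizations are non-unique (adjacent cores admit a gauge freedom), so the exposed parameters are less directly tied to the training data.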
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 18877