Data-Efficient Generalization and Faster Initial Learning in Quantum Models for Classifying Cellular Activation States
Keywords: Quantum Machine Learning, Generalization Error, Data-Efficient Learning, Computational Biology, Quantum Neural Networks, Deep Learning
TL;DR: This paper shows that for classifying cancerous cells from cytometric data, quantum models learn faster and generalize more effectively from limited data than classical neural networks, and their performance scales predictably with training size, as theory suggests.
Abstract: Quantum computing is in its infancy. While it promises to solve some of the intractable problems of computing, real-world applications remain scarce, mainly because current hardware is limited in both circuit width and depth. Finding a real-world application with an advantage over classically available solutions is even harder on current state-of-the-art machines. However, given the vastly different nature of quantum computers, the advantage may come from unexpected corners when they are applied to a wide range of classical problems. Machine learning with quantum algorithms is of particular interest due to its ease of parameterization and potential resource efficiency. In this work, we apply a quantum machine learning (QML) algorithm to real-world data and benchmark some well-established scaling laws in a resource-constrained scenario on both ideal and noisy ion-trap quantum computing platforms. The real-world problem we investigate is the accurate identification of cytotoxic CD8+ T cell activation states from high-dimensional cytometric data. Hand-engineered features extracted from imaging flow cytometry capture morphological, intensity, texture, and shape descriptors that are essential for discriminating between quiescent and stimulated cellular states. Leveraging a dataset of processed blood cell images from three patients, we compare quantum data re-uploading classifiers (QDRCs) with classical feedforward neural networks (FNNs) on the binary classification of cellular activation.
The study is driven by three findings: (1) both quantum and classical models achieve high test accuracy ($\approx 99\%$) when trained with sufficient data and epochs, and models trained on one patient generalize well to the other two, demonstrating the learnability of the engineered feature space; (2) the generalization error of QDRCs exhibits a predictable power-law scaling with training size, consistent with a $\sqrt{T/N}$ bound for $T$ trainable parameters and $N$ training samples, whereas FNNs lack a comparable scaling relationship; and (3) QDRCs achieve high accuracy in early epochs under low-data constraints, aligning with a convex kernel interpretation of the re-uploading model. We further validate a theoretical bound derived from quantum generalization theory and provide an intuitive proof under a convexity assumption. These results indicate that quantum architectures can be competitive with classical baselines while offering faster early generalization and theory-consistent behavior in data-limited regimes, although our conclusions are restricted to hand-crafted features and do not imply clinical readiness or broader generalization.
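The claimed $\sqrt{T/N}$ scaling has a simple empirical signature: on a log-log plot of generalization error against training size $N$, the points should fall on a line of slope $-1/2$. The sketch below illustrates this check on synthetic placeholder values (the error curve, the parameter count `T`, and the prefactor `C` are illustrative assumptions, not measurements from the paper).

```python
import numpy as np

# Hypothetical check of a sqrt(T/N) generalization bound:
# if gen_error ~ C * sqrt(T / N), then log(gen_error) vs log(N)
# is linear with slope -1/2.
T = 60                                    # assumed number of trainable parameters
N = np.array([50, 100, 200, 400, 800])    # training-set sizes
C = 0.9                                   # assumed constant prefactor
gen_error = C * np.sqrt(T / N)            # idealized power-law behavior (synthetic)

# Fit a line in log-log space; the slope estimates the scaling exponent.
slope, intercept = np.polyfit(np.log(N), np.log(gen_error), 1)
print(round(slope, 3))  # -0.5 for an exact sqrt(T/N) scaling
```

Applying the same log-log fit to measured error curves is one way to test whether a model follows the bound; a slope far from $-1/2$ (as reported for the FNNs) indicates the scaling relationship does not hold.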
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 25313