Noise-Resilient Quantum Neural Networks via Zero-Noise Knowledge Distillation

ICLR 2026 Conference Submission 20927 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Quantum neural networks, quantum noise, zero noise extrapolation, knowledge distillation
Abstract: Quantum neural networks (QNNs) show promise for learning on noisy intermediate-scale quantum (NISQ) devices, but two-qubit gate noise remains a significant barrier to practical deployment. Zero-noise extrapolation (ZNE) mitigates errors by running circuits at scaled noise levels and extrapolating to the zero-noise limit, but it requires many circuit evaluations per input and is sensitive to time-varying noise. We propose zero-noise knowledge distillation (ZNKD), a training-time technique in which a ZNE-augmented teacher QNN supervises a compact student QNN. The student is trained variationally to reproduce the teacher's extrapolated outputs, inheriting noise robustness without any extrapolation at inference time. We also present a formal analysis showing how robustness transfers from the ZNE teacher to the distilled student, with proofs concerning noise scaling, extrapolation error, and student generalization. In dynamic-noise simulations (IBM-style $T_1/T_2$, depolarizing, and readout noise), ZNE-guided distillation lowers student MSE by $0.06$-$0.12$ ($\approx$10-20\%) across Fashion-MNIST, AG News, UCI Wine, and UrbanSound8K, keeps students within $0.02$-$0.04$ of the teacher, and achieves teacher-to-student ratios of $6{:}2$-$8{:}3$. By amortizing ZNE into training, ZNKD provides an efficient route to drift-resilient QNNs on NISQ hardware without per-input circuit folding or extrapolation.
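
As an illustration of the pipeline the abstract describes, the minimal sketch below (not part of the submission) mimics ZNE teacher labeling followed by student distillation, using a toy classical stand-in for the QNN. The functions noisy_expectation, zne_teacher, and student, the scale factors, and the linear student model are all illustrative assumptions, not the authors' implementation.

# Illustrative sketch, not the authors' code: ZNE-guided distillation with a
# toy noisy "QNN" stand-in. All names and models here are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def noisy_expectation(x, lam, drift=0.0):
    """Stand-in for a noisy QNN expectation value at noise scale `lam`.
    The ideal (zero-noise) value is sin(x); noise adds a bias that grows
    with the scale factor, mimicking folded two-qubit gate noise."""
    ideal = np.sin(x)
    bias = 0.15 * lam * (1.0 + drift)          # noise-induced bias
    shot_noise = 0.01 * rng.standard_normal()  # finite-shot fluctuation
    return ideal * (1.0 - bias) + shot_noise

def zne_teacher(x, scale_factors=(1.0, 2.0, 3.0)):
    """ZNE: evaluate at scaled noise levels, fit a low-order polynomial,
    and read off the value at lambda = 0 (the zero-noise estimate)."""
    lams = np.array(scale_factors)
    vals = np.array([noisy_expectation(x, lam) for lam in lams])
    coeffs = np.polyfit(lams, vals, deg=1)     # linear extrapolation
    return np.polyval(coeffs, 0.0)

def student(x, theta):
    """Compact "student": a tiny parameterized model trained to match the
    teacher's extrapolated outputs, so no extrapolation is needed at inference."""
    return theta[0] * np.sin(x) + theta[1]

def distill(num_steps=500, lr=0.05):
    theta = np.array([0.5, 0.0])
    xs = rng.uniform(-np.pi, np.pi, size=64)
    # Teacher labels are computed once, during training (ZNE is amortized here).
    targets = np.array([zne_teacher(x) for x in xs])
    for _ in range(num_steps):
        preds = np.array([student(x, theta) for x in xs])
        err = preds - targets
        grad = np.array([np.mean(2 * err * np.sin(xs)),  # d(MSE)/d theta[0]
                         np.mean(2 * err)])              # d(MSE)/d theta[1]
        theta -= lr * grad
    final_mse = np.mean((np.array([student(x, theta) for x in xs]) - targets) ** 2)
    return theta, final_mse

theta, mse = distill()
print("student params:", theta, "distillation MSE:", mse)

In the full method the teacher and student would be variational circuits rather than classical surrogates; the point the sketch captures is that ZNE targets are produced only during training, so inference needs a single (unfolded) circuit evaluation per input.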
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 20927