\section{Adaptive Quantum Conformal Prediction}

The preceding section motivates the need for new quantum conformal procedures that are robust to non-exchangeable scores. A substantial body of conformal prediction literature already addresses non-exchangeable feature-target data, which commonly arises in time-series settings. These works provide a natural foundation for adapting conformal methods to the quantum setting.

Existing approaches can be characterised by the restrictions they impose on the type of shift, ranging from known covariate shifts \citep{CPundercovariateshift, gibbs2025conformal} to unknown joint distribution shifts \citep{beyondexchangeability, ACI, xu2021conformal}, and by the finite-sample or asymptotic nature of their guarantees.

This section introduces the Adaptive Quantum Conformal Prediction (AQCP) algorithm, an application of the Adaptive Conformal Inference (ACI) framework \citep{ACI} to the quantum setting. We choose to modify ACI because it makes no assumptions about the form of the underlying distribution shift. Adapting ACI to the quantum setting requires quantum-specific score functions and modified assumptions to obtain a finite-sample theoretical result (see Theorem~\ref{thm: ACI guarantee}).

An alternative approach, discussed in Appendix~\ref{app:BeyondEX}, is to examine the conformal guarantees available to quantum models under the framework of \citet{beyondexchangeability}. This framework is particularly relevant as it provides finite-sample guarantees while accommodating arbitrary distribution shift. However, it is not the primary focus here, since obtaining a tight bound within this framework would require quantifying the total variation distance caused by the distribution shift.

\subsection{AQCP Algorithm}
\label{method: AQCP guarantees}


AQCP operates in an online testing setting where the miscoverage level is dynamically adjusted according to observed empirical coverage. It therefore requires that the response associated with each test point is revealed before the subsequent test point is processed.

Starting with an initial calibration set $\mathcal{D}_{\text{cal}}$ of size $n$ (see Section~\ref{sec:CP_back}), the algorithm processes test points sequentially. At the $i^\text{th}$ test point, AQCP (i) constructs a prediction set $C_i(X_{n+i})\subseteq\mathcal Y$ using the current miscoverage level $\alpha_i$, (ii) observes whether the true label $Y_{n+i}$ falls within this set, (iii) updates the miscoverage level to $\alpha_{i+1}$ based on the outcome, and (iv) appends the newly observed point $(X_{n+i}, Y_{n+i})$ to the calibration set to inform future prediction sets. This feedback mechanism allows the algorithm to adapt to shifts in the score distribution without making assumptions about the nature of the shift.

Given a desired miscoverage rate \(\alpha\in[0,1]\), the update to the miscoverage level is
\begin{align}
\alpha_{1} &= \alpha,\notag\\
\alpha_{i+1} &= \alpha_{i} + \gamma \left(\alpha-\mathrm{err}_{i}\right), \quad i \in \mathbb{N}. \label{method: update}\notag
\end{align}
Here, $\gamma>0$ is a step-size hyperparameter and the error indicator, $\mathrm{err}_i$, is given by
\[
    \text{err}_{i} \vcentcolon=
    \begin{cases}
    1 & \text{if }Y_{n+i} \notin C_i(X_{n+i}), \\
    0 & \text{otherwise.}
    \end{cases}
\]
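As a quick illustration of the update, with values chosen purely for exposition, take $\alpha=0.1$ and $\gamma=0.01$. A miscovered point ($\mathrm{err}_i=1$) and a covered point ($\mathrm{err}_i=0$) then give, respectively,
\[
\alpha_{i+1}=\alpha_i+0.01\,(0.1-1)=\alpha_i-0.009,
\qquad
\alpha_{i+1}=\alpha_i+0.01\,(0.1-0)=\alpha_i+0.001,
\]
so a single miss is offset by nine covers, and $\alpha_i$ has zero average drift exactly when the miscoverage frequency equals $\alpha$.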
Each value $\alpha_{i}$ is used to construct the prediction set $C_i(X_{n+i})$ for $Y_{n+i}$. This set contains all outcomes $y$ whose score does not exceed the $(1-\alpha_{i})$-quantile of the scores from all previous observations. Formally, the prediction set is defined as
\[  
    C_i(X_{n+i}) \coloneqq \left\{y \in \mathcal{Y} : \hat S\left(X_{n+i},y \,; \mathcal{A}_{X_{n+i},T_{n+i}}\right) \leq \text{Quantile}\left(1-\alpha_{i},\frac{1}{n+i}\left(\sum_{j=1}^{n+i-1}\delta_{\hat{S}(X_j,Y_j\,; \mathcal{A}_{X_{j},T_{j}})}+\delta_{+\infty}\right)\right)\right\}.
\]
See Algorithm~\ref{alg:AQCP_batch_predict} for the full statement of this process. The choice of step size matters: if $\gamma$ is too large, the coverage becomes volatile; if it is too small, the method adapts too slowly to changes in the score distribution. In the limiting case $\gamma=0$, the miscoverage level is never updated and the method reduces to QCP with an updating calibration set.
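
To make the quantile concrete, consider a hypothetical step with $n+i=10$, reading the reference distribution as placing mass $1/10$ on each of the nine previously observed (finite) scores, sorted as $s_{(1)}\le\dots\le s_{(9)}$, and mass $1/10$ on $+\infty$. These numbers are illustrative only. Then
\[
\text{Quantile}(1-\alpha_i)=
\begin{cases}
s_{(8)} & \text{if } \alpha_i=0.2,\\
+\infty & \text{if } \alpha_i=0.05,
\end{cases}
\]
since the empirical distribution function first reaches $0.8$ at $s_{(8)}$, but attains only $0.9$ at $s_{(9)}$, which is below $0.95$. In the second case $C_i(X_{n+i})=\mathcal{Y}$, so the point is necessarily covered and the update raises the miscoverage level by $\gamma\alpha$.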

\citet{ACI} show that the adaptive parameter $\alpha_i$ remains within the bounded interval $[-\gamma,1+\gamma]$ with probability one, which ensures the stability of the method. This stability property is then used to establish a bound on the average miscoverage error over $N$ test points,
\[  
 \left| \frac{1}{N}\sum_{i=1}^{N}\text{err}_i  -\alpha \right| \leq \frac{\max\{\alpha_{1},1-\alpha_{1}\}+\gamma}{N\gamma}.
\]
This bounds the difference between the average observed miscoverage and the target miscoverage $\alpha$. In particular, since the bound decays as $1/N$, the average miscoverage rate converges almost surely to the desired target $\alpha$ as the number of observations increases,
\[
    \lim_{N\to\infty}\frac{1}{N}\sum_{j=1}^{N}\text{err}_j\stackrel{\text{a.s.}}{=}\alpha.
\]
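The bound can be recovered by a short telescoping argument, sketched here following the reasoning of \citet{ACI}. Summing the update rule over $i=1,\dots,N$ gives
\[
\alpha_{N+1}-\alpha_{1}=\gamma\sum_{i=1}^{N}\left(\alpha-\text{err}_i\right)
\quad\Longrightarrow\quad
\left|\frac{1}{N}\sum_{i=1}^{N}\text{err}_i-\alpha\right|=\frac{\left|\alpha_{N+1}-\alpha_{1}\right|}{N\gamma},
\]
and $\alpha_{N+1}\in[-\gamma,1+\gamma]$ implies $|\alpha_{N+1}-\alpha_1|\le\max\{\alpha_1,1-\alpha_1\}+\gamma$. As an illustrative calculation, with values assumed purely for exposition, $\alpha_1=0.1$, $\gamma=0.01$ and $N=10^4$ give a bound of $(0.9+0.01)/(10^4\times 0.01)=0.0091$.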
These guarantees do not depend on exchangeability. They therefore apply without any assumptions on the noise induced by the quantum hardware. The trade-off for this greater generality is that the guarantees are asymptotic in the number of test points. \citet[Theorem 4.1]{ACI} give a finite-sample guarantee under a set of specific conditions, namely that the distributional shift is fully determined by a hidden Markov model. This is not directly applicable to our setting, as there is no distributional shift in the feature-target data itself. Instead, the shift arises from the time dependence of the shots which parametrise the score function. However, with a simple adaptation to the assumptions and proof (see Appendix~\ref{app:adaptiv_proof} for the latter), we can state the following theorem.
\newpage
\begin{theorem}[Finite-Sample Guarantee for AQCP, Adapted from \cite{ACI}]\label{thm: ACI guarantee}
Let $(X_i,Y_i)$ be i.i.d.,\ and suppose prediction sets are constructed using the empirical $(1-\alpha_i)$-quantile of conformity scores computed from a fixed calibration dataset $\mathcal{D}_{\mathrm{cal}}$ (Algorithm~\ref{alg:AQCP_batch_predict} without the optional step). 
Assume the test conformity scores are conditionally independent given a hidden Markov chain $(B_{i})$ with state space $\mathcal B$. Suppose that the joint process $(\alpha_k,B_{n+k})$ forms a Markov chain on $[-\gamma,1+\gamma]\times\mathcal B$ with unique stationary distribution $\pi$, and that the process is initialised at stationarity. Let $(B_i)$ have transition operator $P_B$ and stationary distribution $\pi_B$ (the marginal of $\pi$ on $\mathcal{B}$). Assume that $P_B$ has non-zero absolute spectral gap\footnote{See Definition~\ref{def:ASG}.} $1-\eta>0$.
Define
\[
B
\coloneqq
\sup_{b\in\mathcal B}
\big|
\mathbb E[\mathrm{err}_i \mid B_{n+i}=b] - \alpha
\big|,
\qquad
\sigma_B^2
\coloneqq
\mathbb E\left[
\big(
\mathbb E[\mathrm{err}_i \mid B_{n+i}] - \alpha
\big)^2
\right].
\]
Then for any $\varepsilon>0$,
\[
\mathbb P\left(
\left|
\frac{1}{N}\sum_{i=1}^N \mathrm{err}_i - \alpha
\right|
\ge \varepsilon
\right)
\le
2\exp\left(-\frac{N\varepsilon^2}{8}\right)
+
2\exp\left(
-\frac{N(1-\eta)\varepsilon^2}
{8(1+\eta)\sigma_B^2 + 20B\varepsilon}
\right).
\]
\end{theorem}
Although the assumptions required by this theorem are unlikely to hold exactly in our setting, we expect the bound to be broadly representative of the algorithm's empirical behaviour, as argued by \citet{ACI}.
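
For a rough sense of scale, consider the first term of the bound alone, with illustrative values $N=10^4$ and $\varepsilon=0.05$ (assumptions for exposition, not experimental settings):
\[
2\exp\left(-\frac{N\varepsilon^2}{8}\right)=2\exp\left(-\frac{10^4\times 0.0025}{8}\right)=2\exp(-3.125)\approx 0.088.
\]
The second term depends on the hidden chain through $\eta$, $\sigma_B^2$ and $B$. Since $\mathbb E[\mathrm{err}_i\mid B_{n+i}=b]\in[0,1]$, these quantities always satisfy $\sigma_B^2\le B^2$ and $B\le\max\{\alpha,1-\alpha\}$, although such worst-case values will typically give a loose bound.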

\begin{algorithm}
\caption{Adaptive Quantum Conformal Prediction (AQCP)}
\label{alg:AQCP_batch_predict}

\SetKwInOut{Input}{Input}
\SetKwInOut{Output}{Output}
\SetKwProg{Fn}{Function}{:}{end}
\SetKwProg{Proc}{Procedure}{:}{end}

\Input{
    Miscoverage level $\alpha \in [0,1]$ \\
   \,\,Score function $\hat S$ \\
    \,\,Initial calibration dataset $\mathcal{D}_{\text{cal}} = \{(x_i,y_i)\}_{i=1}^{n}$ \\
    \,\,Test stream $\mathcal{D}_{\text{test}} = \{(x_i,y_i)\}_{i=n+1}^{n+n'}$ \\
    \,\,Number of quantum shots $M \geq 1$\\
    \,\,Step size $\gamma > 0$
}

\Output{
    A sequence of prediction sets for the test stream $\{C_{i}(x_{n+i})\}_{i=1}^{n'}$ 
}

\BlankLine
\hrulefill \\
\textbf{Core Functions}
\hrulefill
\BlankLine

\Proc{InitialCalibrate($\mathcal{D}_{\text{cal}}$)}{
    \For{$(x_i, y_i) \in \mathcal{D}_{\text{cal}}$}{
         $\mathcal{A}_{x_i,T_i}\leftarrow$ $M$ shots from PQC with input $x_i$ \;
        Add $\hat S(x_i, y_i; \mathcal{A}_{x_i,T_i})$ to $\mathcal{S}$ \;
    }
}

\Fn{GetQuantile($\mathcal{S}, \alpha$)}{
    \Return $\inf \left\{ q : \frac{1}{\lvert\mathcal{S}\rvert} \sum_{s_i \in \mathcal{S}} \mathds{1}\{s_i \leq q\} \geq 1-\alpha \right\}$ \;
}

\Fn{GeneratePredictionSet($x, \lambda,\mathcal{A}_{x,T}$)}{
    Initialise prediction set $C(x) \leftarrow \emptyset$ \;
    \ForEach{$y \in \mathcal{Y}$}{
        \If{$\hat S(x, y; \mathcal{A}_{x,T}) \leq \lambda$}{
            Add $y$ to $C(x)$ \;
        }
    }
    \Return $C(x)$ \;
}

\Proc{UpdateState($x, y, \lambda,\mathcal{A}_{x,T}$)}{
    $s \leftarrow \hat S(x, y; \mathcal{A}_{x,T})$ \;
    \eIf{$s > \lambda$}{ $\text{err} \leftarrow 1$ \;}{ $\text{err} \leftarrow 0$ \;}
    $\alpha \leftarrow \alpha + \gamma(\alpha_1-\text{err})$ ;\hfill $ \triangleright $ \% \text{$\alpha_1$ stores the target level }\%
    Add $s$ to $\mathcal{S}$ ;\hfill $ \triangleright $ \% \text{Optional Step }\%
}
\BlankLine
\hrulefill \\
\textbf{Main Algorithm Execution}
\hrulefill
\BlankLine
$\text{PredictionSets} \leftarrow \emptyset$ \;
$\mathcal{S} \leftarrow \emptyset$ ;
\hfill $ \triangleright $ \% \text{Score set }\%



$\alpha_1 \leftarrow \alpha$ \;
InitialCalibrate($\mathcal{D}_{\text{cal}}$) \;
\For{$i = 1$ \KwTo $n'$}
{   
    $\lambda \leftarrow \text{GetQuantile}(\mathcal{S}, \alpha)$ \;
    $\mathcal{A}_{x_{n+i},T_{n+i}}\leftarrow$ $M$ shots from PQC with input $x_{n+i}$ \;
    $C_{i}(x_{n+i}) \leftarrow \text{GeneratePredictionSet}(x_{n+i}, \lambda,\mathcal{A}_{x_{n+i},T_{n+i}})$ \;
    Add $C_{i}(x_{n+i})$ to $\text{PredictionSets}$ \;
    UpdateState($x_{n+i}, y_{n+i}, \lambda, \mathcal{A}_{x_{n+i},T_{n+i}}$) \;
}
\Return PredictionSets \;
\end{algorithm}

