\onecolumn

\begin{center}
    {\Large\bfseries CP$^2$: Leveraging Geometry for Conformal Prediction via Canonicalization}\\[0.5em]
    {\Large\bfseries --- Supplementary Material ---}
\end{center}

\label{appendix}

\tableofcontents

\newpage


\section{Additional Experiment Details}
\label{app:details}

\paragraph{Considerations on the canonicalization network.} Our architectural choices for the canonicalization network (CN) are driven by the goal of a light-weight, \emph{post-hoc} method. We therefore favour an efficient, and thus smaller and potentially less expressive, CN. A small performance drop relative to the symmetry-unaware predictor is expected when incorporating the CN, since as a trained model it is itself prone to (some) pose prediction errors. Additionally, both the limited expressivity of the light-weight CN and the finite resolution on $SO(2)$ induce discretization artifacts. We quantify the CN's miscanonicalization in \autoref{tab:cifar10-miscanonicalization}, reporting the fraction of correctly predicted group elements on CIFAR-10 data subject to $C4$ and $C8$ rotations. Note that miscanonicalization does not translate directly into lower down-stream prediction accuracy: the predictor itself can be robust to minor pose variations (\eg~correctly classifying images with rotation angles in $[-45^\circ, 45^\circ]$) and hence may still predict the correct target despite erroneous pose alignment.

\begin{table}[h]
  \caption{
  Fraction of group elements correctly predicted by trained canonicalization networks on CIFAR-10. Each model is evaluated on a hold-out split of the data under the same group transformations, \ie~$C4$ or $C8$.
  }
  \centering
  \begin{tabular}{l|c}
    \toprule
     \textbf{Model} & \textbf{Correct angles (\%)} \\
     \midrule
    Canonicalization with $(G=4)$ & 87.46\\
    Canonicalization with $(G=8)$ & 87.23\\     
    \bottomrule
  \end{tabular}
  \label{tab:cifar10-miscanonicalization}
\end{table}


\subsection{Robustness to Geometric Data Shifts}
\label{app:details-robust}

\paragraph{Image canonicalization network.} For the image domain, given the aforementioned desire for an efficient geometric module, we restrict ourselves to $C4$- and $C8$-equivariant canonicalization networks. We adopt the models described in \cite{mondal2023equivariant}, employing a compact 3-layer, $G$-equivariant WideResNet, where $G$-equivariance is achieved through the use of E2CNN \citep{weiler2019general}. All models are trained for a maximum of 100 epochs with early stopping, and optimized using Adam.

\paragraph{Point cloud canonicalization network.} For the continuous point cloud domain, we similarly follow the approaches outlined in \cite{mondal2023equivariant} by adopting a compact Vector Neuron model \citep{deng2021vnn}. These models are trained for 250 epochs with a cosine learning rate scheduler, and optimized using Adam.


\subsection{Diagnostics for Conditional Coverage}
\label{app:details-mcp}

\paragraph{Intuition.} We explore the hypothesis that different group elements (\eg~particular rotation angles) can correlate with specific data partitions due to their distinct geometric properties \citep{urbano2024selfsupervised, vanderlinden2025learningsymmetriesweightsharingdoubly, allingham2024generativemodelsymmetrytransformations, romero2023learningpartialequivariancesdata}. For example, isotropic shapes such as a ring or the digit ``0'' (\eg~in MNIST) may withstand arbitrary rotations without altering their class identity or losing significant visual features. Hence their geometric pose may naturally be uniformly distributed over the rotation group. Conversely, shapes such as the digit ``6'' transform into a ``9'' when rotated by 180$^\circ$, potentially leading to erroneous predictions. Consequently, one would not expect such group elements to contribute meaningfully to the shape's natural pose distribution. Our experiments in \autoref{subsec:exp-condcover} manually induce such shifts (\eg~on class labels) in order to highlight the canonicalization network's accurate recovery of such geometric behaviour.

\paragraph{Experimental design.} We induce several group shifts conditioned on particular target partitions:
\begin{itemize}
    \item \texttt{dirac}: A Dirac distribution over the group, pinpointing a single group element per partition;
    \item \texttt{normal}: A normal distribution over the group; and
    \item \texttt{var-gauss}: Various Gaussian distributions with standard deviations in $\{0.0001, 0.001, 0.01, 0.1, 1.0, 10.0\}$.
\end{itemize}

To improve the visual recovery of partition-conditional group effects, we additionally exclude data points for which the canonicalization network's predicted group probability falls below a predefined threshold, ensuring that only samples with confident group predictions are taken into account. This helps counteract some of the CN's erroneous predictions (see \autoref{tab:cifar10-miscanonicalization}) and better demonstrates why Mondrian conformal prediction may be useful when clear patterns exist.
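
As a minimal sketch of this filtering step, and assuming that the ``predicted group probability'' refers to the CN's largest predicted probability over group elements (an interpretation, not a statement of the exact implementation), the retained index set can be written as
\[
  \mathcal{I}_{\tau} = \Bigl\{\, i \;:\; \max_{g \in G} \hat{p}_{\mathrm{CN}}(g \mid \vx_i) \geq \tau \,\Bigr\},
\]
where $\hat{p}_{\mathrm{CN}}$ (the CN's predictive distribution over the group $G$), $\tau$ (the threshold), and $\mathcal{I}_{\tau}$ are illustrative symbols introduced here. A larger $\tau$ retains fewer but more confidently canonicalized samples.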


\subsection{Weighting for Double Shift Settings}
\label{app:details-wcp}

\paragraph{Intuition.} The canonicalization network's role is to mitigate the first shift between $\gD_{train}$ and $\gD_{cal}$, but in the double-shift setting of \autoref{subsec:exp-weightcp} we also encounter a subsequent shift between $\gD_{cal}$ and $\gD_{test}$ (see \autoref{tab:wcp-settings}, third row), such as from known discrete group elements in $C8$ to potentially any rotation in $SO(2)$ (see \autoref{fig:weighting}, right). Empirical observations and prior studies have shown that minor rotations (\eg~within $\pm 5^\circ$) enhance the accuracy of down-stream pose prediction tasks, an improvement often attributed to their alignment with the natural object variations captured in datasets \citep{mondal2023equivariant}. This suggests that, for small deviations from the known group elements, a well-trained CN is capable of accurately identifying the nearest group element. This accuracy decreases as the continuous rotation deviates further from the discretized elements, with ambiguity peaking at positions equidistant from two neighboring group elements (\ie~maximal shift). Conversely, when a test sample's transformation is close to one of the discretized rotations, the CN tends to assign higher probabilities to that group element or its immediate neighbors. We harness these insights on the CN's probabilistic output to obtain geometry-informed weights for weighted conformal prediction, enhancing robustness to rotations not explicitly covered by the discrete, known group elements.
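
To make the maximal-shift case concrete, consider a cyclic rotation group of order $n$ (with $n=4$ for $C4$ and $n=8$ for $C8$), whose elements correspond to the angles $\theta_k = 2\pi k / n$ for $k = 0, \ldots, n-1$. Writing $d(\cdot,\cdot)$ for the angular distance on the circle (notation introduced here purely for illustration), the largest possible deviation of a rotation $\theta \in SO(2)$ from its nearest group element is half the spacing between neighboring elements:
\[
  \max_{\theta \in [0, 2\pi)} \; \min_{0 \leq k < n} d(\theta, \theta_k) \;=\; \frac{\pi}{n},
\]
\ie~$45^\circ$ for $C4$ and $22.5^\circ$ for $C8$. These midpoints are the positions at which the nearest group element is not uniquely determined.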

\paragraph{Experimental design.} To navigate the transition between the (exchangeable) discrete setting and the (non-exchangeable) uniform $SO(2)$ group, we model a group distribution on the sphere. Specifically, we define it either as discrete peaks at the $C4$ or $C8$ elements, or as a continuous mixture of von Mises distributions, each centered at a $C4$ or $C8$ element. The von Mises p.d.f. is of the form $f(x \,|\, \mu, \kappa) = \exp(\kappa \cos(x - \mu)) / (2 \pi I_0(\kappa))$, where $\mu$ denotes a location parameter, $\kappa$ controls the concentration of mass around $\mu$, and $I_0$ is the modified Bessel function of the first kind of order zero. Varying $\kappa$ interpolates between discrete $C4$ or $C8$ sampling and more uniform $SO(2)$ sampling. In \autoref{fig:weighting}, we use $\kappa \in \{50, 40, 30, 20, 10\}$ as interpolation factors, and visualize the resulting spherical group distributions on the $x$-axis. Regarding the inverse geometric weighting relationship described in \autoref{subsec:method-usecases}, we visualize different values of the modulating parameter $p$ and their effect on the weighting distribution in \autoref{fig:weights_pow}, and their impact on the $C4$ to $SO(2)$ double-shift setting in \autoref{fig:weights_params}. For the main paper, we opt for a cross-entropy distance metric and set $p=2.0$, since it empirically offers a good trade-off between set size and target coverage.
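
For concreteness, one possible form of such a mixture is sketched below. It assumes equal mixture weights and a single concentration $\kappa$ shared by all components, neither of which is specified above; $n \in \{4, 8\}$ denotes the group order and $\mu_k$ the component locations (illustrative notation):
\[
  p_{\kappa}(x) \;=\; \frac{1}{n} \sum_{k=0}^{n-1} \frac{\exp\bigl(\kappa \cos(x - \mu_k)\bigr)}{2 \pi I_0(\kappa)},
  \qquad \mu_k = \frac{2 \pi k}{n}.
\]
Under these assumptions the two limiting regimes are recovered directly: as $\kappa \to \infty$ the mass concentrates at the group elements $\mu_k$, approaching discrete sampling, whereas for $\kappa \to 0$ we have $I_0(0) = 1$ and hence $p_{\kappa}(x) \to 1/(2\pi)$, the uniform density on $SO(2)$.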

\input{fig/fig_weight_parameters}


\section{Additional Experiment Results}
\label{app:exp}

\input{fig/fig_proxy_partition_cov}

\input{fig/fig_mcp_cov_both}

\paragraph{Robustness to geometric shift with Thr \citep{sadinle2019least}.} We report accuracy and conformal results for the geometric shift experiments from \autoref{subsec:exp-robust} on both images and point clouds using an alternative conformal scoring function, \emph{Thr} \citep{sadinle2019least}. This score is defined as $s(\vx_i, \vy_i) = 1 - \hat{p}(\rvy_i = \vy_i \mid \vx_i)$ for the true class label $\vy_i$. The interpretation of the results is consistent with that in the main paper using \emph{APS} \citep{romano2020classificationvalidadaptivecoverage}.
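
For reference, the standard split conformal construction with this score can be sketched as follows, where $n$ (the calibration set size), $\alpha$ (the target miscoverage rate), and $\hat{q}$ are notation assumed here and may differ from that of the main paper. Let $\hat{q}$ be the $\lceil (n+1)(1-\alpha) \rceil / n$ empirical quantile of the calibration scores $s(\vx_1, \vy_1), \ldots, s(\vx_n, \vy_n)$. The prediction set for a test input $\vx$ is then
\[
  C(\vx) \;=\; \bigl\{\, \vy \;:\; \hat{p}(\rvy = \vy \mid \vx) \,\geq\, 1 - \hat{q} \,\bigr\}.
\]
In other words, Thr includes every class whose predicted probability clears a single global threshold, in contrast to APS, which accumulates sorted class probabilities per sample.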

\input{tab/robust_shift_pointcloud_thr}
\input{tab/robust_shift_cifar10_aps}
\input{tab/robust_shift_cifar10_thr}
\input{tab/robust_shift_cifar100_thr}
