\section{Related Work}\label{sec:related_work}

\paragraph{Conformal Prediction}

In CP~\citep{vovk1999machine, vovk2005algorithmic, papadopoulos2008inductive}, the main objective is to construct a prediction set that contains the correct label with a user-specified probability, a guarantee known as \textit{marginal coverage}.
This is achievable under the assumption that the training and test examples are exchangeable, as is the case for independent and identically distributed (i.i.d.) data.
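Stated formally, as an illustration with symbols assumed here rather than taken from the cited works: given $n$ previously observed labeled examples, a test pair $(X_{n+1}, Y_{n+1})$, and a target miscoverage level $\alpha \in (0, 1)$, the prediction set $C(X_{n+1})$ satisfies
\[
  \Pr\bigl(Y_{n+1} \in C(X_{n+1})\bigr) \geq 1 - \alpha,
\]
where the probability is taken jointly over the observed examples and the test pair, and the guarantee relies on the exchangeability assumption.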
Nevertheless, this assumption does not hold in general under \textit{distribution shift}, where the distribution of the test data differs from that of the training data.
For such scenarios, previous research~\citep{tibshirani2019conformal, cauchois2020robust, gibbs2021adaptive, park2022pac} maintains marginal coverage by adjusting for the difference in likelihood between the training and test distributions.
Despite these advances, reliable guarantees on the robustness of the predictions are still lacking.

\paragraph{Randomized Smoothing}

RS enhances the robustness of machine learning models by adding random Gaussian noise to the input and aggregating the resulting predictions~\citep{cohen2019certified, salman2019provably}.
The aggregated prediction is then certifiably robust against all perturbations within an $\ell_2$ norm-ball of radius $R$.
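To make this concrete, consider the standard formulation of \cite{cohen2019certified}; the symbols $f$, $g$, $\sigma$, $p_A$, and $p_B$ are introduced here for illustration only.
Given a base classifier $f$ and a noise level $\sigma > 0$, the smoothed classifier is
\[
  g(x) = \arg\max_{c} \Pr_{\varepsilon \sim \mathcal{N}(0, \sigma^2 I)}\bigl[f(x + \varepsilon) = c\bigr].
\]
If, under this noise, the most likely class has probability at least $p_A$ and every other class has probability at most $p_B$, with $p_A \geq p_B$, then $g$ is constant within an $\ell_2$-ball around $x$ of radius
\[
  R = \frac{\sigma}{2}\bigl(\Phi^{-1}(p_A) - \Phi^{-1}(p_B)\bigr),
\]
where $\Phi^{-1}$ denotes the inverse of the standard normal cumulative distribution function.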
Subsequent studies have expanded the robustness guarantees of RS to include semantic and real-world transformations~\citep{fischer2020certified, li2021tss, hao2022gsmooth}.
Additionally, \cite{yoon2022robust} extended RS to time series data, demonstrating its effectiveness against temporal shifts of the input.

Recently, \cite{gendler2021adversarially} combined RS with CP in a method denoted RSCP, which yields coverage that is robust to adversarial input manipulations.
However, its strategy for handling worst-case Gaussian perturbations can degrade the baseline performance of the CP method, that is, its performance when evaluated on unperturbed inputs only.
For example, the prediction set may be excessively large even for unperturbed, easy-to-classify inputs.
To overcome these challenges, \cite{pmlr-v216-ghosh23a} proposed adjusting the construction of the prediction set using a two-threshold framework.
Although this approach outperforms RSCP in terms of coverage and set size, its guarantees are only probabilistic in practice, and it is limited to transformations that follow a normal distribution.
% General perturbation (not only jitter)