\section{Related Work}
\label{sec:related_work}

\paragraph{Adaptive Conformal Sets.} Distribution-free, input-conditional coverage guarantees with finite samples are impossible unless the prediction intervals have infinite width \cite{vovk2012conditional,lei2014distribution}. However, an extensive line of work has focused on providing adaptive conformal sets that can capture heteroskedastic uncertainty in the model's prediction \cite{romano2019conformalized,kivaranovic2020adaptive}. Some works up-weight the non-conformity scores of calibration samples based on a notion of distance to the test instance \cite{mao2022valid,guan2023localized,ghosh2023improving} or make assumptions on the data distribution \cite{lei2014distribution,barber2023conformal}. These approaches do not integrate information about the non-conformity score in the weighting process. In contrast, approaches such as \cite{han2022split,jung2022batch,amoukou2023adaptive} model some statistic of the (conditional) non-conformity score distribution to re-weight, correct, or learn the quantile threshold.

\vspace{-.13in}
\paragraph{Local Conditional Coverage.} Some works have proposed split conformal prediction methods for a predefined set of groups. For non-overlapping groups, Mondrian conformal prediction provides finite-sample guarantees \cite{vovk2003mondrian,vovk2012conditional}. The assumption here is that the observations in each group of the partition are exchangeable. For overlapping groups, \cite{foygel2021limits} provides a conservative approach with finite-sample guarantees: it returns the largest prediction set among the groups that contain the test point. The work by \cite{jung2022batch} learns the non-conformity score threshold conditioned on each group via quantile regression. Their approach has asymptotic guarantees, while \cite{gibbs2023conformal} propose an alternative with finite-sample guarantees.
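As a brief illustration of the non-overlapping case, the Mondrian construction can be written in the following standard form. The symbols $g(\cdot)$ (group assignment), $\mathcal{I}_k$ (indices of the calibration samples in group $k$), $n_k = |\mathcal{I}_k|$, $s_i$ (calibration non-conformity scores), and $\hat{q}_k$ are introduced here only for this sketch and may differ from the notation used elsewhere:
\[
\hat{q}_k = \mbox{the } \lceil (1-\alpha)(n_k+1) \rceil\mbox{-th smallest value in } \{s_i : i \in \mathcal{I}_k\},
\qquad
C(x) = \{\, y : s(x,y) \le \hat{q}_{g(x)} \,\},
\]
with the convention $\hat{q}_k = \infty$ when $\lceil (1-\alpha)(n_k+1) \rceil > n_k$. Under exchangeability within each group, this yields $\mathbb{P}\left(Y \in C(X) \mid g(X) = k\right) \ge 1-\alpha$ for every group $k$.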

\vspace{-.13in}
\paragraph{Group Identification for Local Conformal Prediction.} \cite{lei2014distribution} proposes a ``sandwich slicer'' approach that bins the input features before applying a group/local conditional conformal approach, while \cite{sesia2021conformal} proposes histogram binning of the ML model's output values. These approaches are simple but greedy, and do not leverage information about the distribution of the non-conformity scores. Existing kernel-based localizers for conformal prediction \cite{guan2023localized,han2022split} do not partition the input space and do not integrate information about the non-conformity scores. The work by \cite{amoukou2023adaptive} is the closest to our approach. They propose an adaptive conformal prediction approach that learns the non-conformity score weights with a random forest that approximates a statistic of the non-conformity scores. To achieve this objective, they use a quantile random forest that approximates the distribution of the non-conformity scores on the calibration dataset.
Moreover, they provide an approach to approximate the forest's weights with a partition function using a graph clustering method based on Louvain-Leiden \cite{traag2019louvain} with Markov Stability \cite{delvenne2010stability}. Therefore, we compare against two of their variants. It is important to note that the quantile random forest (QRF) algorithm by \cite{meinshausen2006quantile} does not minimize (an approximation of) the $(1-\alpha)$-quantile objective; instead, its splits minimize the intra-leaf variance of the non-conformity scores. The leaves of this QRF algorithm store the entire list of non-conformity scores of the training samples falling in the leaf, rather than a single summary statistic. In our formulation, the learned partition function approximates the $1-\alpha$ quantile of the non-conformity score, since we minimize the pinball loss.
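For reference, the pinball loss at level $1-\alpha$ has the standard form below, where $s$ denotes a non-conformity score and $q$ a candidate threshold (both symbols are introduced here only for illustration and may differ from the notation used elsewhere):
\[
\ell_{1-\alpha}(q, s) = (1-\alpha)\,\max\{s - q,\, 0\} + \alpha\,\max\{q - s,\, 0\}.
\]
Its expectation over $s$ is minimized when $q$ is a $(1-\alpha)$ quantile of the score distribution, whereas a squared-error (variance) criterion targets the mean of the scores instead.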





% Conformal Prediction using Conditional Histograms -> binning in Y
% Mondrian -> ok but no idea how to split the groups
% Lei & Wasserman slice sandwich greedy manner...
% Gibbs 2023 finite samples guarantees group conditional with overlapping groups, JJung asymptotic guarantees both leveraging pinball loss on the nonconformity score 
%Foygel too conservative worst case interval
% amoukou the most similar one