\section{Method} \label{sec:method}

\subsection{Partially Exchangeable Non-Conformity Scores}
\label{sec:method1}

To deploy Conformal Prediction for federated graph-structured data under a transductive learning setting, we need to ensure the exchangeability condition is met. We adopt the principle of partial exchangeability, as proposed by \citet{de1980condition} and applied to non-graph-based models by \citet{lu2023federated}. Specifically, we demonstrate that non-conformity scores within each client are permutation invariant when using a permutation-invariant GNN model for training under the transductive setting.

Consider a graph $\mathcal{G}^k = (\mathcal{V}^k, \mathcal{E}^k)$ at client $k$, where $\mathcal{V}^k$ denotes the set of nodes, $\mathcal{E}^k$ the set of edges, and each node $v \in \mathcal{V}^k$ has a feature vector $x_v \in \mathbb{R}^d$. The dataset includes distinct node subsets for training, validation, calibration, and testing: $\mathcal{V}^k_{\text{train}}$, $\mathcal{V}^k_{\text{valid}}$, $\mathcal{V}^k_{\text{calib}}$, and $\mathcal{V}^k_{\text{test}}$, respectively.

\begin{assumption}
\label{assumption:permutation-invariance}
Let $S$ be a global non-conformity score function learned in a federated setting, designed to be permutation invariant with respect to the calibration and test nodes within each client. For any permutation $\pi_k$ of client $k$'s calibration and test nodes, the non-conformity scores satisfy:
\begin{multline*}
\{ S(x_v, y_v) : v \in \mathcal{V}^k_{\text{calib}} \cup \mathcal{V}^k_{\text{test}} \} \\
= \{ S(x_{\pi_k(v)}, y_{\pi_k(v)}) : v \in \mathcal{V}^k_{\text{calib}} \cup \mathcal{V}^k_{\text{test}} \}.
\end{multline*}
\end{assumption}

Non-conformity scores obtained through GNN training satisfy the above assumption because the chosen GNN models are inherently permutation invariant with respect to node ordering. Each local GNN model accesses all node features during training but optimizes the objective function based solely on the training and validation nodes, which remain unchanged under permutation of the calibration and test nodes. Under Assumption~\ref{assumption:permutation-invariance}, we establish the following lemma.

\begin{lemma}
\label{lemma:invariance}
Within the transductive learning setting, assuming permutation invariance in graph learning over the unordered graph \(\mathcal{G}^k = (\mathcal{V}^k, \mathcal{E}^k)\), the set of non-conformity scores \(\{s_v\}_{v \in \mathcal{V}^k_{\text{calib}} \cup \mathcal{V}^k_{\text{test}}}\) is invariant under permutations of the calibration and test nodes.
\end{lemma}

The proof of Lemma~\ref{lemma:invariance} is provided in the Appendix. The lemma establishes the intra-client exchangeability of calibration and test samples for transductive node classification. Building on it, we extend the concept of partial exchangeability to federated graph learning.

Assume that the subgraph at client $k$, $\mathcal{G}^k$, is sampled from a distribution $P_k$. During inference, a random test node $v_{\text{test}}$, with features and label $(x_{v_{\text{test}}}, y_{v_{\text{test}}})$, is assumed to be sampled from a global distribution $Q_{\text{test}}$, which is a mixture of the client subgraph distributions according to a probability vector $p$:
\[
Q_{\text{test}} = \sum_{k=1}^K p_k P_k,
\]
which essentially states that $v_{\text{test}}$ belongs to client $k$ with probability $p_k$.

\begin{definition}[Partial Exchangeability]
\label{def:partial-exchangeability}
Partial exchangeability in the context of federated learning assumes that the non-conformity scores between a test node and the calibration nodes within the same client are exchangeable, but this exchangeability does not necessarily extend to nodes from different clients.
\end{definition}

\begin{assumption}
\label{assumption:partial-exchangeability}
Consider a calibration set $\{v_i\}_{i=1}^{n_k}$ in client $k$ and a test node $v_{\text{test}}$ in the same client. Under the framework of partial exchangeability (Definition~\ref{def:partial-exchangeability}), the non-conformity scores \(s_{v_{\text{test}}}\) and \(\{s_{v_i}\}_{i=1}^{n_k}\) are assumed to be exchangeable with probability $p_k$, consistent with Assumption~\ref{assumption:permutation-invariance}. Therefore, $v_{\text{test}}$ is partially exchangeable with all calibration nodes within client $k$.
\end{assumption}

This assumption is justified by the properties of our non-conformity score function \(S\), which, under Assumption~\ref{assumption:permutation-invariance}, is permutation invariant within each client's data. This property supports treating the test node and the calibration nodes within a client as exchangeable in terms of their non-conformity scores. Exchangeability is restricted to nodes within the same client because data distributions may differ across clients, a discrepancy that Assumption~\ref{assumption:permutation-invariance} does not address. This restriction affects the upper bound of the coverage guarantee, as shown in Theorem~\ref{theorem:coverage}. Further details on this assumption are given in Appendix~\ref{appendix:assumption2}.

\begin{theorem}
\label{theorem:coverage}
Suppose the graph is partitioned across \( K \) clients, with each client \( k \in [K] \) holding \( n_k \) calibration nodes. Let $N = \sum_{k=1}^K n_k$ and assume $p_k = (n_k + 1)/(N + K)$. If the non-conformity scores are arranged in non-decreasing order as $\{ S_{(1)}, S_{(2)}, \dots, S_{(N+K)} \}$, then the $(1-\alpha)$-quantile, $\hat{q}_{\alpha}$, is the $\lceil (1-\alpha)(N+K) \rceil$-th smallest value in this set. Consequently, the prediction set
\[
C_{\alpha}(v_{\text{test}}) = \{ y \in \mathcal{Y} \mid S(x_{v_{\text{test}}}, y) \leq \hat{q}_{\alpha} \}
\]
is a valid conformal predictor satisfying
\[
1 - \alpha \leq P\left( y_{v_{\text{test}}} \in C_{\alpha}(v_{\text{test}}) \right) \leq 1 - \alpha + \frac{K}{N+K}.
\]
\end{theorem}

This theorem guarantees marginal coverage of at least $1 - \alpha$, with over-coverage of at most $K/(N+K)$. The proof is provided in Appendix~\ref{appendix:theorem1}.
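
As a purely illustrative instance of Theorem~\ref{theorem:coverage}, with hypothetical values that are not taken from our experiments, suppose $K = 3$ clients hold $n_1 = 50$, $n_2 = 30$, and $n_3 = 20$ calibration nodes, and let $\alpha = 0.1$. Then
\begin{align*}
N &= 50 + 30 + 20 = 100, \qquad N + K = 103, \\
(p_1, p_2, p_3) &= \left( \tfrac{51}{103}, \tfrac{31}{103}, \tfrac{21}{103} \right), \qquad \sum_{k=1}^{3} p_k = 1, \\
\lceil (1-\alpha)(N+K) \rceil &= \lceil 92.7 \rceil = 93, \\
\frac{K}{N+K} &= \frac{3}{103} \approx 0.029 .
\end{align*}
In this example, $\hat{q}_{\alpha}$ is the $93$rd smallest score, and the theorem bounds the marginal coverage between $0.9$ and approximately $0.929$. The gap between the two bounds shrinks as the total number of calibration nodes $N$ grows relative to the number of clients $K$.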


\subsection{Generating Representative Node Features with Variational Autoencoders}\label{sec:method2}

\begin{figure*}[htbp]
    \centering
        \includegraphics[width=0.73\linewidth]{figures/gen_framework.png}
    \caption{\textbf{Missing neighbor generation framework.} 
    \textit{(i) Feature Prototype Learning:} We train \texttt{VAE}s on local subgraph features and apply K-means clustering to obtain prototype node features. The cluster centers serve as feature prototypes, which are sent to the central server for later broadcasting. 
    \textit{(ii) Collaborative Training of \texttt{VGAE}:} We train \texttt{VGAE} models in a federated manner to learn generalizable connectivity patterns across client subgraphs. 
    \textit{(iii) Missing Neighbor Completion:} The central server broadcasts the learned feature prototypes, which are then used to complete missing neighbors via the trained \texttt{VGAE} model.}
 \label{fig-framework}
\end{figure*}


To mitigate the issue of missing neighbor information in federated graph learning, we introduce a novel approach that uses \texttt{VAE}s to generate representative node features within each client. These generated features are shared with the central server and then broadcast to all clients to complete the local subgraphs, thereby addressing the problem of missing links.

Each client \( k \) trains a \texttt{VAE} on its local node features \( \{ x_v \}_{v \in \mathcal{V}^k_{\text{train}}} \subset \mathbb{R}^d \), aiming to capture the underlying distribution \( P_k \) of its data. The \texttt{VAE} consists of an encoder \( q_{\phi_k}(z|x) \) and a decoder \( p_{\theta_k}(x|z) \), where \( z \in \mathbb{R}^{d'} \) is the latent representation, with \( d' < d \). The \texttt{VAE} is trained by maximizing the ELBO given in Section~\ref{sec:vae}.

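After training, each client generates reconstructed node features by passing its original node features through the encoder and decoder:
\[
z_v = q_{\phi_k}(x_v), \quad \tilde{x}_v = p_{\theta_k}(z_v), \quad \forall v \in \mathcal{V}^k_{\text{train}}.
\]
Here, with a slight abuse of notation, \( q_{\phi_k}(\cdot) \) and \( p_{\theta_k}(\cdot) \) denote the outputs of the encoder and decoder networks rather than the distributions themselves.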

Next, K-means clustering \citep{kodinariya2013review} is applied to the reconstructed node features \( \{ \tilde{x}_v \} \) to identify \( M_k \) cluster centers \( \{ c_m^k \}_{m=1}^{M_k} \subset \mathbb{R}^d \):
\[
c_m^k = \frac{1}{|C_m^k|} \sum_{\tilde{x}_v \in C_m^k} \tilde{x}_v,
\]
where \( C_m^k \) is the set of reconstructed node features assigned to cluster \( m \) in client \( k \). The number of clusters \( M_k \) is determined experimentally through hyperparameter tuning.
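
As a small illustration with hypothetical values, take $d = 2$ and suppose cluster $m$ at client $k$ contains three reconstructed features, $C_m^k = \{ (1, 2)^\top, (3, 2)^\top, (2, 5)^\top \}$. Its center is
\[
c_m^k = \frac{1}{3} \left( \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 3 \\ 2 \end{bmatrix} + \begin{bmatrix} 2 \\ 5 \end{bmatrix} \right) = \begin{bmatrix} 2 \\ 3 \end{bmatrix},
\]
which is a single vector in $\mathbb{R}^d$ that summarizes the cluster and need not coincide with the feature vector of any individual node.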

The cluster centers \( \{ c_m^k \} \) are then used as prototype features and shared with the central server. The server aggregates the prototype features from all clients and broadcasts them back to each client. This process allows clients to augment their local subgraphs with representative node features from other clients, effectively approximating the missing neighbor information.

\subsection{Link Prediction with VGAE for Missing Neighbor Completion}
\label{sec:method3}

After the generated node features are collected by the central server and broadcast to the clients, we need to predict possible edge formations between the generated nodes and the client subgraphs. To this end, we employ a Variational Graph Autoencoder (\texttt{VGAE}), which is effective in graph reconstruction tasks and therefore well suited to our graph completion problem. The \texttt{VGAE} model is trained by maximizing the ELBO.

To ensure that our link prediction model generalizes well across all client subgraphs, we train the \texttt{VGAE} in a federated setting using the \texttt{FedAvg} \citep{sun2022decentralized} algorithm. Different client subgraphs may have varying connectivity patterns; thus, the model needs to generalize to diverse subgraphs.

After training, the \texttt{VGAE} model is used for link prediction between generated nodes \(\hat{X}\) and local subgraph nodes \(X^k\). For each client \(k\), the link prediction process is as follows:

\begin{enumerate}
    \item \textbf{Compute edge probabilities} between generated nodes and local nodes:
    \(
    \hat{P}^k = \text{\texttt{VGAE}}(\hat{X}, X^k).
    \)
    \item \textbf{Select the top \(p\%\) of edge probabilities} to form new edges:
    \(
    \mathcal{E}^k \coloneqq \mathcal{E}^k \cup \left\{ (u,v) \mid (u,v) \in \text{Top}_p(\hat{P}^k) \right\}.
    \)
    \item \textbf{Update the node set and features}:
    \(
    \mathcal{V}^k \coloneqq \mathcal{V}^k \cup \hat{\mathcal{V}}, \quad X^k \coloneqq X^k \cup \hat{X}.
    \)
\end{enumerate}

Here, \(\text{Top}_p(\hat{P}^k)\) denotes the set of edges corresponding to the highest \(p\%\) of predicted edge probabilities in \(\hat{P}^k\). By integrating these new nodes and edges into their local subgraphs, clients augment their models with previously missing neighbor information. This process is summarized in the algorithm provided in Appendix~\ref{appendix:algo}.
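
To illustrate the selection step with hypothetical numbers, suppose a client has $|\mathcal{V}^k| = 50$ local nodes and receives $|\hat{\mathcal{V}}| = 4$ generated nodes. Assume further that \(\hat{P}^k\) holds one probability per pair of generated and local nodes and that the number of selected edges is rounded up; both are assumptions made only for this example. With $p = 5$,
\[
|\hat{P}^k| = 4 \times 50 = 200, \qquad |\text{Top}_p(\hat{P}^k)| = \left\lceil \tfrac{5}{100} \times 200 \right\rceil = 10,
\]
so the $10$ highest-probability candidate edges are added to $\mathcal{E}^k$, and the node set grows from $50$ to $54$ nodes.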


Our complete pipeline, which combines generative reconstruction with federated conformal prediction, can be summarized in the following steps:
\begin{enumerate}
    \item \textbf{Local Prototype Generation:} Each client trains a local \texttt{VAE} on its node features and clusters the reconstructed features to extract representative feature prototypes.
    \item \textbf{Server Aggregation:} Cluster centers (prototypes) are sent to the server, which aggregates them and broadcasts the global set of prototypes back to all clients.
    \item \textbf{Collaborative Link Prediction:} A \texttt{VGAE} is trained via \texttt{FedAvg} to learn generalizable connectivity patterns.
    \item \textbf{Local Subgraph Completion:} Each client uses the global prototypes and the trained \texttt{VGAE} to predict missing edges and adds the corresponding nodes and edges to its local subgraph.
    \item \textbf{Federated GCN Training:} A GCN model is trained for node classification via \texttt{FedAvg} on the completed subgraphs.
    \item \textbf{Federated Conformal Prediction:} Clients use the global GCN and a held-out calibration set to compute non-conformity scores and generate prediction sets using distributed quantile estimation.
\end{enumerate}