% \documentclass{uai2023} % for initial submission
\documentclass[accepted]{uai2023} % after acceptance, for a revised
                                    % version; also before submission to
                                    % see how the non-anonymous paper
                                    % would look like

%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2023} % ptmx math instead of Computer
% Modern (has noticeable issues)
% \documentclass[mathfont=newtx]{uai2023} % newtx fonts (improves upon
 % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

\usepackage{url}            % simple URL typesetting
\usepackage{amsfonts}       % blackboard math symbols
\usepackage{nicefrac}       % compact symbols for 1/2, etc.
\usepackage{microtype}      % microtypography
\usepackage{xcolor}         % colors
\usepackage{algorithm}
\usepackage{algpseudocode}
\usepackage{float}

\usepackage{graphicx}
\usepackage{caption}
\usepackage{subcaption}

\usepackage{xspace}
\usepackage{amsthm}
\usepackage{amsmath}
% \usepackage{bm}
\usepackage{enumitem}
\usepackage{multirow}
\usepackage[normalem]{ulem}
\useunder{\uline}{\ul}{}
\usepackage{balance}
\usepackage[toc,page]{appendix}

% For math	
\usepackage{amssymb}

\hypersetup{
colorlinks   = true, %Colours links instead of ugly boxes
urlcolor     = blue, %Colour for external hyperlinks
linkcolor    = blue, %Colour of internal links
citecolor    = blue %Colour of citations
}

% customized commands
\DeclareMathOperator{\argmax}{\arg\max}
\DeclareMathOperator{\argmin}{\arg\min}
\DeclareMathOperator*{\minimize}{\text{minimize}}
\DeclareMathOperator*{\maximize}{\text{maximize}}
\DeclareMathOperator*{\st}{\text{subject to}}

\renewcommand{\algorithmicrequire}{\textbf{Input:}}
\renewcommand{\algorithmicensure}{\textbf{Output:}}
\newcommand{\modelname}{\textsf{pFedMeStruct}\xspace}

% for cross referencing the main text
% PLEASE ONLY USE xr IN THE SUPPLEMENTARY MATERIAL. 
% In the main paper, hard code any cross-reference to the supplementary material. 
\usepackage{xr} 
\externaldocument{uai2023-template}

%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example

\title{Personalized Federated Domain Adaptation for Item-to-Item Recommendation\\(Supplementary Material)}

% The standard author block has changed for UAI 2023 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1]{Ziwei Fan\thanks{Corresponding Author.}}
\author[1]{Hao Ding}
\author[1]{Anoop Deoras}
\author[2]{Trong Nghia Hoang\thanks{Mentor. Co-corresponding Author.}}
% Add affiliations after the authors
% \affil[1]{%
%     University of Illinois Chicago\\
%     Illinois, Chicago, USA
% }
\affil[1]{%
    AWS AI Labs\\
    California, Santa Clara, USA
  }
\affil[2]{%
    Washington State University\\
    Washington, Pullman, USA
}
  
  \begin{document}
  
\onecolumn %% Turn this off if single column is desired for the supplement
\maketitle


\section{Notations and Overall Model Workflow}
\label{app:a}
Table~\ref{tab:notations} lists all symbols used in the paper.

\begin{table}[!h]
\centering
\begin{tabular}{|c|l|}
\hline
Notation & Definition \\ \hline
$n$ & The number of items \\ \hline
$\mathbf{A}\in\mathbb{R}^{n\times n}$ & The adjacency matrix of item-item affinity\\ \hline
$\mathbf{D}\in\mathbb{R}^{n\times n}$ & The degree matrix of $\mathbf{A}$\\ \hline
$\mathbf{X}\in\mathbb{R}^{n\times d}$ & Feature matrix of items \\ \hline
$\mathbf{Z}\in\mathbb{R}^{n\times k}$ & Low-dimensional item embeddings \\ \hline
$\hat{\mathbf{Z}}\in\mathbb{R}^{n\times k}$ & Approximated item embeddings \\ \hline
$\mathbf{O}$ & Item-item pairs dataset\\ \hline
$\theta = (\theta_1, \theta_2)$ & GNN I2I prediction layer parameters\\ \hline
% $\phi = \left(\mathbf{W}^{(1)} \ldots \mathbf{W}^{(m)}\right)$ & GNN layer parameters \\ \hline
$\alpha = \{\alpha_a\}_{a=1}^n$ & Parameterization of prior  $p_{\alpha}(\mathbf{Z}|\mathbf{X})$ in VGAE \\ \hline
$\mathbf{m}_a$ & Mean item embeddings from $\mathrm{GNN}_{\phi_1}(\mathbf{X},\mathbf{A})$ \\ \hline
$\mathbf{v}_a$ & Variance item embeddings from $\mathrm{GNN}_{\phi_2}(\mathbf{X},\mathbf{A})$ \\ \hline
$\phi = (\phi_1, \phi_2)$ & GNN layer parameters for $\mathrm{GNN}_{\phi_1}$ and $\mathrm{GNN}_{\phi_2}$ \\ \hline
$p$ & The number of market segments \\ \hline
$\mathbf{w}_\ast$ & The global GNN parameters \\ \hline
$\boldsymbol{\theta}$ & The local GNN parameters \\ \hline
$\ell_i(\boldsymbol{\theta})$ & Local training loss \\ \hline
$\mathbf{C}_{\boldsymbol{\kappa}^i}$ & The cluster assignment weights\\ \hline
$\boldsymbol{\kappa}^i$ & The cluster embedding of cluster $i$ \\ \hline
$\mathbb{P}_{\boldsymbol{\kappa}^i}$ & Differentiable clustering operator for cluster $i$ \\ \hline
$\phi_\ast$ & Global GNN parameters\\ \hline
$\xi_\ast$ & Global GNN summarization\\ \hline
$n_{\tau}$ & Number of global updates\\ \hline
$n_r$ & Number of local updates\\ \hline 
$\lambda_w$ & Local regularization weights on GNN parameters\\ \hline 
$\lambda_s$ & Local regularization weights on summarization\\ \hline
\end{tabular}
\caption{Notation Table}
\label{tab:notations}
\end{table}


% \section{Overall Model Workflow}
% \begin{figure*}[]
% \centering
% \includegraphics[width=\textwidth]{figures/architecture/pf-gnn_architecture.pdf}
% \caption{Workflow Diagram. Each market consists of a graph encoder, which generates item embeddings $\bf{Z}$, the graph summarization process, the associated reverse operator, and a decoder for item-item relationship prediction. Each market shares and communicates $\phi$ (\textit{i.e.,} learnable parameters of the encoder) and $\xi$ (\textit{i.e.,} graph summarized structural information) with the server. Algorithm 1 denotes the server optimization and Algorithm 2 describes the client~(market) optimization. }
% \label{fig:architecture}
% \end{figure*}

% \section{Data Statistics}
% \label{app:b}
% The data\footnote{The dataset is publicly available at \url{https://xmrec.github.io/}.} statistics of Electronics domain are reported in Table~\ref{tab:1}.
% \begin{table*}[h!]
% 	\centering
% 	\begin{sc}
% 	\begin{small}
% 		\begin{tabular}{|l|l|l|l|l|l|}
% 			\hline
% 			{\bf Arabia} & \hspace{-0.5mm}{\bf China} & \hspace{-0.5mm}{\bf Australia} & \hspace{-0.5mm}{\bf Japan} & \hspace{-0.5mm}{\bf France} & \hspace{-0.5mm}{\bf Spain}\hspace{-1mm}\\
% 			\hline
% 			$328$ / $6440$ & \hspace{-0.5mm}$1303$ / $2087$ & \hspace{-0.5mm}$2390$ / $4834$ & \hspace{-0.5mm}$4003$ / $41861$ & \hspace{-0.5mm}$6068$ / $1451380$ & \hspace{-0.5mm}$6572$ / $109166$\hspace{-1mm}\\
% 			\hline
% 			{\bf Germany} & {\bf UK} & {\bf India} & {\bf Mexico} & {\bf Canada} & {\bf United-States}\hspace{-1mm}\\
% 			\hline
% 			$7507$ / $159154$ & \hspace{-0.5mm}$10329$ / $441033$ & \hspace{-0.5mm}$6574$ / $23869$ & \hspace{-0.5mm}$8507$ / $139783$ & \hspace{-0.5mm}$18604$ / $400825$ & \hspace{-0.5mm}$35939$ / $2048177$\hspace{-1mm}\\
% 			\hline
% 			%\bottomrule          
% 		\end{tabular}
% 	\end{small}
% 	\end{sc}
% 	\caption{Data statistics (no. active items / no. unique interaction pairs) across different market segments. The numbers of active items and unique item-item interaction pairs among them are reported.}\vspace{-2mm}
% 	\label{tab:1}
% \end{table*}


\section{Results with Error Bars}
\label{sec:results_with_std}
To quantify the confidence of the results reported in the main text, we repeat all experiments $5$ times under the best hyper-parameter settings and report the standard deviation in Fig.~\ref{fig:Electronics_repeat}. We omit baselines with error bars to avoid cluttering the plot. The reported deviations are relatively small, which suggests that the empirical conclusions drawn in the main text hold with high confidence. For most markets, {\bf PF-GNN} and {\bf PF-GNN$+$} achieve the best performance over all metrics. We also observe that {\bf PF-GNN$+$} consistently outperforms all baseline models, which supports our hypothesis that accounting for structural information is crucial for capturing and adapting domain knowledge in GNN modeling.
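
As a point of reference, a standard way to summarize $R = 5$ repeated runs is sketched below. The symbols $m_r$ (metric value of run $r$), $\bar{m}$, and $s$ are introduced here for illustration only, and the $1/(R-1)$ normalization is one common convention that we assume in this sketch (the $1/R$ variant is equally common):
\begin{align*}
\bar{m} &= \frac{1}{R}\sum_{r=1}^{R} m_r, &
s &= \sqrt{\frac{1}{R-1}\sum_{r=1}^{R}\left(m_r - \bar{m}\right)^2}.
\end{align*}
Under this reading, each error bar spans $\bar{m} \pm s$ for the corresponding market and metric.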

\begin{figure*}[!ht]
\centering
\begin{tabular}{cccc}
\hspace{-3mm}\includegraphics[width=0.25\linewidth]{figures/repeat_numbers_figures/low_resource_markets_flmodels_mrr20_Electronics_repeat.pdf} & \hspace{-4mm}\includegraphics[width=0.25\linewidth]{figures/repeat_numbers_figures/low_resource_markets_flmodels_ndcg20_Electronics_repeat.pdf}&
\hspace{-4mm}\includegraphics[width=0.25\linewidth]{figures/repeat_numbers_figures/high_resource_markets_flmodels_mrr20_Electronics_repeat.pdf} &
\hspace{-3mm}\includegraphics[width=0.25\linewidth]{figures/repeat_numbers_figures/high_resource_markets_flmodels_ndcg20_Electronics_repeat.pdf}\\
(a) & (b) & (c) & (d)
\end{tabular}
\caption{{\bf MRR@20} and {\bf NDCG@20} recommendation metrics re-plotted with standard deviations. The plots report the performance of our {\bf PF-GNN} and {\bf PF-GNN$+$} algorithms and of other baselines across different market segments. All plots are best viewed in color. All results are averaged over $5$ independent runs.}%\vspace{-5mm}
\label{fig:Electronics_repeat}
\end{figure*}

\section{Results on Home \& Kitchen Domain}
\label{sec:res_home}

We also conduct experiments on another large product domain of the Cross-Market Dataset, \emph{Home and Kitchen}.
The entire set of results\footnote{Note that the \emph{Home and Kitchen} and \emph{Electronics} domains have different sets of markets because \emph{Home and Kitchen} has more missing markets, and we also filter out markets with no more than 100 item-item interactions.} is reported in Fig.~\ref{fig:MRR_Home}. Similar to our observations on the \emph{Electronics} domain, the results on \emph{Home and Kitchen} show that {\bf PF-GNN} is more robust and performs better than both Local GNN and Federated GNN in all market segments over all metrics. These results corroborate our earlier results on the \emph{Electronics} domain and showcase the robustness of the proposed method across different product categories.

\begin{figure*}[ht]
     \centering
     \begin{subfigure}[b]{0.25\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/federated_models_figures/low_resource_markets_flmodels_mrr20_Home.pdf}
         \caption{}
         \label{fig:low_re_baselines_mrr}
     \end{subfigure}\hfill
     \begin{subfigure}[b]{0.25\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/federated_models_figures/low_resource_markets_flmodels_ndcg20_Home.pdf}
         \caption{}
         \label{fig:mrr_baby}
     \end{subfigure}\hfill
     \begin{subfigure}[b]{0.25\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/federated_models_figures/high_resource_markets_flmodels_mrr20_Home.pdf}
         \caption{}
         \label{fig:mrr_tools}
     \end{subfigure}\hfill
     \begin{subfigure}[b]{0.25\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/federated_models_figures/high_resource_markets_flmodels_ndcg20_Home}
         \caption{}
         \label{fig:mrr_music}
     \end{subfigure}
    \vspace{-3mm}
        \caption{Comparison of the {\bf MRR@20} and {\bf NDCG@20} recommendation metrics between our personalized federated domain adaptation algorithm, {\bf PF-GNN}, and other baselines across different markets on the \emph{Home and Kitchen} domain. To avoid cluttering the plots, we split the performance report of all baselines into smaller plots. The first two plots report the {\bf MRR@20} and {\bf NDCG@20} performance of all algorithms in the first $6$ market segments, while the next two plots report the remaining segments. All plots are best viewed in color.}
    \label{fig:MRR_Home}
\end{figure*}

% \section{Market Heterogeneity Analysis}
% \label{app:het_analysis}
% Due to the different distribution of these interactions across market segments, the locally induced (latent) item embeddings and interaction structures are substantially heterogeneous, as visually demonstrated in Fig.~\ref{fig:extra}a and Fig.~\ref{fig:extra}b below.

% \begin{figure*}[!ht]
%      \centering
%      \begin{subfigure}[b]{0.3\textwidth}
%          \centering
%          \includegraphics[width=1\textwidth]{figures/heterogeneity_analysis_figures/feature_hetero_Electronics.pdf}\vspace{-4mm}
%          \caption{}\vspace{-3mm}
%          \label{fig:hetero_feat_elect}
%      \end{subfigure}\hfill
%      \begin{subfigure}[b]{0.3\textwidth}
%          \centering
%          \includegraphics[width=1\textwidth]{figures/heterogeneity_analysis_figures/structural_hetero_Electronics.pdf}\vspace{-4mm}
%          \caption{}\vspace{-3mm}
%          \label{fig:hetero_struct_elect}
%      \end{subfigure}\hfill
%      \begin{subfigure}[b]{0.3\textwidth}
%          \centering
%          \includegraphics[width=1\textwidth]{figures/transferring_perf_figures/comparewith_indsgcn_Electronics.pdf}\vspace{-4mm}
%          \caption{}\vspace{-3mm}
%          \label{fig:indsgcn_compare_elect}
%      \end{subfigure}
% \caption{Plots of {\bf (a)} item embedding heterogenity across markets; {\bf (b)} item interaction heterogenity between across market segments; and {\bf (c)} negative effect of naive transfer of a pre-trained recommendation model on one segment to another (best view with color). The calculation of these heterogenity and (negative) naive transfer scores are detailed in Section~\ref{sec:hetero}.}\vspace{-2mm}
% \label{fig:extra}
% \end{figure*}

% \begin{figure*}[h!]
% \begin{tabular}{ccc}
% \centering
% \hspace{-3mm}\includegraphics[width=0.33\linewidth]{figures/heterogeneity_analysis_figures/feature_hetero_Electronics.pdf} & \hspace{-4mm}\includegraphics[width=0.33\linewidth]{figures/heterogeneity_analysis_figures/structural_hetero_Electronics.pdf} &
% \hspace{-4mm}\includegraphics[width=0.33\linewidth]{figures/transferring_perf_figures/comparewith_indsgcn_Electronics.pdf}
% \vspace{-2mm}\\
% %\hspace{-4mm}\includegraphics[width=0.25\linewidth]{figures/convergence_figures/pfedme_Electronics.pdf} &
% %\hspace{-3mm}\includegraphics[width=0.25\linewidth]{figures/convergence_figures/pfedmestruct_Electronics.pdf}\\
% (a) & (b) & (c)\vspace{-1mm}
% \end{tabular}
% \caption{Plots of {\bf (a)} item embedding heterogenity across markets; {\bf (b)} item interaction heterogenity between across markets; and {\bf (c)} negative effect of naive transfer of a pre-trained recommendation model on one market to another (best view with color). The calculation of these heterogenity and (negative) naive transfer scores are detailed in Section~\ref{sec:data}.}\vspace{-4mm}
% \label{fig:extra}
% \end{figure*}
% To be specific, we compute the locally induced item embeddings via optimizing Eq.~\eqref{eq:9} for each market segment. Cosine similarities between observed pairs of interacting items (e.g., items that are reported to be frequently bought together) across all segment are computed and then partitioned into discrete bins. Each segment thus induces a categorical distribution over bins, and the feature heterogeneity between two markets is set to be the Jensen-Shannon divergence between their induced distributions. This is visualized in Fig.~\ref{fig:extra}a, which shows moderate heterogeneity. 

% However, the heterogeneity becomes significantly more pronounced as we look into the interaction structure. Following the approach described in \citep{DBLP:journals/corr/abs-1805-11921}, we sample random walks along the edges of local item-interaction graphs. The sampled walks across all market segments are partitioned into clusters, enabling us to represent a segment in terms of a categorical distribution over a common space of random walks. The difference between two market segments can then be computed as the divergence between two corresponding distributions over random walks. This is visually reported in Fig.~\ref{fig:extra}b. As a result, this high degree of heterogeneity has rendered the naive transfer of a pre-trained model from one segment to another ineffective, as shown in Fig.~\ref{fig:extra}c. This motivates us to consider a personalized federated learning solution to this problem where recommendation insights across segments are harnessed, exchanged, and communicated concurrently, resulting in better recommendations (see the empirical results presented in our main text).

\section{Experiment Setup}
\label{sec:hyper_config}
All experiments were conducted on a machine with $8$ V100 GPUs. For all GNN baselines, the GNN is parameterized with $3$ layers of the Simplified Graph Convolution Network \citep{DBLP:journals/corr/abs-1902-07153}, which map an item's $768$-dimensional feature vector to a $128$-dimensional embedding vector. 
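
For reference, the standard propagation rule of the Simplified Graph Convolution (SGC) network of \citet{DBLP:journals/corr/abs-1902-07153} is sketched below in the notation of Table~\ref{tab:notations}. The symbols $\tilde{\mathbf{A}}$, $\tilde{\mathbf{D}}$, $\mathbf{S}$, $\mathbf{W}$, and $\mathbf{H}$ are introduced here for illustration only, and the sketch assumes the usual self-loops and symmetric normalization; the exact variant used in the experiments may differ in its details:
\begin{align*}
\tilde{\mathbf{A}} &= \mathbf{A} + \mathbf{I}_n, &
\mathbf{S} &= \tilde{\mathbf{D}}^{-1/2}\,\tilde{\mathbf{A}}\,\tilde{\mathbf{D}}^{-1/2}, &
\mathbf{H} &= \mathbf{S}^{3}\,\mathbf{X}\,\mathbf{W},
\end{align*}
where $\tilde{\mathbf{D}}$ is the degree matrix of $\tilde{\mathbf{A}}$, the exponent $3$ corresponds to the $3$ propagation layers, and $\mathbf{W}\in\mathbb{R}^{768\times 128}$ is the only learnable matrix, with $768 \times 128 = 98{,}304$ entries. The output $\mathbf{H}\in\mathbb{R}^{n\times 128}$ holds one $128$-dimensional vector per item.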

We perform a \textbf{grid search} over the important model parameters: the learning rate, which varies within $\{0.1, 0.01, 0.001\}$; the feature aggregation adaptation parameter $\lambda_w$ in Eq.~(12), which varies within $\{0.01, 0.1, 1, 10, 100\}$ and controls the importance of adapting feature aggregation via personalizing $\phi$; and the structure adaptation parameter $\lambda_s$, which varies within $\{0.01, 0.1, 1, 10\}$ and moderates the relative importance of item-item interaction structure adaptation via $\xi$ (see the last term in Eq.~(12)). The best parameter configurations are selected based on their performance in the US market segment.
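
To give a sense of the search cost, and assuming the search covers the full Cartesian product of the ranges listed above (an assumption of this sketch; no pruning of combinations is modeled), the number of candidate configurations is
\begin{align*}
\text{{\bf PF-GNN}:}\quad & 3 \times 5 = 15,\\
\text{{\bf PF-GNN$+$}:}\quad & 3 \times 5 \times 4 = 60,
\end{align*}
where the factors count the learning-rate values, the $\lambda_w$ values, and (for {\bf PF-GNN$+$} only) the $\lambda_s$ values, respectively.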

In particular, the best configuration of {\bf PF-GNN} uses a learning rate of $0.1$ and $\lambda_w = 1000$. For {\bf PF-GNN$+$}, which additionally involves the structure adaptation moderator $\lambda_s$, the best configuration uses the same learning rate with $\lambda_w = 1$ and $\lambda_s = 0.01$. %\textcolor{blue}{Note that the best configuration is different from the best values shown in Fig.~(\ref{fig:sensitivity}) for Sensitivity Analysis. The reason is the best values shown in sensitivity analysis are based on the testing averaged performance on all markets while our best reported configuration is based on the performance of US market.}


\begin{figure}[!ht]
% \begin{tabular}{cc}
\centering
     \begin{subfigure}[b]{0.46\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/convergence/pfedme_Electronics.pdf}
         \caption{{\bf PF-GNN}}
         \label{fig:pfedme_Electronics_converge}
     \end{subfigure}\hfill
     \begin{subfigure}[b]{0.46\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/convergence/pfedmestruct_Electronics.pdf}
         \caption{{\bf PF-GNN+}}
         \label{fig:pfedmestruct_Electronics_converge}
     \end{subfigure}
% \includegraphics[width=0.4\linewidth]{figures/convergence/pfedme_Electronics.pdf} &
% \hspace{25mm}\includegraphics[width=0.4\linewidth]{figures/convergence/pfedmestruct_Electronics.pdf}\\
% (a) {\bf PF-GNN} & \hspace{25mm}(b) {\bf PF-GNN+}
% \end{tabular}
\caption{Empirical convergence of {\bf PF-GNN} and {\bf PF-GNN+} on the \emph{Electronics} domain.}%\vspace{-5mm}
\label{fig:convergence}
\end{figure}

\section{Empirical Convergence Analysis}
\label{sec:convergence}
We report the empirical convergence of both {\bf PF-GNN} and {\bf PF-GNN+} in Fig.~\ref{fig:convergence}. Both models converge as the number of global communication rounds increases, demonstrating that the bi-level optimization can minimize the loss even when heterogeneity exists across market segments. Comparing the two, {\bf PF-GNN} fluctuates more than {\bf PF-GNN+}. Moreover, {\bf PF-GNN+} converges faster than {\bf PF-GNN}, which underlines the importance of modeling statistical structural information in each market segment's item-item graph. 

\section{Sensitivity Analysis}
\label{app:sensitivity}
For empirical thoroughness, we also investigate how our proposed adaptation of the GNN model parameters and of the graph summary influences overall performance. As formulated in our proposed structural optimization loss in Phase $3$, we use $\lambda_w$ to control the adaptation degree of the GNN model parameters and $\lambda_s$ to moderate the adaptation degree of the graph-summarized structural information. The sensitivity trends of $\lambda_w$ and $\lambda_s$ are plotted in Fig.~\ref{fig:sensitivity}, where we report the model performance for different values of $\lambda_w$ ($\lambda_s$) while fixing the other at $1$. We observe that with a fixed $\lambda_s = 1$, the best averaged MRR@20 (over all market segments) is achieved when $\lambda_w = 1$. Both increasing and decreasing $\lambda_w$ substantially reduce the recommendation performance in terms of MRR@20 (Fig.~\ref{fig:sensitivity}a). Likewise, we observe the same behavior for $\lambda_s$ while fixing $\lambda_w = 1$ in Fig.~\ref{fig:sensitivity}b. The peaked shapes of both plots suggest that the model performance depends substantially on setting $\lambda_w$ and $\lambda_s$ well. These observations, however, do not imply that the best configuration for $(\lambda_w,\lambda_s)$ is $(1, 1)$. Instead, they imply that, with one parameter fixed, over- or under-emphasizing the other toward either extreme of the value range reduces the performance. To find the optimal configuration for $(\lambda_w,\lambda_s)$, we adopt the grid search reported in Appendix~\ref{sec:hyper_config}.
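
The roles of the two weights can be illustrated with a schematic local objective. This is only a sketch: it assumes a pFedMe-style quadratic proximity penalty with a factor of $1/2$, and writing the local loss as $\ell_i(\phi, \xi)$ is a notational assumption; the exact objective is Eq.~(12) of the main text, which this sketch does not reproduce:
\begin{equation*}
\min_{\phi,\,\xi}\;\; \ell_i(\phi, \xi) \;+\; \frac{\lambda_w}{2}\,\bigl\|\phi - \phi_\ast\bigr\|_2^2 \;+\; \frac{\lambda_s}{2}\,\bigl\|\xi - \xi_\ast\bigr\|_2^2 .
\end{equation*}
Under this reading, a very large $\lambda_w$ (or $\lambda_s$) ties the local $\phi$ (or $\xi$) to its global counterpart and leaves little room for personalization, whereas a very small value lets the local solution drift toward purely local training. This is consistent with the peaked sensitivity curves discussed above.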

\begin{figure}[!ht]
% \begin{tabular}{cc}
\centering
     \begin{subfigure}[b]{0.5\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/sensitivity_figures/mrr20_over_lambdaw.pdf}\vspace{-2mm}
         \caption{Sensitivity to $\lambda_w$}\vspace{-2mm}
         \label{fig:lambdaw_sensitivity}
     \end{subfigure}\hfill
     \begin{subfigure}[b]{0.5\textwidth}
         \centering
         \includegraphics[width=1\textwidth]{figures/sensitivity_figures/mrr20_over_lambdas.pdf}\vspace{-2mm}
         \caption{Sensitivity to $\lambda_s$}\vspace{-2mm}
         \label{fig:lambdas_sensitivity}
     \end{subfigure}
% \includegraphics[width=0.4\linewidth]{figures/convergence/pfedme_Electronics.pdf} &
% \hspace{25mm}\includegraphics[width=0.4\linewidth]{figures/convergence/pfedmestruct_Electronics.pdf}\\
% (a) {\bf PF-GNN} & \hspace{25mm}(b) {\bf PF-GNN+}
% \end{tabular}
\caption{Performance sensitivity (averaged over all markets) with respect to variation in (a) $\lambda_w$, which moderates the adaptation degree of the learnable parameters $\phi$; and (b) $\lambda_s$, which regulates the adaptation degree of the graph summary $\xi$.}\vspace{-6mm}
\label{fig:sensitivity}
\end{figure}


\begin{table}[!ht]
\centering
\caption{Overall Performance Comparison on top-10 ranking results.}
\label{tab:result_10}
\resizebox{\textwidth}{!}{%
\begin{tabular}{@{}ccccccccccc@{}}
\toprule
Market & Metric & Popularity & Siamese & FeatMLP & SLIM & SPE & Local GNN & F-GNN & PF-GNN & PF-GNN+ \\ \midrule
sa & MRR@10 & 0.07121 & 0.02032 & 0.07867 & 0.09213 & 0.01873 & 0.14561 & 0.13379 & 0.11706 & 0.15806 \\
sa & NDCG@10 & 0.07209 & 0.04201 & 0.08922 & 0.09715 & 0.04615 & 0.15745 & 0.15857 & 0.14597 & 0.18075 \\
cn & MRR@10 & 0.04896 & 0.05084 & 0.04918 & 0.02083 & 0.00672 & 0.09067 & 0.07899 & 0.08927 & 0.10807 \\
cn & NDCG@10 & 0.05756 & 0.06330 & 0.04844 & 0.01523 & 0.02256 & 0.09087 & 0.09542 & 0.10324 & 0.11434 \\
au & MRR@10 & 0.01451 & 0.06060 & 0.02300 & 0.00562 & 0.00923 & 0.03692 & 0.05314 & 0.06654 & 0.07477 \\
au & NDCG@10 & 0.01350 & 0.07189 & 0.02211 & 0.00527 & 0.02011 & 0.03705 & 0.05558 & 0.07105 & 0.07563 \\
jp & MRR@10 & 0.00992 & 0.03293 & 0.03654 & 0.01360 & 0.00320 & 0.04876 & 0.05916 & 0.07132 & 0.07497 \\
jp & NDCG@10 & 0.01240 & 0.04040 & 0.04130 & 0.01139 & 0.00638 & 0.05068 & 0.05957 & 0.07490 & 0.07521 \\
fr & MRR@10 & 0.01710 & 0.03959 & 0.02461 & 0.04945 & 0.00961 & 0.05625 & 0.06992 & 0.08676 & 0.09448 \\
fr & NDCG@10 & 0.01701 & 0.05525 & 0.02255 & 0.03485 & 0.01890 & 0.05431 & 0.06632 & 0.08324 & 0.08734 \\
es & MRR@10 & 0.01092 & 0.03319 & 0.01939 & 0.03555 & 0.01599 & 0.04764 & 0.05681 & 0.06777 & 0.07312 \\
es & NDCG@10 & 0.01289 & 0.04591 & 0.02042 & 0.02664 & 0.02855 & 0.04859 & 0.05728 & 0.06985 & 0.07606 \\
de & MRR@10 & 0.01258 & 0.04968 & 0.01662 & 0.04294 & 0.01053 & 0.05372 & 0.07774 & 0.09444 & 0.09654 \\
de & NDCG@10 & 0.01864 & 0.05698 & 0.01028 & 0.03027 & 0.01904 & 0.05236 & 0.07399 & 0.09087 & 0.09200 \\
uk & MRR@10 & 0.01465 & 0.02685 & 0.01911 & 0.04314 & 0.02097 & 0.05711 & 0.08305 & 0.09484 & 0.09499 \\
uk & NDCG@10 & 0.01556 & 0.03573 & 0.01888 & 0.03213 & 0.02855 & 0.06287 & 0.08268 & 0.09510 & 0.09512 \\
in & MRR@10 & 0.01269 & 0.03687 & 0.01350 & 0.00883 & 0.01455 & 0.04061 & 0.07307 & 0.08228 & 0.10007 \\
in & NDCG@10 & 0.01285 & 0.04445 & 0.01297 & 0.00583 & 0.02657 & 0.03999 & 0.06615 & 0.07736 & 0.09240 \\
mx & MRR@10 & 0.01283 & 0.02970 & 0.01443 & 0.01756 & 0.03063 & 0.03658 & 0.05520 & 0.06705 & 0.07755 \\
mx & NDCG@10 & 0.01080 & 0.03583 & 0.01299 & 0.01044 & 0.04197 & 0.03398 & 0.05089 & 0.06123 & 0.07039 \\
ca & MRR@10 & 0.00953 & 0.02444 & 0.01324 & 0.02483 & 0.03697 & 0.03697 & 0.05652 & 0.06757 & 0.07431 \\
ca & NDCG@10 & 0.00791 & 0.03232 & 0.01206 & 0.01698 & 0.05350 & 0.03662 & 0.05511 & 0.06708 & 0.07287 \\
us & MRR@10 & 0.00560 & 0.01891 & 0.01328 & 0.05168 & 0.02225 & 0.03488 & 0.04932 & 0.06500 & 0.07764 \\
us & NDCG@10 & 0.00467 & 0.02799 & 0.01196 & 0.03676 & 0.03196 & 0.03330 & 0.04685 & 0.06105 & 0.07220 \\ \bottomrule
\end{tabular}%
}
\end{table}
\section{Overall Results on Top-10}
We report additional overall ranking results for top-10 recommendation in Table~\ref{tab:result_10}. The top-10 results lead to observations similar to those for the top-20 results presented in the main text.
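
For reference, the two metrics at a cutoff $K \in \{10, 20\}$ are commonly defined as sketched below. The sketch assumes binary relevance and uniform averaging over queries. The symbols $\mathcal{Q}$ (set of query items), $r_q$ (rank of the first relevant item returned for query $q$), and $\mathrm{rel}_{q,j}\in\{0,1\}$ (relevance of the item at rank $j$) are introduced here for illustration only; the evaluation protocol of the main text takes precedence:
\begin{align*}
\mathrm{MRR}@K &= \frac{1}{|\mathcal{Q}|}\sum_{q\in\mathcal{Q}} \frac{\mathbf{1}[r_q \le K]}{r_q},\\
\mathrm{DCG}@K(q) &= \sum_{j=1}^{K}\frac{\mathrm{rel}_{q,j}}{\log_2(j+1)},\\
\mathrm{NDCG}@K &= \frac{1}{|\mathcal{Q}|}\sum_{q\in\mathcal{Q}}\frac{\mathrm{DCG}@K(q)}{\mathrm{IDCG}@K(q)},
\end{align*}
where $\mathrm{IDCG}@K(q)$ is the $\mathrm{DCG}@K$ of the ideal ordering for query $q$. A query with no relevant item in the top $K$ contributes $0$ to both averages.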


\section{Related Work}
\label{app:related}

\subsection{Item-to-Item Recommendation}
\label{sec:i2i_rec}
Item-to-item~(I2I) recommendation is a crucial component of recommender systems. It has several widely deployed scenarios, including \emph{you may also like} on e-commerce homepages and \emph{because you watched} in video-streaming services. Existing I2I recommendation work uses item metadata and IDs to infer item embeddings and proposes novel distance metrics for evaluating item-item affinity. One representative work is semi-parametric embedding~(SPE)~\citep{Li2019}, which adopts a mixture of ID embedding and item metadata to infer the item embedding. A pioneering work in this direction is SLIM~\citep{Karypis2011}, which models the item-item correlation weight matrix via collaborative filtering but does not account for item metadata or higher-order interactions. Graph Neural Networks~(GNNs), which have demonstrated superiority in modeling high-order connectivity information in graph data, have recently been adopted to boost the performance of item recommendation. Several GNN-based recommendation models have been proposed, most notably NGCF~\citep{wang2019neural} and LightGCN~\citep{he2020lightgcn,mao2021ultragcn}. However, these methods assume a centralized graph data storage, which is often impractical when sharing transaction information across separate market segments is not allowed.

\subsection{Federated Learning for GNNs}
\label{sec:fed}
Federated Learning~(FL) makes it possible to train a global model with decentralized data privately owned by multiple clients. The pioneering work in this direction is FedAvg~\citep{McMahan17}, which assumes that all local datasets are independent and identically distributed. In practice, however, this assumption is often violated when the clients collect data from heterogeneous environments. For example, in recommendation, the item-item graphs acquired from different markets are often generated by heterogeneous preferential behaviors over a wide range of user demographics. To accommodate this, personalized FL has recently been proposed, which learns both a global model on the server and corresponding personalized models hosted at each client node. 

Notably, Per-FedAvg~\citep{fallah2020personalized} formulates personalized FL following the model-agnostic meta-learning setup, which opens up potential applications to domain adaptation. Alternatively, pFedMe~\citep{DBLP:journals/corr/abs-2006-08848} extends FedAvg with an additional bi-level optimization regularization that moderates the deviation between each client model and the global model. Clustered FL~\citep{sattler2020clustered} proposes to cluster the clients so that clients in the same cluster follow similar data distributions. However, most existing personalized FL works assume a homogeneous, centralized model specification. This is not suitable for graph-based models in item-to-item recommendation scenarios, where a part of the model specification is the graph, which is not fully visible to each client. In fact, each client only has access to a sub-graph of the entire item-item graph due to strict regulations concerning the storing and sharing of customer data. As such, most existing personalized FL methods cannot be applied straightforwardly to our setting.
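
For concreteness, the bi-level objective of pFedMe~\citep{DBLP:journals/corr/abs-2006-08848} mentioned above can be sketched as follows for $p$ clients. The symbols $F_i$, $\boldsymbol{\theta}_i$, and $\lambda$ are used here for illustration, and uniform client weighting is assumed:
\begin{equation*}
\min_{\mathbf{w}}\; \frac{1}{p}\sum_{i=1}^{p} F_i(\mathbf{w}),
\qquad
F_i(\mathbf{w}) \;=\; \min_{\boldsymbol{\theta}_i}\;\Bigl\{\, \ell_i(\boldsymbol{\theta}_i) + \frac{\lambda}{2}\,\bigl\|\boldsymbol{\theta}_i - \mathbf{w}\bigr\|_2^2 \Bigr\}.
\end{equation*}
The inner problem yields the personalized model $\boldsymbol{\theta}_i$ of client $i$, while the outer problem updates the global model $\mathbf{w}$. The weight $\lambda$ controls how far each personalized model may deviate from the global one.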

There have also been several proposals of federated learning for GNNs with decentralized graph data in recent years, most notably GCFL+~\citep{xie2021federated} and FedGNN~\citep{wu2021fedgnn}. However, GCFL+ focuses on the graph classification task, which assumes that local graphs are complete graphs instead of fragments of a global graph (as is the case in our setting). Therefore, the proposed GCFL+ solution is applicable to GNNs that are parameterized only by the feature aggregation weights, treating the graph as part of the input instead of part of the model specification. This does not apply to our scenario, where local graphs need to be merged (without being shared explicitly) into a global graph that is part of the federated model specification. FedGNN, on the other hand, motivates the development of a federated user-centric recommender via GNNs. Nonetheless, its encryption mechanism is discrete in nature and cannot be readily integrated into the gradient-based optimization framework of personalized federated learning. In our experiments, it was adapted into our baseline {\bf F-GNN}, which ignores the encryption mechanism.

Similarly, there are also several federated learning approaches for recommendation that use local graph data from client nodes, most notably DeepRec~\citep{han2021deeprec} and MetaMF~\citep{lin2020meta}. However, like GCFL+, these works do not focus on constructing global graphs from local fragments. The proposed solutions also focus exclusively on building a common recommendation model rather than personalized models that are specifically tailored to the different user distributions constituting different market segments. As such, these works also do not apply straightforwardly to our scenario. We therefore believe that our work on federated domain adaptation for item-to-item recommendation is the first to explore a combination of personalized FL and GNN models, which are parameterized by both (1) the graph that characterizes local interactions between feature components and (2) the combination weights that aggregate them.

\bibliography{fan_67}

\end{document}
