% \documentclass{uai2023} % for initial submission
\documentclass[accepted]{uai2023} % after acceptance, for a revised
                                    % version; also before submission to
                                    % see how the non-anonymous paper
                                    % would look like
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2023} % ptmx math instead of Computer
                                         % Modern (has noticable issues)
% \documentclass[mathfont=newtx]{uai2023} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent

\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)
%*************************
% Packages added by the authors
\usepackage{algorithm}
\usepackage{algpseudocode}
\usepackage{amsmath}
\usepackage{stfloats}
\usepackage{color}
\usepackage{graphicx}
\usepackage{subfigure}
\usepackage{longtable}
\usepackage{makecell}
\usepackage{enumitem}
\usepackage{amssymb}
\usepackage{multirow}
\usepackage{svg} 
\usepackage{amsthm}
\usepackage{diagbox}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{definition}{Definition}

%********************************


%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example

\title{Practical Privacy-Preserving Gaussian Process Regression via Secret Sharing}

% The standard author block has changed for UAI 2023 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1,2]{Jinglong Luo}
\author[2]{Yehong Zhang\thanks{Corresponding author}}
\author[2]{Jiaqi Zhang}
\author[2]{Shuang Qin}
\author[2]{Hui Wang}
\author[2]{Yue Yu}
\author[1,2]{Zenglin Xu$^*$}
% Add affiliations after the authors
\affil[1]{%
    Harbin Institute of Technology, Shenzhen, China
}
\affil[2]{%
    Peng Cheng Laboratory, Shenzhen, China
}
% \affil[*]{%
%     Correspondence to: Yehong Zhang $<$zhangyh02@pcl.ac.cn$>$ and Zenglin Xu $<$ $>$
% }


\begin{document}
\maketitle

\begin{abstract}
  \emph{Gaussian process regression} (GPR) is a non-parametric model that has been used in many real-world applications that involve sensitive personal data (e.g., healthcare and finance) from multiple data owners.
%
To fully and securely exploit the value of different data sources, this paper proposes a privacy-preserving GPR method based on \emph{secret sharing} (SS), a \emph{secure multi-party computation} (SMPC) technique.
In contrast to existing studies that protect the data privacy of GPR via homomorphic encryption, differential privacy, or federated learning, our proposed method is more practical and can be used to preserve the data privacy of both the model inputs and outputs for various data-sharing scenarios (e.g., horizontally/vertically-partitioned data).
%
However, it is non-trivial to apply SS directly to the conventional GPR algorithm, as it includes some operations whose accuracy and/or efficiency are not well supported by current SMPC protocols. To address this issue, we derive a new SS-based exponentiation operation through the idea of ``confusion-correction'' and construct an SS-based matrix inversion algorithm based on Cholesky decomposition.
%
More importantly, we theoretically analyze the communication cost and the security of the proposed SS-based operations. 
Empirical results
%on two real-world datasets
show that our proposed method can achieve reasonable accuracy and efficiency under the premise of preserving data privacy.
\end{abstract}

\section{Introduction}\label{sec:intro}
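% Paragraph 1: first introduce what a GP is in 1-2 sentences --> introduce scenarios where GP needs data from multiple parties (give 2-3 examples, preferably matching the data used in our experiments) --> discuss the privacy-preservation issues in these scenarios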
\emph{Gaussian process regression} (GPR) \citep{Rasmussen2006,gp1,gp3,gp2,zhang2016near,ZheXCQP15} is a Bayesian non-parametric model that has been widely used in various real-world applications such as disease progression prediction \citep{ortmann2019automated,shashikant2021gaussian}, traffic prediction \citep{chen2015gaussian}, and finance \citep{yang2015gaussian}.
%
In practice, the data of the above applications may belong to different parties and cannot be shared directly due to the increasing privacy concerns in the \emph{machine learning} (ML) community.
% \cite{TODO}.
%
%Gaussian Process (GP) is a commonly used supervised machine learning method designed to solve regression problems and probabilistic classification problems. To achieve a high-performance GP model, a large amount of high-quality training data is required. However, the training data tend to show up in the form of “isolated data island”, because training data not only are a kind of valuable asset of data holders, but also carry sensitive personal information whose transmission and exploitation should comply with certain privacy and confidentiality terms in laws.
%
For example, two hospitals that each have a small amount of patient data would like to jointly construct a high-quality GPR model for better disease progression prediction. However, such data usually contain patients' personal information and cannot be shared between hospitals due to legal regulations. In addition, when other hospitals or patients consider using the constructed GPR model for diagnosis, privacy leakage of the personal features (i.e., test input) and the diagnostic result (i.e., model output) is also a concern.
%
Similarly, in finance, a bank that owns users' financial behaviors (e.g., income and credit) may hope to build a GPR model for risk control prediction by exploiting the users' consumption behaviors held by e-commerce companies.
%banks only have the user's attributes, income and expenditure behavior, credit behavior and other financial behavior characteristics, while the e-commerces have the user's consumption behavior characteristics. Banks hope to build a high-quality GPR model by using the characteristics of user consumption behavior possessed by e-commerce to improve their risk control capabilities.
Obviously, neither the financial nor the consumption behaviors of the users should be shared directly, since both are highly sensitive.
%These characteristics of consumption behavior are also difficult to share directly due to commercial interests and legal supervision. In addition, when using the constructed GPR model for risk prediction, user personal characteristics and the risk prediction result are also expected to be protected.

The need for information sharing in the above examples has motivated the development of a practical GPR model that can preserve the data privacy of both the model inputs and outputs in three data-sharing scenarios (Fig.~\ref{fig:datashare}): (a) \emph{Horizontal data-sharing} (HDS): Each party has a set of data for different entities with the same features and shares them for model construction;
%the data used for constructing the GPR model is example-partitioned among multiple data holders such that each data holder has a set of data with the same feature;
(b) \emph{Vertical data-sharing} (VDS): Each party has different features of the same set of entities and shares them for model construction;
%the data used for constructing the GPR model is feature-partitioned among multiple data holders such that each party has different feature sets of the same set of entities.
and (c) \emph{Prediction data-sharing} (PDS): A party who aims to use the constructed model needs to share their data with the model holder for prediction.
%, which is the focus of this work.
%
At present, a few privacy enhancement techniques such as \emph{fully homomorphic encryption} (FHE)~\citep{gentry2009fully},
%,brakerski2014leveled, cheon2017homomorphic}, 
\emph{federated learning} (FL)~\citep{konevcny2016federated},
%li2020federated,yue2021federated},
and \emph{differential privacy} (DP)~\citep{dwork2006differential, abadi2016deep} have been exploited for avoiding privacy leakage in GPR.
%
However, none of them is general enough to achieve privacy-preserving GPR for all three data-sharing scenarios.
%Specifically, FHE-GPR \citep{fenner2020privacy} only considers the PDS scheme. The FL-GPR works \citep{dai2020federated,dai2021differentially,kontoudis2022fully,yue2021federated} focus on HDS scheme. 
Specifically, the FHE-GPR \citep{fenner2020privacy} and FL-GPR \citep{dai2020federated,dai2021differentially,kontoudis2022fully,yue2021federated} approaches only focus on PDS and HDS scenarios, respectively. 
The DP-GPR methods \citep{kharkovskii2020private,smith2018differentially} assume all the data belong to a single party and can only protect the privacy of either the input features \citep{kharkovskii2020private} or the outputs \citep{smith2018differentially}.
See Section~\ref{sec:related} for detailed discussions.
%Detailed discussions of these approaches are in Section~\ref{sec:related}.

\begin{figure*}[t]
	\centering
	\begin{tabular}{ccc}
		\hspace{-2mm}\includegraphics[width=0.3\textwidth]{fig/HDS (1).pdf} & \hspace{10mm}\includegraphics[width=0.25\textwidth]{fig/VDS (1).pdf} & \hspace{10mm}\includegraphics[width=0.18\textwidth]{fig/PDS (1).pdf}\\
		\hspace{-2mm}(a) Horizontal data-sharing & \hspace{10mm}(b) Vertical data-sharing & \hspace{7mm}(c) Prediction data-sharing\\
	\end{tabular}
	\caption{Diagrams of different data-sharing scenarios.}
	\label{fig:datashare}
\end{figure*}

% \begin{figure}[t]
% \centering 
% \subfigure[HDS]{
% \label{Fig.sub.1}
% \includegraphics[width=0.35\textwidth,valign=c]{HDS.pdf}}
% \subfigure[VDS]{
% \label{Fig.sub.2}
% \includegraphics[width=0.3\textwidth,valign=c]{VDS-2.pdf}}
% \subfigure[PDS]{
% \label{Fig.sub.3}
% \includegraphics[width=0.23\textwidth,valign=c]{PDS.pdf}}
% \caption{data-sharing Scenarios}
% \end{figure}\label{fig:datashare}

% Paragraph 2: which methods can be used to solve the above problems; introduce existing GP work and its issues
%At present, researchers try to use privacy enhancement techniques such as Differential-privacy(DP)\citep{DP} and Homomorphic Encryption(HE)\citep{HE} to solve the problem of privacy leakage in Gaussian processes. Smith et al. \cite{DP-GP} uses differential privacy technology to achieve the protection of Gaussian process data labels, but does not consider the privacy leakage problem in data features. In \cite{HE-GP}, Fenner et al. considers the scenario where the training data is held by a model constructor. The protection of predicted data features in GP is achieved by using fully homomorphic encryption algorithm\cite{BGV}. At the same time, the computational efficiency and performance of the algorithm are improved through the interactive calculation between the model user and the model constructor.

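% Paragraphs 2/3: propose our SMPC-based approach, explain why SMPC was chosen, discuss what problems arise when SMPC is applied directly to GP, and how we plan to solve them.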
To fully and securely exploit the value of different data sources in the aforementioned data-sharing scenarios for a GPR model,
%achieve the goal of designing a practical privacy-preserving GPR model,
this paper exploits \emph{secure multi-party computation} (SMPC) \citep{Yao1986MPC}, which can handle different data-sharing scenarios with a theoretical security guarantee.
%Secure Multi-Party Computation (SMPC)\citep{SMPC} provides the security assumption of multi-party co-governance: that is, in the process of collaborative computing, each party involved cannot obtain any information about other parties' private data and model parameters after training without the authorization of the other parties. This is especially suitable for scenarios where multiple parties hold private data separately and want to jointly mine data value. By translating operations in GP into the corresponding SMPC protocols, it allows multiple participants to jointly build powerful GP models under privacy-preserving conditions. In other words, by using SMPC protocols, the value of different data sources can be fully exploited in GP, and in the meanwhile, sensitive information leakage can be prevented. 
%
Among the various types of SMPC approaches \citep{evans2018mpcsurveypaper, Goldreich2019GMW,Yao1982protocolsGC}, \emph{secret sharing} (SS) \citep{shamir1979share} is exploited in this work due to its good communication efficiency and wide application in other ML models \citep{Mohassel2017SecureML, Wagh2019SecureNN}.
%
As the name implies, an SS-based ML approach converts all the original operations (e.g., addition, multiplication, and comparison) in an ML model (e.g., a neural network) into their privacy-preserving alternatives, which take secretly shared data as input and produce secretly shared results through secure communication among the parties (see Section~\ref{sec:SS} for details).

Although many SS-based operations have been developed in existing privacy-preserving ML works, they are not sufficient for constructing a privacy-preserving GPR model since the matrix inversion and exponentiation operations are essential for GPR (Section~\ref{sec:PP-GPR}) but have not been well adapted to SMPC. 
%
To be specific, some works \citep{knott2021crypten} designed SMPC protocols for these two operations based on approximation methods such as Newton-Raphson iteration and Taylor expansion, which significantly reduce their accuracy and/or efficiency.
%The work of \cite{xia2021STR} proposed an efficient SMPC protocol for matrix inversion and exponentiation on the field of real number. However, since the modern computing devices cannot represent the exact real number field\footnote{The floating-point numbers commonly used in computers are a \emph{finite} subset of real numbers.}, the security of this protocol cannot be guaranteed in practice (see Section~\ref{sec:ppoperators} for a detailed discussion).
%{\color{blue}They are all nonlinear operations and not friendly to SMPC. It is difficult to build a secure and efficient SMPC protocol for them directly based on the existing technology. Some existing works \cite{knott2021crypten} use approximate calculation methods, such as Newton-Raphson iterations, Taylor expansion, etc. This results in the inability of these protocols to have both performance and efficiency. In the literature \cite{xia2021STR}, the author proposes an efficient SMPC protocol for matrix inversion and exponential operations on the real number domain. However, computers cannot represent the field of rational numbers, so the protocol loses its practical application value.}
%which are not friendly to SMPC, and it is difficult to construct secure and efficient protocols for them directly based on the existing technology.
%which are not commonly used in existing secure ML models and thus, not well designed for achieving both secure and efficient secret shares.
To address this issue, this work proposes new SMPC protocols for positive-definite matrix inversion and exponentiation based on SS and integrates them with the existing SS-based operations to achieve an efficient and theoretically secure GPR model.
The specific contributions of this work include:

%To address this issue, this work proposes new SMPC protocols for positive definite matrix inversion and exponentiation based on SS so that they can be combined with existing SS-based operators for efficient and theoretically secure GPR models. .

%aim to convert the original operations in DNNs into privacy-preserving alternatives that take additively shared data as input and accordingly produce additively shared results with secure data communication involved among multiple participants.

%The main techniques for constructing the SMPC protocols include secret sharing\cite{SS}, GMW protocol\cite{GMW}, obfuscation circuit\cite{GC}. Secret sharing are often used to handle linear operations such as addition and multiplication. On the other hand, GMW and obfuscation circuits are often used to construct simple privacy-preserving nonlinear operations, such as comparisons. For nonlinear operations such as exponential, approximate calculation methods such as Taylor expansion are usually used in SMPC. This results in the inability of the exponential SMPC protocol to have both performance and efficiency.
%
%In the literature \cite{Securenlp}, the author proposes an efficient SMPC protocol for exponential operations on the rational number domain. However, computers cannot represent the field of rational numbers, so the protocol loses its practical application value. There is a large number of exponential operations in GP, so using current SMPC technology directly would result in unacceptable performance or efficiency losses.  We plan to use the secret sharing technology to construct a privacy-preserving exponential algorithm with both efficiency and accuracy, and combine the existing SMPC-based privacy-preserving operators to solve the problem of privacy data leakage in GP.

% Introduce the contributions of this paper
% \subsection{Contributions}
%Based on SMPC technology, this paper designs the first practical privacy-preserving Gaussian process, which can protect both training data and prediction data at the same time. We summarize our contributions as follows:

\begin{itemize}[leftmargin=*,itemsep=5pt,topsep=0pt,parsep=0pt,partopsep=0pt]

\item To the best of our knowledge, this is the first work that considers protecting the privacy of a GPR model via secret sharing, which can be used for various data-sharing scenarios (Section~\ref{sec:PP-GPR}).

%\item Based on the secret sharing technique, we propose efficient privacy-preserving matrix inversion and exponentiation operations. The privacy-preserving matrix inversion is developed by decomposing the original operation into addition, multiplication and other SMPC-friendly operations based on \emph{Cholesky decomposition}. The new efficient SS-based exponentiation is designed through the idea of ``confusion-correctio'' (Section~\ref{sec:ppoperators}). 

\item Based on the SS technique, we propose an efficient \emph{privacy-preserving exponentiation} algorithm through the idea of ``confusion-correction'', which is shown to be $10 \sim 70$ times faster than commonly-used approximation algorithms and can achieve theoretical correctness and security guarantees (Section~\ref{sec:ppoperators}).

\item We propose the first SS-based matrix inversion algorithm via Cholesky decomposition and show that its accuracy is comparable to that of the plaintext algorithm with acceptable communication cost (Section~\ref{sec:ppoperators}).

\item Empirical results on two real-world datasets show that the proposed SS-based GPR algorithm can achieve accurate prediction results within a reasonable time (Section~\ref{sec:experi}).
%which makes it an superior alternative to existing privacy-preserving GPR approaches . 

%. The privacy-preserving matrix inversion is developed by decomposing the original operation into addition, multiplication and other SMPC-friendly operations based on \emph{Cholesky decomposition}. The new efficient SS-based exponentiation is designed through the idea of ``confusion-correctio'' (Section~\ref{sec:ppoperators}).

% \item We propose the first privacy-preserving fixed-point exponentiation algorithm based entirely on secret sharing through ``confusion-correctio''. It enables accurate exponentiation with fixed-point input. Experimental results show that the proposed privacy-preserving exponentiation algorithm is $38$ and $70$ times faster than Taylor expansion and iterative approximation-based exponentiation algorithms, respectively while providing higher accuracy.

% \item We propose the first privacy-preserving matrix inversion algorithm based on secret sharing and Choseky decomposition, which can invert the positive definite matrix in the construction phase of the Gaussian process model. Experiments show that the accuracy of this algorithm is comparable to plaintext.

%\item We theoretically analyze the communication cost and the security of the proposed SS-based operations (Section~\ref{sec:ppoperators}).
 
%{\color{blue}{Based on the secret sharing technique, we propose efficient secure matrix inversion and secure exponentiation operations. First, we realized the secure matrix inversion operation by using \emph{cholesky decomposition} to convert the matrix inversion operation into addition, multiplication and other SMPC-friendly operations. Then we implement the secure exponential operation through special random number construction.}}

% \item We evaluate the performance of the two new SS-based operations and the complete privacy-preserving GPR model with two real-world datasets (Section~\ref{sec:experi}).

%\item Based on the secret sharing technology, this paper proposes an efficient privacy-preserving exponential function on the integer ring through a special random number construction.

% \item We apply the constructed privacy-preserving algorithm to a Gaussian process. By combining the current best privacy-preserving algorithms, such as matrix multiplication and matrix inversion, a complete privacy-preserving Gaussian process is realized. We demonstrate under a semi-honest model that the algorithm simultaneously preserves training data features, training data labels, and predicted data features.

%We tested the proposed algorithm on multiple real datasets. The practicality of the proposed algorithm is demonstrated by the analysis of its efficiency and performance.

% \item We tested the proposed algorithm on multiple real datasets. The experimental results show that the model construction and prediction can be completed within two minutes using the privacy-preserving Gaussian process algorithm proposed in this paper, and achieves an accuracy comparable to that of plaintext.
\end{itemize}


% Introduce related work on MPC-based PPML.
\section{Background and notations}
\label{sec:gen_inst}
%\noindent{\textbf{Basic notation.}}
%For the full text, we use $\mathcal{Z}_N$ to denote the ring of integers model $N$.% and $\mathcal{Q}_{<2^h, l_f>}$ to denote the set of fixed points with range $[-2^{h-1}, 2^{h-1}]$ and precision $l_f$. 
%$e$ denotes the Euler's constant.
%We use bold uppercase letters to denote matrices and bold lowercase letters to denote vectors. For the sake of simplicity, in some places we abuse the notion of function and use $f(\textbf{U})$ to indicate that the function $f$ acts on the matrix $\textbf{U}$.

\subsection{Gaussian process regression (GPR)}\label{sec:GPR}

Let $\mathcal{X}$ denote a $d$-dimensional input domain. For each $\mathbf{x} \in \mathcal{X}$, we assume its corresponding output $y(\mathbf{x} ) \sim \mathcal{N}(f(\mathbf{x}), \sigma^2_n)$ is a noisy observation of a function $f(\mathbf{x})$ with noise variance $\sigma^2_n$. Then, the function $f(\mathbf{x})$ can be modeled using a \emph{Gaussian process} (GP), that is, every finite subset of $\{f(\mathbf{x})\}_{\mathbf{x} \in \mathcal{X}}$ follows a multivariate Gaussian distribution.
%
Such a GP is fully specified by its \emph{prior} mean $\mu(\mathbf{x}) \triangleq \mathbb{E}[f(\mathbf{x})]$ and covariance $k(\mathbf{x}, \mathbf{x}^\prime) \triangleq \text{cov}[f(\mathbf{x}), f(\mathbf{x}^\prime)]$ for all $\mathbf{x}, \mathbf{x}^\prime \in \mathcal{X}$.
Usually, we assume that $\mu(\mathbf{x}) = 0$ and the covariance is defined by a kernel function. A widely used example is the \emph{squared exponential} (SE) kernel:
%$k(\mathbf{x}, \mathbf{x}') \triangleq \sigma^2_s\text{exp}(-0.5(\mathbf{x} - \mathbf{x}')^\top \Delta^{-1} (\mathbf{x} - \mathbf{x}'))$
\begin{equation}\label{kernel}
%k(\mathbf{x}, \mathbf{x}') \triangleq \sigma^2_s\text{exp}(-\frac{||\mathbf{x} - \mathbf{x}'||^2_2}{2\ell^2})
k(\mathbf{x}, \mathbf{x}^\prime) \triangleq \sigma^2_s \exp\left(-\frac{d(\mathbf{x}, \mathbf{x}^\prime)}{2\ell^2}\right)
\end{equation}
where $d(\mathbf{x}, \mathbf{x}^\prime) = \lVert\mathbf{x} - \mathbf{x}^\prime\rVert^2_2$ is the squared Euclidean distance, $\ell$ is the length-scale, and $\sigma_s^2$ is the signal variance.
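
As a quick sanity check of \eqref{kernel}, consider the following worked instance; the hyperparameter values and inputs are illustrative assumptions of ours rather than settings used in this paper. With $\sigma^2_s = 1$, $\ell = 1$, $\mathbf{x} = (0, 0)^\top$, and $\mathbf{x}^\prime = (1, 1)^\top$,
\begin{equation*}
d(\mathbf{x}, \mathbf{x}^\prime) = (0-1)^2 + (0-1)^2 = 2, \qquad
k(\mathbf{x}, \mathbf{x}^\prime) = \exp\left(-\frac{2}{2 \cdot 1^2}\right) = e^{-1} \approx 0.368 .
\end{equation*}
Two properties follow directly from the definition: $k(\mathbf{x}, \mathbf{x}) = \sigma^2_s$ because $d(\mathbf{x}, \mathbf{x}) = 0$, and the exponent is never positive, so $0 < k(\mathbf{x}, \mathbf{x}^\prime) \le \sigma^2_s$.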
%

Given a set $\mathcal{D}$ of $n$ observations, $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^n$ where $y_i \triangleq y(\mathbf{x}_i)$, a GPR model can
%Supposing a column vector $\mathbf{y}_\mathcal{D} \triangleq (y(\mathbf{x}))^\top_{\mathbf{x} \in \mathcal{D}}$ of noisy outputs is observed by evaluating function $f$ at a set $\mathcal{D} \subset \mathcal{X}$ of observed inputs, a GP model can
perform probabilistic regression by providing a predictive distribution $p(f(\mathbf{x}_*)|\mathcal{D}) \triangleq \mathcal{N}(\mu_{\mathbf{x}_*|\mathcal{D}}, \sigma^2_{\mathbf{x}_*|\mathcal{D}})$ for any test input $\mathbf{x}_* \in \mathcal{X}$. Let $\mathbf{X} \triangleq (\mathbf{x}_1, \ldots, \mathbf{x}_n )^\top$ be an $n \times d$ input matrix and $\mathbf{y} = (y_1, \ldots, y_n)^\top$ be a column vector of the $n$ noisy outputs.
Then, the \emph{posterior} mean and variance of the predictive distribution $p(f(\mathbf{x}_*)|\mathcal{D})$ can be computed analytically:
% \begin{equation}\label{GPpred}
% %\begin{array}{rl}
% \mu_{\mathbf{x}_*|\mathcal{D}} \triangleq K_{\mathbf{x}_* \mathbf{X}} (K_{\mathbf{X}\mathbf{X}} + \sigma^2_n I)^{-1}\mathbf{y},
% \end{equation}
%
% \begin{equation}
% \sigma^2_{\mathbf{x}_*|\mathcal{D}} \triangleq k(\mathbf{x}_*, \mathbf{x}_*) - K_{\mathbf{x}_*\mathbf{X}} (K_{\mathbf{XX}} + \sigma^2_n I)^{-1} K^\top_{\mathbf{x}_*\mathbf{X}}.
% %\end{array}
% \end{equation}
%
\begin{equation}\label{GPpred}
\begin{array}{c}
\mu_{\mathbf{x}_*|\mathcal{D}} \triangleq \mathbf{k}^\top_* (\mathbf{K} + \sigma^2_n \mathbf{I})^{-1}\mathbf{y}\ , 
\vspace{2mm}\\
\sigma^2_{\mathbf{x}_*|\mathcal{D}} \triangleq k(\mathbf{x}_*, \mathbf{x}_*) - \mathbf{k}^\top_* (\mathbf{K} + \sigma^2_n \mathbf{I})^{-1} \mathbf{k}_* ,
\end{array}
\end{equation}
%
where $\mathbf{k}_* \triangleq k(\mathbf{x}_*, \mathbf{X}) =  (k(\mathbf{x}_*, \mathbf{x}_i))^n_{i = 1}$ is an $n$-dimensional column vector, $\mathbf{K} \triangleq k(\mathbf{X}, \mathbf{X}) = (k(\mathbf{x}_i, \mathbf{x}_j))_{i, j = 1, \ldots, n}$ is an $n \times n$ Gram matrix, and $\mathbf{I}$ is an identity matrix of size $n$.
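
To make \eqref{GPpred} concrete, consider the simplest case of a single observation ($n = 1$) under the SE kernel \eqref{kernel}; this worked instance is our own illustration and is not taken from the experiments. Here $\mathbf{K} = k(\mathbf{x}_1, \mathbf{x}_1) = \sigma^2_s$ and $\mathbf{k}_* = k(\mathbf{x}_*, \mathbf{x}_1)$ are scalars, so
\begin{equation*}
\mu_{\mathbf{x}_*|\mathcal{D}} = \frac{k(\mathbf{x}_*, \mathbf{x}_1)}{\sigma^2_s + \sigma^2_n}\, y_1 , \qquad
\sigma^2_{\mathbf{x}_*|\mathcal{D}} = \sigma^2_s - \frac{k(\mathbf{x}_*, \mathbf{x}_1)^2}{\sigma^2_s + \sigma^2_n} .
\end{equation*}
At $\mathbf{x}_* = \mathbf{x}_1$ we have $k(\mathbf{x}_*, \mathbf{x}_1) = \sigma^2_s$, which gives $\mu_{\mathbf{x}_*|\mathcal{D}} = \sigma^2_s y_1 / (\sigma^2_s + \sigma^2_n)$ and $\sigma^2_{\mathbf{x}_*|\mathcal{D}} = \sigma^2_s \sigma^2_n / (\sigma^2_s + \sigma^2_n)$; far from $\mathbf{x}_1$, $k(\mathbf{x}_*, \mathbf{x}_1) \to 0$ and the prediction reverts to the prior mean $0$ with variance $\sigma^2_s$.
For general $n$, the matrix $\mathbf{K} + \sigma^2_n \mathbf{I}$ is symmetric positive definite whenever $\sigma^2_n > 0$, so in plaintext its inverse can be applied through a Cholesky factorization $\mathbf{K} + \sigma^2_n \mathbf{I} = \mathbf{L}\mathbf{L}^\top$ followed by two triangular solves, which is one standard route among several.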
%where $K_{\mathbf{x}_* \mathbf{X}} \triangleq (k(\mathbf{x}_*, \mathbf{x}_i))^n_{i = 1}$ is a $1 \times n$ matrix, $K_{\mathbf{X}\mathbf{X}} \triangleq (k(\mathbf{x}_i, \mathbf{x}_j))_{i, j = 1, \ldots, n}$ is an $n \times n$ matrix, and $I$ is an identity matrix of size $n$.
%Due to the inversion of $\Sigma_{\mathcal{D} \mathcal{D}} + \sigma^2_n I$, computing above predictive distribution incurs $\mathcal{O}(|\mathcal{D}|^3)$ time and therefore, scales poorly in the size $|\mathcal{D}|$ of observed data \citep{Rasmussen2006}.

% Although the kernel function gives a covariance for all pairs of points in the function’s domain, when building
% a GP model to make predictions on new feature vectors, we only need to calculate the covariances between
% each pair of feature vectors in the training data and new data. Given training data $x = [x1, x2, \dots , xn]$ and $y = [y1, y2, \dots , yn] = [f(x1), f(x2), \dots , f(xn)]$, and new data $x^{*} = [x^{*}_1, x^{*}_2, \dots,x^{*}_m]$ for which we wish to predict  $y^{*} =[f(x^{*}_1), f(x^{*}_2), \dots , f(x^{*}_m)]$. we calculate:
% \begin{itemize}
% \item the $n \times n$ covariance matrix $K$ defined by $k_{i,j} = k(x_i, x_j )$,
% \item the $m \times n$ matrix $K_*$ defined by $K^{*}_{i,j}= K(x^{*}_i, x_j)$,
% \item the $m \times m$ matrix $K_{**}$ defined by $k^{**}_{i,j} = k(x^{*}_i, x^{*}_j)$.
% \end{itemize}

% Then our assumed prior distribution gives us

% \begin{equation}
% 	\begin{bmatrix} y \\ y^* \end{bmatrix} \sim \mathcal{N}\begin{pmatrix} 0, &\begin{bmatrix} K & K_*^T \\ K_* & K_{**} \end{bmatrix} \end{pmatrix}
% \end{equation}

% From this we get that the elements of $y^{*}$ have Gaussian distributions with means given by $\bar{y}^{*} = K_{*}K^{-1}y$  and variances given by $var(y^{*}) = diag(K_{**})- diag(K_{*}K^{-1}K_{*}^T)$.

\subsection{Secure multi-party computation}

\emph{Secure multi-party computation} (SMPC) is a cryptographic technique that allows multiple parties to jointly compute an operation $f$ without revealing any party's private data to the others during the computation.
%In this section, we will introduce a standard SMPC model -- semi-honest security model and the techniques that are used to construct SMPC protocols for this security model.
%
In this work, we adopt the \textit{semi-honest security} (also known as \emph{honest-but-curious}) model, which is one of the standard security models in SMPC and has been widely used in existing privacy-preserving machine learning algorithms \citep{liu2017oblivious,Mohassel2018ABY3,Mohassel2017SecureML,ryffel2020ariann,Wagh2019SecureNN}.
% What is the semi-honest model?
In a \emph{semi-honest} security model, the parties are assumed to follow the SMPC protocol but may try to use the shared data and intermediate results they obtain to infer information that is not meant to be revealed to them during the execution of the protocol.
%
Next, we first present the \emph{secret sharing} technique, which is used to construct SMPC protocols under the \emph{semi-honest security} model, and then introduce the algebraic structure used for designing the SMPC protocols in this work.

% \subsubsection{Semi-honest security model}
% \label{subsec:semi-honest}
% % The semi-honest model is the standard security model of MPC.
% The \textit{semi-honest security} (also known as \emph{honest-but-curious}) model is one of the standard security models in SMPC and has been widely used in existing privacy-preserving machine learning algorithms \citep{liu2017oblivious,Mohassel2018ABY3,Mohassel2017SecureML,ryffel2020ariann,Wagh2019SecureNN}.
% % What is the semi-honest model?
% In a \emph{semi-honest} model, the parties follow the SMPC protocol but can try to use the obtained data-sharing and intermediate results to infer the information that is not exposed to them during the execution of the protocol.  

% How to prove security under the semi-honest model.
% The complete security, privacy, and correctness of an SMPC protocol can be proved by using the \textit{simulation-based} method \citep{Lindell2017simulate}. 
% Specifically, let $\pi_f$ be an SMPC protocol of the operation $f$, $Sim$ be a \emph{probabilistic polynomial time} simulator which can simulate the view of each party through the input, output, and some public parameters during the execution of $\pi_f$.
% %, so that the semi-honest adversary $\textcal{A}$ cannot distinguish the simulated view from the real view..
% If we can construct a $Sim$ such that the semi-honest party cannot distinguish the simulated view from its real view,
% %such that during the execution of $\pi$, $Sim$
% %which can simulate the view of the party through the input, output, and some public parameters during the execution of $\pi$, so that the semi-honest adversary $\textcal{A}$ cannot distinguish the simulated view from the real view.
% we can claim that the SMPC protocol is secure under the semi-honest model.
% A formal definition of \emph{security} for an SMPC protocol is in Appendix~\ref{app:proof}.
% %
% The general composability framework \citep{canetti2001universally} guarantees that the security of a complete ML algorithm can be obtained by ensuring the security of each operation in the algorithm and combining them. 
%The complete security, privacy and correctness of our proposed algorithm is proved by using the \textit{simulation-based} method \citep{Lindell2017simulate}. The security of a SMPC protocol relies on the \emph{indistinguishability} of shares, which can be informally described as the shares received by each server are indistinguishable from a random string.

% \subsubsection{Fix-point encoding}
% \label{subsec:fix-point}
% For security reasons, SMPC protocols usually need to be constructed in a ring or field of integers. Since floating-point types are generally used in the ML algorithms, 
% %Floating point types are generally used in machine learning. However, SMPC algorithms usually need to be constructed in a ring or field of integers, for security reasons. Therefore,
% a fix-point encoding method for converting floating-point numbers to fixed point numbers is required in SMPC-based ML algorithms.
% The fix-point encoding method in \cite{knott2021crypten} encodes all data as $l$ ($l = 32$ or $64$) bits and performs calculations on the integer ring $Z_{L}$ with $L = 2^l$. For any floating-point number $u$, they multiply $u$ by a scaling factor $B$ to get $u_B$, and round the result to the nearest integer. When $B = 2^k$, this method can preserve $k$ digits of precision. The decoding process can be achieved by calculating $u_B/B$. 

\subsubsection{Secret Sharing}\label{sec:SS}
%\subsection{Notation}

\emph{Secret sharing} (SS) is a technique independently proposed by \citet{shamir1979share} and \citet{blakley1979safeguarding}. In its general form, a $(t, m)$-threshold secret sharing scheme splits a secret among $m$ parties, where $t$ is a threshold value.
The security of SS requires that any set of fewer than $t$ parties cannot jointly obtain any information about the secret.
%Secret sharing is independently proposed by Shamir\cite{shamir1979share} and Blackly\cite{blakley1979safeguarding}, and their schemes are called $(t, n)$-threshold secret sharing schemes. Among them, $n$ represents the number of participants and $t$ represents the threshold value, and
%its security requires that any less than $t$ participants cannot obtain any secret information jointly.
As a special case of secret sharing, $(2,2)$-\emph{additive} secret sharing contains two algorithms: $Shr(\cdot)$ and $Rec(\cdot, \cdot)$.
%
Let $\mathcal{Z}_L$ denote the ring of integers modulo $L$ and $[\![u]\!]=([u]_0, [u]_1)$ be the additive share of any integer $u$ on $\mathcal{Z}_L$.
% The additive share of $x$ on $Z_L$ is denoted by $([x]_0, [x]_1)$.
%It can be generated by employ the algorithm
The algorithm $Shr(u) \rightarrow ([u]_0, [u]_1)$ generates the shares by sampling a number $r$ uniformly at random from $\mathcal{Z}_L$, letting $[u]_0=r$, and computing $[u]_1=(u - r)\mod L$.
Note that due to the randomness of $r$, neither $[u]_0$ nor $[u]_1$ alone can be used to infer the original value of $u$. The algorithm $Rec([u]_0, [u]_1)\rightarrow u$ reconstructs the original value from the additive shares by simply calculating $([u]_0 + [u]_1) \mod L$.
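
As a small worked illustration (the numbers are chosen here for exposition only), let $L = 2^5 = 32$ and $u = 9$, and suppose that $Shr$ draws $r = 20$. Then
\begin{align*}
[u]_0 &= 20, \qquad [u]_1 = (9 - 20) \mod 32 = 21,\\
Rec([u]_0, [u]_1) &= (20 + 21) \mod 32 = 9 .
\end{align*}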

The additive secret sharing technique has been widely used to construct SMPC protocols for ML operations (e.g., addition, multiplication, etc.) such that both the inputs and outputs of the protocol can be \emph{additive} shares of the original inputs and outputs:
$
\pi_f([inputs]_0, [inputs]_1) \rightarrow [f]_0, [f]_1
$
where $\pi_f$ denotes an SMPC protocol of the operation $f$.
%
To further elaborate the SS technique, we briefly introduce the SS-based multiplication protocol below, which is essential in many privacy-preserving ML algorithms and will also be widely used in this work.

\textbf{SS-based multiplication $u\cdot v$.} Let $P_j$ with $j \in \{0,1\}$ be the two parties that execute the SMPC protocol. Each party $P_j$ is given one additive share $([u]_j, [v]_j)$ of the operation inputs $u, v \in \mathcal{Z}_L$. Then, the additive shares of $u\cdot v$ can be computed with a Beaver triple \citep{Beaver1991triples} $(a,b,c)$, where $a$ and $b$ are randomly sampled from $\mathcal{Z}_{L}$, $c = a \cdot b \mod L$, and each party $P_j$ holds the shares $([a]_j, [b]_j, [c]_j)$. Specifically, for each $j \in \{0, 1\}$, $P_j$ first calculates $[d]_j = [u]_j-[a]_j$ and $[e]_j = [v]_j - [b]_j$. Then, the parties send $[d]_j$ and $[e]_j$ to each other and reconstruct $d= Rec([d]_0, [d]_1)$ and $e =  Rec([e]_0, [e]_1)$. Finally, the additive share of $u \cdot v$ is computed as $[u\cdot v]_j = -jd \cdot e +[u]_j \cdot e + d \cdot [v]_j + [c]_j$.
%We use $\mathcal{F}_{Mul}$ to denote the privacy-preserving multiplication functionality.
The SS-based multiplication therefore costs $1$ round of two-way communication, in which each party transmits two ring elements.
%(i.e., $[d]_j$ and $[e]_j$).
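
The correctness of this protocol can be checked by summing the two output shares. The short derivation below is added for illustration and only uses $d = u - a$, $e = v - b$, and $c = a \cdot b$, with all arithmetic taken modulo $L$:
\begin{align*}
[u\cdot v]_0 + [u\cdot v]_1 &= -d e + u e + d v + c \\
&= -(u-a)(v-b) + u(v-b) \\
&\quad + (u-a)v + ab \\
&= u v .
\end{align*}
For instance, take the hypothetical values $L = 32$, $u = 5$ with shares $(6, 31)$, $v = 3$ with shares $(9, 26)$, and the triple $a = 7$, $b = 4$, $c = 28$ with shares $(2, 5)$, $(1, 3)$, and $(10, 18)$, respectively. The parties reconstruct $d = 30$ and $e = 31$ and obtain
\begin{align*}
[u\cdot v]_0 &= (6\cdot 31 + 30\cdot 9 + 10) \mod 32 = 18,\\
[u\cdot v]_1 &= (-30\cdot 31 + 31\cdot 31 \\
&\quad + 30\cdot 26 + 18) \mod 32 = 29,
\end{align*}
so that $Rec(18, 29) = 47 \mod 32 = 15 = u \cdot v$. This example multiplies plain ring elements and does not include any fixed-point rescaling.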

The SS-based multiplication protocol is extended to matrix multiplication in the work of \citet{Mohassel2017SecureML}.
%The privacy-preserving matrix multiplication algorithm is extended in the literature \citep{Mohassel2017SecureML} based on the privacy-preserving multiplication.
Let $\mathcal{F}_{matMul}$ denote the SS-based matrix multiplication functionality, $\mathbf{U}$ and $\mathbf{V}$ be two matrices of size $m\times n$ and $n\times k$, respectively.
The SS-based matrix multiplication $\mathcal{F}_{matMul}(\mathbf{U}, \mathbf{V})$ still requires only $1$ round of bidirectional communication between parties $P_0$ and $P_1$, but each party now transmits $(m+k)\times n$ ring elements.
%To implement the privacy-preserving matrix multiplication about them, the online phase requires only $1$ rounds of bidirectional communication between the two parties and the transmission of $(m+n)k$ ring elements, respectively. Correspondingly, in the preprocessing phase, a matrix multiplication triple $\mathbf{A},\mathbf{B},\mathbf{C}$ is required, where $\mathbf{A},\mathbf{B}$ have the same dimensions as the matrices $\mathbf{U},\mathbf{V}$ and $\mathbf{C} = \mathbf{A}\mathbf{B}\mod N$.
%We will use $\mathcal{F}_{matMul}$ to denote the SS-based matrix multiplication functionality in this work.
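
The matrix version mirrors the scalar protocol. As a sketch, assume a matrix triple $(\mathbf{A}, \mathbf{B}, \mathbf{C})$ in which $\mathbf{A}$ and $\mathbf{B}$ have the same sizes as $\mathbf{U}$ and $\mathbf{V}$ and $\mathbf{C} = \mathbf{A}\mathbf{B} \mod L$. The parties open $\mathbf{D} = \mathbf{U} - \mathbf{A}$ and $\mathbf{E} = \mathbf{V} - \mathbf{B}$ and compute
\[
[\mathbf{U}\mathbf{V}]_j = -j\,\mathbf{D}\mathbf{E} + [\mathbf{U}]_j\mathbf{E} + \mathbf{D}[\mathbf{V}]_j + [\mathbf{C}]_j ,
\]
so the only messages exchanged are the shares of $\mathbf{D}$ (size $m\times n$) and $\mathbf{E}$ (size $n\times k$), all sent in a single round.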

Unfortunately, many operations (e.g., exponentiation and matrix inversion) cannot be constructed using purely additive secret sharing on $\mathcal{Z}_L$. Approximation methods such as the Newton-Raphson iteration and Taylor expansion have been exploited to design additive SS-based protocols for these operations. Details of these approximation methods and other SMPC protocols can be found in the work of \citet{knott2021crypten}.

% Let's take the addition operation $u+v$ as an example:
% Let ${P_j}$ with $j \in \{0,1\}$ be two parties that are used to execute the SMPC protocol. Each party $P_j$ will be given one additive share $([u]_j, [v]_j)$ of the operation inputs for $j \in \{0,1\}$.   
% %For an SMPC protocol with two parties ${P_j}_{j \in \{0,1\}}$. For $j \in \{0,1\}$, it is assumed that the $P_j$ has an additive share $[x]_j, [y]_j$.
% Then, each party can calculate one additive share of the operation output $[u+v]_j = [u]_j + [v]_j$ locally.

%For multiplication, A common practice for multiplication is to use Beaver-triples\cite{Beaver1991triples}. For all $j \in \{0, 1\}$,  $P_j$ take the additive share $[x]_j, [y]_j$ of $x, y$ as inputs, respectively. Then they interactive computing the additive shares of $x\cdot y $ privately with Beaver-triples $(a,b,c)$, for $a, b \in Z_{L}, c = a \cdot b$. Specifically, for $j \in \{0, 1\}$,  $P_j$ calculates $[d]_j = [x]_j-[a]_j$, $[e]_j = [y]_j - [b]_j$, and send them to each other. Then the $P_j$ reconstructs $d= Rec([d]_0, [d]_1) = [d]_0+[d]_1, e =  Rec([e]_0, [e]_1) = [e]_0+[e]_1$ and gets $[x\cdot y]_j = -jd \cdot e +[x]_j \cdot e + d \cdot [y]_j + [c]_j$.

\subsubsection{Fixed-Point Representation}

As shown above, SS-based SMPC protocols are constructed over a ring of integers for security reasons.
%
In practice, ML algorithms such as GPR are usually implemented using floating-point numbers.
However, it has been shown that SMPC protocols designed for floating-point numbers are inefficient~\citep{aliasgari2012secure} and that fixed-point representation is a better choice.
%using fixed-point numbers is a more efficient way.
%
%
%Similar to~\citet{Mohassel2017SecureML}, all the algorithms in this paper are performed on the ring of integers $\mathcal{Z}_L$, where $L = 2^l$, obtained from the fixed-point mapping.

Specifically, the fixed-point encoding method represents every value using $l$ bits. Let $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$ be the set of fixed-point numbers with a precision of $l_f$ (i.e., $l_f$ fractional bits) mapped from $\mathcal{Z}_L$ with $L = 2^l$.
For floating-point numbers in the range\footnote{We assume that all the numbers appearing in an ML algorithm are in this range. Appropriate $l$ and $l_f$ need to be selected to avoid underflow and overflow issues.} $[-2^{l-l_f-1}, 2^{l-l_f-1})$, this work first rounds them to the nearest fixed-point numbers in $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$ and then maps them to integers in $\mathcal{Z}_L$ by multiplying the rounded fixed-point numbers by $2^{l_f}$ modulo $L$.
%
For example, given $l = 5$ and $l_f = 3$, a floating-point number $1.125123$ in $[-2,2)$ is first rounded to the fixed-point number $1.125$ in $\mathcal{Q}_{<\mathcal{Z}_{2^5},3>}$ and then converted to an element of $\mathcal{Z}_{2^5}$ by $(1.125 \times 2^3) \mod 2^5= 9$. Conversely, an integer $11 \in \mathcal{Z}_{2^5}$ is converted back to $\mathcal{Q}_{<\mathcal{Z}_{2^5},3>}$ as $11/2^3 = 1.375$.
%
%for any integer $x$ in $Z_L$, the corresponding number of fixed points in $\mathcal{Q}_{<\mathcal{Z}_{L},l_f>}$ is denoted by $\check{x}$. For example, $x = 11 \in Z_{2^5}$ can be converted in  $\mathcal{Q}_{<\mathcal{Z}_{2^5},3>}$ by $\check{x} = 11/2^3 = 1.375 $.
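
Negative numbers are handled by the same modular mapping. As an illustration under the common two's-complement convention (an assumption made here for exposition, in which ring elements of at least $2^{l-1}$ are read as negative), with $l = 5$ and $l_f = 3$ the number $-0.75$ is encoded and decoded as
\begin{align*}
(-0.75 \times 2^3) \mod 2^5 &= -6 \mod 32 = 26,\\
(26 - 2^5)/2^3 &= -6/8 = -0.75 .
\end{align*}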

All the algorithms in this paper are performed on $\mathcal{Z}_L$ and $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$.
By choosing appropriate $l$ and $l_f$, the fixed-point-based SMPC protocols can achieve a desirable compromise between efficiency and accuracy.
%
To ease notation, we will use lowercase letters to represent either floating-point or integer numbers and $\check{x}$ to represent the fixed-point number in $\mathcal{Q}_{<\mathcal{Z}_{L},l_f>}$ converted from $x$. The algorithm $Shr(x)$ in Section~\ref{sec:SS} first converts $x$ to its corresponding representation in $\mathcal{Z}_L$ if the input $x$ is a floating-point number.

% To better understand the operation using $l_f$-scaled floats, consider the following addition example. Let $\check{a} = 0.281269$, $\check{b} =0.406256$, the original addition $\check{a} + \check{b} = 0.687524$. Let $l_f = 5, L = 2^l = 2^7 = 128$. 
% The addition using $l_f$-scaled is as follows.
%   $$ \left\{
%   \begin{array}{rcl}
%   a = 0.\underbrace{28125}_{l_f=5}\in Q_{<Z_{32},5>}, \hat{a} = \lfloor a\cdot 2^{l_f}\rfloor=9\\
%   b = 0.\underbrace{40625}_{l_f=5}\in Q_{<Z_{32},5>}, \hat{b} = \lfloor b\cdot 2^{l_f}\rfloor=13\\
%   \end{array} \right. $$ 
  
% $\hat{c} = (\hat{a} + \hat{b})\mod 128 =22, c = \hat{c}/2^{l_f} = 22/ 2^5 = 0.6875$.

%Let $Q_{<\mathcal{Z}_L,l_f>}$ denotes the set of these fixed-point numbers. Then, by multiplying with $2^{l_f}$, the fixed point numbers in $Q_{<\mathcal{Z}_L,l_f>}$ can be converted to integers in $Z_L$. For example, for a floating-point number $1.125123$ in $[-2,2]$, it is first converted to a fixed-point number $1.25$ in $\mathcal{Q}_{<\mathcal{Z}_{2^5},3>}$. It is then converted to an element in $Z_{2^5}$ by $(1.125 * 2^3) \mod 2^5= 9$. Conversely, for any integer $x$ in $Z_L$, the corresponding number of fixed points in $\mathcal{Q}_{<\mathcal{Z}_{L},l_f>}$ is denoted by $\check{x}$. For example, $x = 11 \in Z_{2^5}$ can be converted in  $\mathcal{Q}_{<\mathcal{Z}_{2^5},3>}$ by $\check{x} = 11/2^3 = 1.375 $.


% \noindent\textbf{Fix-point ring.} The ring of fixed points on $Z_L$ with precision $l_f$ is denoted as $Q_{<Z_L,l_f>}$, where $Z_L$ is the ring of integers modulo $L$ and $l_f, l_f < L$ is the length of fractional digits. Specifically, $Q_{<Z_L,l_f>} = [-2^{l-1}, 2^{l-1})$ is the set of integers obtained by mapping all fixed points in $[-2^{l-l_f-1}, 2^{l-l_f-1})$. For any fixed point $x \in [-2^{l-l_f-1}, 2^{l-l_f-1})$, it is denoted as $\hat{x}$ in $Q_{<Z_L,l_f>}$. Conversely, for any integer $y \in Z_L$, the number of fixed points corresponding to it is denoted by $\check{y}$. For example the fixed number $x = 1.125$ is represented in $Q_{<Z_{2^5},3>}$ as $\hat{x} = (1.125 * 2^3) \mod 2^5= 9$ and $y = -1.125$ is represented as $\hat{y} = (-1.125 * 2^3)\mod 2^5= -9 $.
% % The number of fixed points corresponding to $x = 11 $ is $\check{x} = 11/2^3 = 1.375 $.

\section{Privacy-Preserving GPR}\label{sec:PP-GPR}

\begin{figure}
	\centering
	\includegraphics[scale=0.43]{fig/process-1 (2).pdf}
	% \fbox{\rule[-.5cm]{0cm}{4cm} %\rule[-.5cm]{4cm}{0cm}}
	\caption{The overall framework of PP-GPR.}
	\label{fig:gpr}
\end{figure}

In this section, we propose to exploit the SMPC technique for constructing a \emph{privacy-preserving GPR} (PP-GPR) algorithm.
%
%we will introduce the architecture of the privacy-preserving GPR model and the SS-based operations required in the proposed method.
%\subsection{Privacy-preserving GPR architecture.}
%
The overall framework of the PP-GPR model is shown in Fig.~\ref{fig:gpr}.
PP-GPR adopts a three-party SMPC architecture with two computing servers, denoted by $S_0$ and $S_1$, and one assistant server, denoted by $T$.
%
Each computing server takes one additive share of the data as input, performs the calculations according to the steps of the PP-GPR algorithm, and then outputs the additive share of the GPR predictive results. The assistant server is responsible for generating random numbers required during the execution of the SS-based protocols in the PP-GPR algorithm. 
%
%
%\subsection{Protocol parameters \& input}
%
The proposed algorithm exploits the SS-based operations for achieving privacy-preserving GPR on all three data-sharing scenarios shown in Fig.~\ref{fig:datashare}. The complete steps are illustrated in Algorithm~\ref{alg-PPGPR}.

\begin{algorithm}[t] \footnotesize
	\caption{Privacy-preserving GPR}
	\label{alg-PPGPR}
	\textbf{Setup:} The servers determine $\mathcal{Z}_L$ and $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$. The data owners and users convert 
	their private observations $\mathcal{D} = (\mathbf{X}, \mathbf{y})$ and test inputs $\mathbf{x}_*$ into $([\! [\mathbf{x}_*]\!], [\![\mathbf{X}]\!], [\![\mathbf{y}]\!])$ via $Shr(\cdot)$. \\
	%
	%the parameters $(\ell, \sigma_s^2, \sigma_n^2)$ of the Gaussian process regression. \\
	\textbf{Input:} For $j \in \{0, 1\}$, $S_j$ holds the shares $([\mathbf{x}_*]_j, [\mathbf{X}]_j, [\mathbf{y}]_j)$, and the hyperparameters $(\ell, \sigma_s^2, \sigma_n^2)$.
    \begin{algorithmic}[1]
     %\Statex \textbf{Setup.} The servers determine an integer ring $\mathcal{Z}_L$, a fixed-point number set $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$, and the parameters $(\ell, \sigma_s^2, \sigma_n^2)$ of the Gaussian process regression. %$\ell$ is the length-scale and $\sigma_s^2$ is the signal variance. 
    %\Statex \textbf{Input.} $S_j$ holds the shares $([\mathbf{x}_*]_0, [\mathbf{X}]_j, [\mathbf{y}]_j)$, where $\mathbf{x}_*$ is test input and $(\mathbf{X}, \mathbf{y})$ are private observations.
    \State // \textbf{Model construction stage.}
   % \State \quad $[\![d(\mathbf{X},\mathbf{X})]\!] \leftarrow \mathcal{F}_{matMul}([\![\mathbf{X}]\!],[\![\mathbf{X}]\!]) $
    \State \qquad $[\![d(\mathbf{X},\mathbf{X})]\!] \leftarrow \mathcal{F}_{dist}([\![\mathbf{X}]\!],[\![\mathbf{X}]\!])$
    %+ \mathcal{F}_{Mul}([\![\mathbf{X}]\!],[\![\mathbf{X}^\top]\!]) -2\mathcal{F}_{matMul}([\![\mathbf{X}]\!],[\![\mathbf{X}^\top]\!]) $.
  \State \qquad $[\![\mathbf{K}]\!] \leftarrow \sigma_s^2\cdot \mathcal{F}_{PPExp}([\![-d{(\mathbf{X},\mathbf{X})}/2\ell^2]\!])$
    \State \qquad $[\![Inv]\!] \leftarrow \mathcal{F}_{MatInv}([\![(\mathbf{K} + \sigma^2_n \mathbf{I})]\!])$
    %
    \State // \textbf{Prediction stage.}
   %\State // \ Perform the calculations of $[\![d{(\mathbf{x}_*, \mathbf{X})}]\!]$ and $[\![d{(\mathbf{x}_*,\mathbf{x}_*)}]\!]$.
   \State \qquad $[\![d{(\mathbf{x}_*, \mathbf{X})}]\!] \leftarrow \mathcal{F}_{dist}([\![\mathbf{x}_*]\!],[\![\mathbf{X}]\!])$
   \State \qquad $[\![d{(\mathbf{x}_*,\mathbf{x}_*)}]\!] \leftarrow \mathcal{F}_{dist}([\![\mathbf{x}_*]\!],[\![\mathbf{x}_*]\!])$
   
  \State \quad // \ Compute the kernel matrices.
   \State \qquad $[\![\mathbf{k}_*]\!] \leftarrow \sigma_s^2\cdot \mathcal{F}_{PPExp}([\![-d{(\mathbf{x}_*,\mathbf{X})}/2\ell^2]\!])$
   \State \qquad $[\![k(\mathbf{x}_*,\mathbf{x}_*)]\!] \leftarrow \sigma_s^2\cdot \mathcal{F}_{PPExp}([\![-d{(\mathbf{x}_*,\mathbf{x}_*)}/2\ell^2]\!])$
    \State \quad // \ Compute the predictive mean and variance. 
    \State \qquad $[\![\mu_{\mathbf{x}_*|\mathcal{D}}]\!] \leftarrow \mathcal{F}_{matMul}(\mathcal{F}_{matMul}([\![\mathbf{k}^\top_*]\!],[\![Inv]\!]), [\![\mathbf{y}]\!])$
    \State \qquad $[\![\Lambda]\!] \leftarrow \mathcal{F}_{matMul}(\mathcal{F}_{matMul}([\![\mathbf{k}^\top_*]\!],[\![Inv]\!]), [\![\mathbf{k}_*]\!])$
   \State \qquad $[\![\sigma^2_{\mathbf{x}_*|\mathcal{D}}]\!] \leftarrow [\![k(\mathbf{x}_*,\mathbf{x}_*)]\!] - [\![\Lambda]\!]$
    %
    %\State \textbf{Output.} $S_j$ outputs the share $[\mu_{\mathbf{x}_*|\mathcal{D}}]_j, [\sigma^2_{\mathbf{x}_*|\mathcal{D}}]_j$.
    \end{algorithmic}
    \textbf{Output:} $S_j$ outputs the share $[\mu_{\mathbf{x}_*|\mathcal{D}}]_j, [\sigma^2_{\mathbf{x}_*|\mathcal{D}}]_j$ for $j \in \{0, 1\}$.
\end{algorithm}

\subsection{The Algorithm Setups}

To ensure a coherent execution of the algorithm, the servers ($S_0$, $S_1$, and $T$), data owners, and users must agree on the algebraic structure to be employed. Specifically, appropriate values of $l$ and $l_f$ need to be chosen for $\mathcal{Z}_{2^l}$ and $\mathcal{Q}_{<\mathcal{Z}_{2^l},l_f>}$. Once consensus is established, the data owners and users convert their private observations $\mathcal{D} = (\mathbf{X}, \mathbf{y})$ and test inputs $\mathbf{x}_*$ into the shared representations $([\! [\mathbf{x}_*]\!], [\![\mathbf{X}]\!], [\![\mathbf{y}]\!])$. This conversion is accomplished using the function $Shr(\cdot)$, and each resulting share $( [\mathbf{x}_*]_j, [\mathbf{X}]_j, [\mathbf{y}]_j)$ is transmitted to the respective computing server $S_j$ for $j \in \{0, 1\}$. After that, the servers perform the privacy-preserving model construction and inference of GPR.

Let us consider an illustrative example. Suppose that $l = 5$ and $l_f = 3$, and the model user aims to privately predict the output for a test input $\mathbf{x}_* = (0.625, 0.375, 0.375)$. To achieve this, the user first encodes $\mathbf{x}_*$ into $\mathcal{Z}_{2^5}$ as $\mathbf{x}_* \cdot 2^3 = (5, 3, 3)$, independently generates random values $[\mathbf{x}_*]_0 = (6, 9, 6)$, and then calculates $[\mathbf{x}_*]_1 = \left((\mathbf{x}_* \cdot 2^3 - [\mathbf{x}_*]_0) \mod 32 \right) = (31, 26, 29)$.
%, resulting in $[\mathbf{x}_*]_1 = (31, 26, 29)$.
The computed value $[\mathbf{x}_*]_j$ is then transmitted to the computing server $S_j$ for $j \in \{0, 1\}$. In a similar manner, the data owners employ the $Shr(\cdot)$ mechanism to send all the values pertaining to their private observations $\mathcal{D}$ to the respective computing servers. 
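
As a sanity check of this example, applying $Rec(\cdot,\cdot)$ to the two shares and undoing the scaling recovers the test input:
\begin{align*}
\left((6, 9, 6) + (31, 26, 29)\right) \mod 32 &= (5, 3, 3),\\
(5, 3, 3)/2^3 &= (0.625, 0.375, 0.375).
\end{align*}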

Note that since $Shr(\cdot)$ is applied independently to each variable in $\mathbf{X}$, $\mathbf{y}$ and $\mathbf{x}_*$, the shares of the data can be computed easily no matter how the variables in $\mathbf{X}$, $\mathbf{y}$ and $\mathbf{x}_*$ are partitioned among the data owners. Therefore, the SS-based GPR algorithm can handle HDS, VDS, and PDS scenarios straightforwardly, which makes it practical enough to be used in various real-world applications.

The GPR hyperparameters $(\ell, \sigma_s^2, \sigma_n^2)$ are assumed to be known a priori and publicly shared between the computing servers. The privacy-preserving optimization of the hyperparameters will be considered in future work.

%In the beginning of the algorithm, the servers ($S_0$, $S_1$, and $T$), the data owners and users need to reach a consensus on the algebraic structure (i.e., choose an appropriate $l$ and $l_f$ for $\mathcal{Z}_{2^l}$ and $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$) used in this run of the algorithm.
%Then, the data owners and users convert their private observations $\mathcal{D} = (\mathbf{X}, \mathbf{y})$ and the test inputs $\mathbf{x}_*$ into $([\! [\mathbf{x}_*]\!], [\![\mathbf{X}]\!], [\![\mathbf{y}]\!])$ via $Shr(\cdot)$ and send each share $( [\mathbf{x}_*]_j, [\mathbf{X}]_j, [\mathbf{y}]_j)$ to computing server $S_j$ for $j \in \{0, 1\}$.
% %%%%%%%%%%%%%%
% The process can be illustrated using the example of a user sharing predicted data $\mathbf{x} = (3, 5, 4)$ with modulus $L = 2^3$ using $Shr(\cdot)$. First, the user locally selects a random value $[\mathbf{x}]_0 = (1, 6, 2)$ and then calculates $[\mathbf{x}]_1 = (\mathbf{x} - [\mathbf{x}]_0) \mod 8 = (2, 7, 2)$. The resulting value $[\mathbf{x}]_j$ is then sent to the computing server $S_j$. Similarly, data owners can use $Shr(\cdot)$ to send all the values in their private $\mathcal{D}$ observations to the computing servers. %%%%%%%%%%%



%We can publicly share the plaintext $-1/2\ell^2$ instead of $\ell$ to avoid the SS-based division operation which needs to be approximated and may reduce the accuracy of the kernel computation.

%Here we present the parameters and inputs used by the privacy-preserving Gaussian process regression algorithm for model construction and prediction.
%
%\textbf{Protocol parameters.}
%Before the protocol is executed, the following parameter information needs to be negotiated between servers. (1) An integer ring $\mathcal{Z}_L$, $L = 2^{l}$; (2) A fixed-point set $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$;
%(3) The parameters $(\ell, \sigma_s^2, \sigma_n^2)$ of the Gaussian process regression, where $\ell$ is the length-scale; $\sigma_s^2$ is the signal variance and $\sigma_s^2$ is the noise variance.  

% \textbf{Protocol inputs.}
% The inputs to the privacy-preserving Gaussian process include the shares of the private observations $(\mathbf{X}, \mathbf{y})$ and the predicted samples $\mathbf{x}_*$. Before running the protocol, data owners convert their private observations $\mathcal{D} = (\mathbf{X}, \mathbf{y})$ and predicted samples $\mathbf{x}_*$ into $([\! [\mathbf{x}_*]\!], [\![\mathbf{X}]\!], [\![\mathbf{y}]\!])$ via $Shr(\cdot)$ and then send each share to the appropriate computing server. 
% After receiving the share, the computing servers complete the PP-GPR model construction and prediction by performing the SS-based operations and receive a share of the prediction result.  

%\subsection{Protocol design}

\subsection{The Algorithm Execution Steps}

Once the computing servers receive the data shares and the hyperparameters, they execute the SS-based protocols for PP-GPR and output the shares of the predictive results.
%
Similar to the conventional GPR, the PP-GPR algorithm contains two stages: model construction and prediction.
At the model construction stage (Lines 1--4), the servers first compute secret shares of the distance matrix $d(\mathbf{X}, \mathbf{X}) \triangleq (d({\mathbf{x}_i, \mathbf{x}_j}))_{i, j = 1, \ldots, n}$. Let $\mathcal{F}_{dist}$ be the SS-based protocol that takes the shares of two sets of inputs $\mathbf{X}$ and $\mathbf{X}'$ and outputs the shares of the distance matrix between them. Then, $[\![d(\mathbf{X}, \mathbf{X})]\!] = \mathcal{F}_{dist}([\![\mathbf{X}]\!], [\![\mathbf{X}]\!]) = ([\![d({\mathbf{x}_i, \mathbf{x}_j})]\!])_{i, j = 1, \ldots, n}$, where $\mathcal{F}_{dist}$ can be constructed using conventional SS-based addition and (matrix) multiplication operations. Next, the servers compute $[\![\mathbf{K}]\!]$ by calling a \emph{privacy-preserving exponentiation} algorithm denoted as $\mathcal{F}_{PPExp}$ and compute $[\![Inv]\!]=[\![(\mathbf{K} + \sigma^2_n \mathbf{I})^{-1}]\!]$ by calling a \textit{privacy-preserving matrix-inverse} algorithm $\mathcal{F}_{MatInv}$ with the inputs $[\![\mathbf{K}]\!]$ and $\sigma^2_n$. The design of $\mathcal{F}_{PPExp}$ and $\mathcal{F}_{MatInv}$ is discussed in Section~\ref{sec:ppoperators}.
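
One way to realize $\mathcal{F}_{dist}$ with these operations, sketched here under the assumption that $d(\cdot,\cdot)$ is the squared Euclidean distance used by the squared exponential kernel and that the rows of $\mathbf{X}$ are the inputs, is to expand
\[
d(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^\top\mathbf{x}_i + \mathbf{x}_j^\top\mathbf{x}_j - 2\,\mathbf{x}_i^\top\mathbf{x}_j .
\]
All pairwise inner products, including the squared norms on the diagonal, are then obtained from a single call $\mathcal{F}_{matMul}([\![\mathbf{X}]\!], [\![\mathbf{X}^\top]\!])$, while the remaining additions and the multiplication by the public constant $-2$ are performed locally on the shares.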

At the prediction stage (Lines 5--14), the servers first compute $[\![\mathbf{k}_* ]\!]$ and $[\![k({\mathbf{x}_*, \mathbf{x}_*})]\!]$ by calling $\mathcal{F}_{dist}$ and $\mathcal{F}_{PPExp}$ and then obtain the shares of the predictive mean $[\![\mu_{\mathbf{x}_*|\mathcal{D}}]\!] 
= [\![\mathbf{k}^\top_*]\!][\![Inv]\!][\![\mathbf{y}]\!]$ and variance $[\![\sigma^2_{\mathbf{x}_*|\mathcal{D}}]\!] = [\![k(\mathbf{x}_*, \mathbf{x}_*)]\!]-[\![\mathbf{k}^\top_*]\!][\![Inv]\!][\![\mathbf{k}_*]\!]$ according to \eqref{GPpred}
% \begin{align}\label{eq}
% [\![\mu_{\mathbf{x}_*|\mathcal{D}}]\!] 
% &= [\![\mathbf{k}^\top_*]\!][\![Inv]\!][\![\mathbf{y}]\!], \nonumber \\
% [\![\sigma^2_{\mathbf{x}_*|\mathcal{D}}]\!] &= [\![\mathbf{k}^\top_*]\!]-[\![k_{\mathbf{x}_*, \mathbf{X}}]\!][\![Inv]\!][\![\mathbf{k}_*]\!] \nonumber
% \end{align}
by calling $\mathcal{F}_{matMul}$.
%
%At the prediction stage, the model user also calls $Shr(\cdot)$ to produce $[\mathbf{x}_*]_0$ and $[\mathbf{x}_*]_1$ for each test input and sends each share to the corresponding computing server. After receiving the data share, the computing server $S_j$ executes the SS-based predictive prediction operations and obtain the share of the predictive results $[\mu_{x_*|\mathcal{D}}]_j$ and $[\sigma^2_{x_*|\mathcal{D}}]_j$ for $j \in \{0, 1\}$.
%Finally, the computing servers send the shares of the prediction results to the model user such that the users can achieve the prediction results locally via $Rec(\cdot)$.

% The goal of the protocol is to output the shares of the predicted results. In order to achieve this goal, the following steps are required.
% (1) The calculation servers compute $[\![d_{(\mathbf{X},\mathbf{X})}]\!], [\![d_{(\mathbf{x}_*,\mathbf{X})}]\!]$ and $[\![d_{(\mathbf{x}_*,\mathbf{x}_*)}]\!]$ by secure matrix multiplication with the input $[\![\mathbf{x}_*]\!],[\![\mathbf{X}]\!]$. (2) The computing servers compute $[\![k_{\mathbf{X}, \mathbf{X}}]\!], [\![k_{ \mathbf{x}_*},\mathbf{X}]\!]$ and $[\![k_{\mathbf{x}_*, \mathbf{x}_*}]\!]$ by calling the \textit{privacy-preserving exponentiation} algorithm with the input $[\![d_{(\mathbf{X},\mathbf{X})}]\!],[\![d_{(\mathbf{x}_*,\mathbf{X})}]\!], [\![d_{(\mathbf{x}_*,\mathbf{x}_*)}]\!], \ell, \sigma_s^2$. (3) The computing servers compute $[\![Inv]\!]=[\![(k_{\mathbf{X}, \mathbf{X}} + \sigma^2_n I)^{-1}]\!]$ by calling the \textit{privacy-preserving matrix-inverse} algorithm with the input $[\![k_{\mathbf{X}, \mathbf{X}}]\!], \sigma^2_n$. (4) The computing servers compute %
% \begin{align}\label{eq}
% [\![\mu_{\mathbf{x}_*|\mathcal{D}}]\!] 
% &= [\![k_{\mathbf{x}_*, \mathbf{X}}]\!][\![Inv]\!][\![\mathbf{y}]\!], \nonumber \\
% [\![\sigma^2_{\mathbf{x}_*|\mathcal{D}}]\!] &= [\![k_{\mathbf{x}_*, \mathbf{x}_*}]\!]-[\![k_{\mathbf{x}_*, \mathbf{X}}]\!][\![Inv]\!][\![k^\top_{\mathbf{x}_*, \mathbf{X}}]\!] \nonumber
% \end{align}
% %
% by calling the \textit{privacy-preserving multiplication} algorithm with the input $[\![k_{\mathbf{x}_*, \mathbf{X}}]\!], [k_{\mathbf{x}_*, \mathbf{x}_*}]\!], [\![(k_{\mathbf{X}, \mathbf{X}} + \sigma^2_n I)^{-1}]\!], [\![\mathbf{y}]\!]$. The specific steps are described in the algorithm \ref{alg-PPGPR}. %The complete workflow of PP-GPR is shown in Fig.~\ref{fig:flow}.

% At the prediction stage, the model user also calls $Shr(\cdot)$ to produce $[\mathbf{x}_*]_0$ and $[\mathbf{x}_*]_1$ for each test input and sends each share to the corresponding computing server. After receiving the data share, the computing server $S_j$ executes the SS-based predictive prediction operations and obtain the share of the predictive results $[\mu_{x_*|\mathcal{D}}]_j$ and $[\sigma^2_{x_*|\mathcal{D}}]_j$ for $j \in \{0, 1\}$.
% Finally, the computing servers send the shares of the prediction results to the model user such that the users can achieve the prediction results locally via $Rec(\cdot)$. 

%
% \subsection{Operations in a privacy-preserving GPR algorithm}\label{subsec:operators}
% Next, we will introduce the operations involved in the conventional GPR and how they can be converted into privacy-preserving operations using the SS-based MPC protocols.
% %
% As can be seen from \eqref{GPpred}, there are three major operations in a GPR model: kernel function computation, matrix multiplication, and matrix inversion. 

% \textbf{Matrix multiplication:} The \emph{privacy-preserving matrix multiplication} (PP-MM) between two matrices can be easily implemented by composing the conventional SS-based addition and multiplication introduced in Section~\ref{sec:SS}.

% \textbf{Kernel function computation:} In both GPR model construction and prediction stages, the first step is to compute the covariance matrix (i.e., $K_{\mathbf{X}\mathbf{X}}$, $K_{\mathbf{x}_*\mathbf{X}}$, and $k(\mathbf{x}_*, \mathbf{x}_*)$) using a kernel function (e.g., \eqref{kernel}). Here, we will take the SE kernel as an example to show the privacy-preserving kernel function computation. Other commonly used kernel functions (e.g., linear, Mat{\'e}rn-3 and Mat{\'e}rn-5, etc.) can be computed in a similar way.

% In this work, we assume that the hyperparameters (i.e., $\ell$, $\sigma^2_s$, and $\sigma^2_n$) of the GPR model are known a priori and publicly shared between the computing servers. The privacy-preserving optimization of the hyperparameters will be considered in future work. %
% % The kernel function used in this paper is a Gaussian kernel function, and its expression is
% % \begin{equation}
% % 	k(x_i, x_j) = \sigma^2 exp(-\dfrac{\|x_i - xj\|_{2}^2}{2l^2}),
% % \end{equation}
% % where $\sigma$ and $l$ are the hyperparameters that are not privacy data to the data owner.
% %
% We can publicly share the plaintext $-1/(2\ell^2)$ instead of $\ell$ to avoid the SS-based division operation which needs to be approximated and may reduce the accuracy of the kernel computation. Then, the kernel computation only involves SS-based addition/subtraction, multiplication, and exponentiation. Among these operations, the SS-based exponentiation usually needs to be approximated by Taylor expansion, which reduces the efficiency and accuracy of kernel computation. To resolve this issue, we propose a new efficient PP-Exp operation by exploiting the idea of \emph{confusion-correction}, as will be introduced later in Section~\ref{sec:ppexp}. 
% %Other commonly used kernel functions (e.g., linear, Mat{\'e}rn-3 and Mat{\'e}rn-5) can be computed in a similar way.    

% %The calculation process of the kernel function includes multiplication, division and exponentiation. However the computing server can compute $\sigma^2$ and $-\dfrac{1}{2l^2}$ in plaintext. During the privacy-preserving calculation process of the kernel function, the computing server only needs to complete the private calculation of $\|x_i - xj\|_{2}^2$ and $exp(-\dfrac{\|x_i - xj\|_{2}^2}{2l^2})$. These calculations consist only of multiplication and exponentiation, so they can be implemented by calling PPMM and PPExp.

% \textbf{Matrix inversion:} In the conventional GPR model, the inversion of $K_{\mathbf{X}\mathbf{X}}+\sigma^2_n\mathbf{I}$ incurs $\mathcal{O}(n^3)$ time and is usually the efficiency bottleneck of the GPR model construction. Therefore, how to construct an efficient SS-based matrix inversion operation and how much additional computation or communication cost is incurred would be major concerns to a PP-GPR model, which will be discussed in the next section. 

% \begin{figure*}
% 	\centering
% 	\includegraphics[scale=0.4]{fig/Computation-2 (2).pdf} %\vspace{-5mm}
% 	%\fbox{\rule[-.5cm]{0cm}{4cm} %\rule[-.5cm]{4cm}{0cm}}
% 	\caption{The workflow to implement a privacy-preserving model construction and prediction (blue color steps) by PP-GPR.}
% 	\label{fig:flow}
% \end{figure*}

\section{Privacy-preserving operation construction}
\label{sec:ppoperators}

As shown in Section~\ref{sec:PP-GPR}, the privacy-preserving exponentiation $\mathcal{F}_{PPExp}$ and matrix inversion $\mathcal{F}_{MatInv}$ are essential for the PP-GPR algorithm. In this section, we analyze the limitations of existing methods for constructing these two operations, introduce the proposed algorithms, and analyze their communication complexity.

%to protect the privacy in the process of GPR model construction and prediction, it is necessary to build an effective privacy-preserving algorithm to implement matrix inversion and exponential operations. 

% For the current SMPC-based privacy protection algorithm is difficult to apply to GPR, we propose efficient privacy-preserving algorithms, which constitute \MODEL. Specifically, we construct privacy-preserving matrix inversion(PPMI) and privacy-preserving exponentiation(PPExp) based on SS.  

%All algorithms contain online and offline stages, and the offline stage can be completed when the server is idle to improve the efficiency of the online stage of the algorithm.

\subsection{Privacy-preserving exponentiation}\label{sec:ppexp}
%In this section we describe the practical fixed-point privacy-preserving exponential algorithm proposed in this paper.
As mentioned in Section~\ref{sec:SS}, exponentiation cannot be constructed directly via additive SS. A commonly used remedy is to approximate the exponential by its Taylor expansion $e^u = \sum_{k = 0}^{\infty} \frac{1}{k!} u^k$, truncated to a finite degree, so that exponentiation reduces to addition and multiplication operations. However, because the exponential grows much faster than any polynomial, the truncated Taylor series may incur large errors. Although increasing the degree of the polynomial improves the approximation accuracy, it also increases the communication cost due to the information exchange needed for SS-based multiplication.
%
The work of \citet{knott2021crypten} mitigated this problem via the limit approximation $e^u = \lim_{k \rightarrow \infty} (1+\frac{u}{2^k})^{2^k}$ and exploited the repeated squaring algorithm to iteratively generate polynomials of higher order quickly. However, achieving accurate approximation results with this approach still incurs high communication and computational costs.
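
To illustrate the size of these approximation errors, consider $u=-4$, an illustrative value chosen here rather than one taken from the experiments. The degree-$6$ Taylor polynomial and the limit approximation with $k=8$ evaluate to
\begin{align*}
\sum_{i=0}^{6}\frac{(-4)^i}{i!} &\approx 2.156, \\
\Bigl(1-\frac{4}{2^{8}}\Bigr)^{2^{8}} &\approx 0.0177, \\
e^{-4} &\approx 0.0183 .
\end{align*}
The truncated series is thus off by about two orders of magnitude, whereas the limit approximation has a relative error of roughly $3\%$ at the price of $8$ sequential SS-based squarings.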


% The current design of privacy-preserving exponential algorithms for fixed point numbers consists of two main approaches: polynomial approximation and iterative approximation. A well-known approach in polynomial-based approximations is the use of Taylor expansions to convert exponential to additive and multiplicative. 

%Exponential operations are also commonly used operations in machine learning, e.g., softmax, sigmoid, etc.
% As aforementioned in Section~\ref{sec:SS}, the exponentiation cannot be constructed directly via additive SS. A commonly-used method to resolve this issue is to approximate the exponentiation using its Taylor expansion at $0$ such that the exponentiation can be converted into addition and multiplication operations \cite{knott2021crypten}.
% %Specifically, they use 8-th degree polynomials.
% This approach can only maintain accuracy within a certain range such as the variable close to $0$.
% %However, in many practical applications, the calculation results will exceed the fitting range of the Taylor expansion, resulting in serious loss of accuracy.
% Although increasing the degree of the polynomial can expand the fitting range of Taylor expansion, the communication cost will also increase due to the information exchange needed for additive SS-based multiplication (Section~\ref{sec:SS}).
% %this leads to computational efficiency problems.
% \cite{xia2021STR} proposed to construct an efficient privacy-preserving exponential operation in the real number field.
%
% However, this algorithm has correctness and safety problems when implemented using the computer's integer or floating-point types. For example, due to the uneven scale of the floating-point type, the probability of shares falling to each floating-point number is different during the calculation process, which makes the simulation-based security proof method fail. 




In this work, we propose to construct a \emph{privacy-preserving exponentiation} (PP-Exp) operation by adopting the idea of \textit{confusion-correction}.
In PP-Exp, given a private number $u \in [u_{min}, 0]$, each computing server $S_j$ for $j \in \{0, 1\}$ takes the additive share $[u]_j$ of $u$ as input and then privately derives an additive share of $e^u$ with the help of random numbers generated by $T$.
The algorithm includes the following steps: (1) The computing servers mask the share of $u$ with a random value $r$ to obtain $[\![u+r]\!]$; (2) The computing servers jointly reveal the obfuscated value $u+r$; (3) Each computing server uses the obfuscated value to calculate the obfuscated target $e^{u+r}$; and (4) Each computing server corrects $e^{u+r}$ by removing the mask with its share of $e^{-r}$ and obtains its share of $e^u$.
The pseudo-code of the PP-Exp is in Algorithm~\ref{alg-exp}.

%Compared to the conventional SS-based exponentiation approaches mentioned above,    
Note that Algorithm~\ref{alg-exp} considers only non-positive inputs $u$ since the commonly used kernel functions of GPR (e.g., \eqref{kernel}) involve only exponentiation of non-positive values. Given the range of $u$, the proposed PP-Exp achieves correctness and security by selecting an appropriate range $[-\check{r}_{max}, \check{r}_{max})$ and precision $l_f$, as discussed below.

%Next, we will theoretically analyze the correctness and security of this proposed $\mathcal{F}_{PPExp}$ algorithm.



% %New scheme************************************
% A secure fixed-point indexing protocol on integer rings is proposed in the literature~\cite{kelkar2021secure}, where the online phase only costs 1 round of communication.However, this protocol still has some problems in terms of efficiency and accuracy. Specifically, the protocol has an error probability with value $\frac{2^{l_x+1}}{2^l}$.A reasonable error probability is guaranteed only when the difference between the values of $l$ and $l_x$ is large. For example, as described in~\cite{kelkar2021secure}, when the precision value $l_f = 15$, $l = 81$ is needed to obtain the error probability $\frac{1}{2^{40}}$. The 32-bit int data type or the 64-bit long data type is usually used for training and prediction in machine learning.
% In this case, the error probability of the algorithm increases dramatically, leading to training or prediction failure. 

% Aiming at the above problems, this paper proposes a secure fixed-point exponential algorithm with no error probability under the setting of three servers. Specifically, this paper reduces the communication and calculation overhead of the offline stage of the secure fixed-point index protocol by adding an additional trusted auxiliary server to generate the random numbers required by the online stage of the protocol. At the same time, this paper proposes a secure share correction protocol to solve the problem of probability error existing in the current secure fixed-point number protocol. The core idea of the security modification protocol is that the computing server makes the share overflow through local adjustment. A detailed description of the security modification protocol is given in Algorithm \ref{alg-sm}.
% \begin{algorithm}[H]
% 	\caption{Share Modification}
% 	\label{alg-sm}
%     \begin{algorithmic}
%      \Statex \textbf{Input:}$S_j$ holds the share $[u]_j\in Z_L$; 
%      \Statex \textbf{Output:} $S_j$ gets the share $[\hat{u}]_j\in Z_L$, with $([\hat{u}]_0+[\hat{u}]_1)\mod L = u$ and $Wrap([\hat{u}]_0, [\hat{u}]_1, L) = 1$.
%     \If{$[u]_0 \geq 2^{l_x}$} 
%       \State Set $[\hat{u}]_j = [u]_j$;
%     \Else 
%       \State $S_0$ sets $[\hat{u}]_0 = [u]_0 + 2^{l-1}$, and sends a signal to $S_1$;
%       \State $S_1$ sets $[\hat{u}]_1 = [u]_1 - 2^{l-1}$.
%     \EndIf
%     \end{algorithmic}
% \end{algorithm}

% \begin{theorem}
% 	\label{theorem-SM}
%     For a fixed number of points $x, x\in [0,2^{l_x})$ whose share on $Z_L$ is $[x]_0,[x]_1$, if $[x]_0 \in [2^{l_x}, 2^l)$ then we have $Wrap([x]_0,[x]_1, L) = 1$.
% \end{theorem}

% Based on the modified Share algorithm, an error-free secure fixed-point exponential algorithm can be constructed. The pseudo-code of it is shown in Algorithm~\ref{FPExp} which satisfies Theorem~\ref{theorem-exp}.

% \begin{algorithm}[H]
% 	\caption{Secure fixed-point exponential(SFP-EXP)}
% 	\label{FPExp}
%     \begin{algorithmic}
%      \Statex \textbf{Input:}$S_j$ holds the share $[u]_j\in Z_L$; 
%      \Statex \textbf{Output:} $S_j$ gets the share $[e^{u}]_j\in Z_L$.
%       \State Let $[u^{\prime}]_j \leftarrow PubFPMult([u]_j, \Hat{(log_2(b)))} $;
%       \Comment{Convert to base $2$ exponentiation.}
%       \State Let $z = FPAdd([u^{\prime}]_j, \hat{A}$);
%         \Comment{Make exponent $> 1$.}
%       \State Let $(z^{int}_i, z^{frac}_i) = (\lfloor[z]_i/2^{l_f}\rfloor, [z]_i-\lfloor[z]_i/2^{l_f}\rfloor)$;
%       \Comment{Split into integer and fractional parts.}
%       \State Let $(z^{int}_0, z^{int}_1)\leftarrow RingChage((z^{int}_0, z^{int}_1), F_{q-1})$; 
%       \Comment{RingChange from $Z_{2^{l-l_f}}$ to $Z_{q-1}$.}
%       \State Let $(v^{int}_i, v^{frac}_i)\leftarrow (2^{z^{int}_i}\mod q, 2^{z^{frac}_i}))$; 
%       \Comment{Exponentiate both parts.}
%       \State Let $v_i=v^{int}_i*2^{l_f}v^{frac}_i\mod q$; 
%       \Comment{Get each party’s local share.}
%       \State Let $[e^{u^{\prime}}]^p_0,  [e^{u^{\prime}}]^p_1\leftarrow MTA(v_0, v_1, F_q)$;
%       \Comment{Convert to additive shares in $F_q$.}
%       \State Let $[e^{u}]^p_i\leftarrow FPDiv([e^{u^{\prime}}]^p_i, 2^{A+l_f})$;
%       \Comment{Divide by the remaining factor.}
%       \State Let $[e^{u}]_i\leftarrow RingChange([e^{u^{\prime}}]^p_i, 2^l)$;
%       \Comment{RingChange from $F_q$ to $Z_{2^l}$.}      
%     \end{algorithmic}
% \end{algorithm}

\begin{algorithm}[t]\footnotesize
	\caption{Privacy-preserving exponentiation ($\mathcal{F}_{PPExp}$)}
	\label{alg-exp}
    \textbf{Setup.} The servers determine $\mathcal{Z}_L$,  $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$, the range $[u_{min}, 0]$ of input $u$, and the range $[-\check{r}_{max}, \check{r}_{max})$.\\
    \textbf{Input.} $S_0$ holds the share $[u]_0$; $S_1$ holds the share $[u]_1$.
	\begin{algorithmic}[1]
        \State // \textbf{Offline phase executed on assistant server $T$:}
        %\State \quad Draw $r \in [2^{l_f}(l_f \cdot ln^2 + u_{min}))$ randomly
        \State \quad Draw $\check{r}$ in the range $[-\check{r}_{max}, \check{r}_{max})$ randomly
        \State \quad $r \leftarrow \check{r} \cdot 2^{l_f}$
        \Comment $\check{r} \in \mathcal{Q}_{<\mathcal{Z}_L,l_f>}$
        \State \quad  Generate $([r]_0$, $[r]_1) \in \mathcal{Z}_L$
        %\State \quad $\check{r} \leftarrow r/ 2^{l_f}$
        %\Comment $\check{r} \in \mathcal{Q}_{<\mathcal{Z}_L,l_f>}$
		\State \quad Calculate $e^{-\check{r}}$ in $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$
		%\Comment $e^{-\check{r}} \in \mathcal{Q}_{<\mathcal{Z}_L,l_f>}$
		%
		\State \quad Generate $([e^{-\check{r}}]_0$, $[e^{-\check{r}}]_1) \in \mathcal{Z}_L$
		\State \quad  Send $[r]_j$ and $[e^{-\check{r}}]_j $ to $S_j$ for $j \in \{0, 1\}$
		\State // \textbf{Online phase:}
		\State \quad $S_j$ calculates $[d]_j \leftarrow [u]_j + [r]_j$ for $j \in \{0, 1\}$
		\State \quad $S_0$ and $S_1$ send $[d]_0$ and $[d]_1$ to each other
		\State \quad $d \leftarrow Rec([d]_0 + [d]_1)$ \label{line:rec}
		\Comment Executed by both $S_0$ and $S_1$
		\State \quad $\check{d} \leftarrow d/2^{l_f}$
		\Comment Executed by both $S_0$ and $S_1$
		\State \quad Calculate $e^{\check{d}}$ in $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$
		%\Comment $e^{\check{d}} \in \mathcal{Q}_{<\mathcal{Z}_L,l_f>}$
		\State \quad $S_j$ calculates $[e^u]_j \leftarrow e^{\check{d}} \cdot 2^{l_f} \cdot [e^{-\check{r}}]_j$ for $j \in \{0, 1\}$
	\end{algorithmic}
	\textbf{Output.} $S_j$ outputs the share $[e^u]_j$ for $j \in \{0, 1\}$.
\end{algorithm}
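
Ignoring fixed-point encoding and truncation, the correction step of Algorithm~\ref{alg-exp} can be checked directly. This is only a sketch over the reals, treating $u$ as a real number; the fixed-point conditions are the subject of Theorem~\ref{theorem-exp}. Since $\check{d} = u + \check{r}$ and $[e^{-\check{r}}]_0 + [e^{-\check{r}}]_1 = e^{-\check{r}}$,
\begin{align*}
[e^u]_0 + [e^u]_1 &= e^{\check{d}}\bigl([e^{-\check{r}}]_0 + [e^{-\check{r}}]_1\bigr) \\
&= e^{u+\check{r}} \, e^{-\check{r}} = e^{u}.
\end{align*}
Each server multiplies its own share by the public value $e^{\check{d}}$ locally, so no communication beyond the reconstruction of $d$ is needed.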

First, to guarantee the correctness of the proposed PP-Exp algorithm, we need to ensure that none of the fixed-point calculations (i.e., Line 5 and Lines 13-14) overflows or underflows. Consequently, the selected $[-\check{r}_{max}, \check{r}_{max})$ and $l_f$ need to satisfy the following relationship.
%
% We first analyze the correctness of the algorithm.
% The PP-Exp algorithm runs in the ring of integers $\mathcal{Z}_L$ and the fixed point set $\mathcal{Q}_{<\mathcal{Z}_L,l_f>}$. Let $u_{min}$ be the minimum value of the input $u$.  Let $-\check{r}_{max}, -\check{r}_{max}$ be the maximum and minimum values of $\check{r}$. The input $u$ and the random number $\check{r}$ take values in the ranges $[u_{min}, 0]$ and $[-\check{r}_{max}, \check{r}_{max}]$ respectively.
%The correctness of the PP-Exp algorithm satisfies the theorem~\ref{theorem-exp}. See the Appendix for a detailed analysis of Theorem~\ref{theorem-exp}. 
%
\begin{theorem}[\textbf{Correctness}] 
\label{theorem-exp}
For any number $u$ in the range $[u_{min},0]$, if $(\check{r}_{max} - u_{min}) \log_2 e \leq l_f < \frac{l-1}{2}$, the PP-Exp algorithm can correctly derive $([e^u]_0, [e^u]_1)$ from $([u]_0, [u]_1)$, satisfying $[e^u]_0+[e^u]_1= e^u$.
\end{theorem}
%
For example, on the ring of integers $\mathcal{Z}_{2^{64}}$, suppose that the input $u$ takes values in the range $[-4,0]$ and $\check{r}$ takes values in the range $[-16, 16)$. Then, setting $l_f=29$ ensures the correctness of the PP-Exp algorithm.
See Appendix A for the proof.  
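
The condition of Theorem~\ref{theorem-exp} can be checked by hand for this example, taking $l = 64$ for the ring $\mathcal{Z}_{2^{64}}$ and $\log_2 e \approx 1.4427$:
\begin{align*}
(\check{r}_{max} - u_{min})\log_2 e &= (16+4)\times 1.4427 \\
&\approx 28.85 \leq 29 = l_f, \\
l_f = 29 &< \tfrac{l-1}{2} = 31.5 .
\end{align*}
With these ranges, $l_f = 29$ is the smallest integer precision that satisfies the lower bound.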

% In addition to this, the operation in privacy-preserving Gaussian process regression algorithm leading to overflow also includes matrix multiplication. It is assumed that the intermediate results of the matrix multiplication are all less than $2^{k}$. To make the calculation result not overflow, it is necessary to ensure that $2^{k+2l_f} < 2^{l-1}$, i.e. $l_f \leq \frac{l-k}{2}$. 


Next, we analyze the security of the PP-Exp algorithm. In Algorithm~\ref{alg-exp}, Line~\ref{line:rec} is the only step that reconstructs $d$ in the fixed-point domain and thus risks leaking information about $u$.
%
Specifically, given the maximal range of $u$ and $\check{r}$ (i.e., $[u_{min}, 0]$ and $[-\check{r}_{max}, \check{r}_{max})$), the value of $d$ may be exploited to narrow the feasible range of $u$, which constitutes information leakage.
For example, supposing $u \in \{-2, -1, 0\}$, $r \in \{-1, 0, 1\}$, and $d = u+r$, one can infer that $u$ must be $-2$ or $-1$ if $d = -2$.

To formally analyze the amount of privacy leaked, we define the \emph{degree of information leakage} as follows:
\begin{definition}
Supposing $u$ is known to be an element of a finite set $\mathcal{U}$, the \emph{degree of information leakage} on $u$ is $\frac{1}{|\mathcal{U}|}$. 
\end{definition}%\vspace{-1mm}
%
We consider an algorithm to be \emph{secure} if the \emph{degree of information leakage} of the input remains unchanged throughout its execution. Given a fixed precision $l_f$, let $m_u$ and $m_r$ be the number of fixed-point numbers that can be represented in $[u_{min}, 0]$ and $[-\check{r}_{max}, \check{r}_{max})$, respectively.
%Suppose the number of values of $u$ is $m_u$ and the number of values of $r$ is $m_r$.
The security of the PP-Exp algorithm satisfies the following theorem. 
%
\begin{theorem}[\textbf{Security}]
\label{theorem-exp-sec}
For any fixed number $u$ in the range $[u_{min}, 0]$,
%$[-(ln^{2}l_f - \check{r}_{max}),0]$,
the PP-Exp algorithm is secure with probability $\frac{m_r-m_u+1}{m_r}$. The expected degree of information leakage on $u$ is $\frac{m_u + m_r -1}{m_u\cdot m_r}$.
\end{theorem}%\vspace{-1mm}
See Appendix A for the proof.
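
As a sanity check of Theorem~\ref{theorem-exp-sec}, the toy example above ($u \in \{-2,-1,0\}$, $r \in \{-1,0,1\}$, $d = u + r$, so $m_u = m_r = 3$) can be enumerated by hand. The enumeration assumes that $u$ and $r$ are independent and uniformly distributed, which is an assumption made here for illustration only:
\begin{center}
\begin{tabular}{cccc}
\toprule
$d$ & feasible $u$ & leakage degree & $\Pr[d]$ \\
\midrule
$-3$ & $\{-2\}$ & $1$ & $1/9$ \\
$-2$ & $\{-2,-1\}$ & $1/2$ & $2/9$ \\
$-1$ & $\{-2,-1,0\}$ & $1/3$ & $3/9$ \\
$0$ & $\{-1,0\}$ & $1/2$ & $2/9$ \\
$1$ & $\{0\}$ & $1$ & $1/9$ \\
\bottomrule
\end{tabular}
\end{center}
For any fixed $u$, exactly one of the three values of $r$ yields $d=-1$, so the leakage degree stays at $\frac{1}{3}$ with probability $\frac{m_r - m_u + 1}{m_r} = \frac{1}{3}$. The expected leakage degree is $\frac{1}{9}\cdot 1 + \frac{2}{9}\cdot\frac{1}{2} + \frac{3}{9}\cdot\frac{1}{3} + \frac{2}{9}\cdot\frac{1}{2} + \frac{1}{9}\cdot 1 = \frac{5}{9} = \frac{m_u+m_r-1}{m_u m_r}$, and the exact value of $u$ is revealed with probability $\frac{2}{9} = \frac{2}{m_u m_r}$.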
%We analyze the steps in PP-Exp where privacy leaks occur, the probability of privacy leaks, and the amount of privacy available at the time of leakage. Specifically, in the offline phase, $T$ locally generates random numbers that satisfy the conditions and sends their shares to the computational server, so there is no privacy leakage. In the online phase, only in step 10 does the computation server perform a message pass to reconstruct $d$. Next, we analyze the probability and amount of privacy loss of the input $u$ due to the reconstruction of $d$. 
%
%An important result is that $d=u\cdot 2^{l_f} + r$ causes the probability of $u$ leaking and the amount of privacy leaked to be related to the number of values taken for $u$ and $r$. Specifically,
Theorem~\ref{theorem-exp-sec} shows that the more the number of values of $r$ exceeds the number of values of $u$ (i.e., the larger $m_r - m_u$ is), the higher the probability that PP-Exp is secure.
Selecting a larger range for $\check{r}$ (i.e., a larger $\check{r}_{max}$) can significantly reduce the degree of information leakage. However, a larger $\check{r}_{max}$ may require a larger $l_f$ and a larger $l$ by Theorem~\ref{theorem-exp} and, consequently, increase the communication cost, as will be shown in Section~\ref{sec:comp}.

%Suppose the number of values of $u$ is $m_u$ and the number of values of $r$ is $m_r$, then the probability that the algorithm leads to privacy leakage of input $u$ is $\frac{m_u-1}{m_r}$. We describe the amount of privacy compromised by the PP-Exp algorithm in terms of the \textit{degree of information leakage} of the input $u$. When only the range of $u$ is known, the information leakage degree of $u$ is $\frac{1}{m_u}$. Since in PP-Exp $d$ additionally leaks the input range of $u$, this leads to an increase in the degree of information leakage of $u$. Specifically, in the PP-Exp the averaged degree of information leakage of $u$ is $\frac{2(1-\frac{1}{m_u})}{m_r}$.

Note that the PP-Exp algorithm leaks the exact value of $u$ with probability only $\frac{2}{m_um_r}$, i.e., only when $u$ and $r$ both take the maximum values or both take the minimum values of their respective ranges.
%The detailed analysis process can be found in Appendix A. 
%
For example, when $l_f = 29$, supposing the input $u$ takes values in the range $[-4,0]$ and $\check{r}$ takes values in the range $[-16,16)$, we have $m_u = 2^{31}+1$ and $m_r = 2^{34}$.
PP-Exp is then secure with probability $\frac{7}{8}$. The expected degree of information leakage is $\frac{9}{2^{34}+8}$, an increase of $\frac{1}{2^{34}+8}$ over the leakage-free value $\frac{1}{m_u} = \frac{8}{2^{34}+8}$. The probability of revealing the exact value of $u$ is less than $\frac{1}{2^{64}}$.
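
The figures in this example follow from Theorem~\ref{theorem-exp-sec} by direct substitution, using the fixed-point grid spacing $2^{-l_f} = 2^{-29}$:
\begin{align*}
m_u &= 4\cdot 2^{29} + 1 = 2^{31}+1, \\
m_r &= 32\cdot 2^{29} = 2^{34},\\
\frac{m_r - m_u + 1}{m_r} &= \frac{2^{34}-2^{31}}{2^{34}} = \frac{7}{8},\\
\frac{m_u+m_r-1}{m_u m_r} &= \frac{9\cdot 2^{31}}{(2^{31}+1)\,2^{34}} = \frac{9}{2^{34}+8},\\
\frac{2}{m_u m_r} &= \frac{1}{2^{64}+2^{33}} < \frac{1}{2^{64}} .
\end{align*}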

% Note that for reasons of algorithmic correctness and security, PP-Exp only considers the case where the input is negative. Therefore, some generality is lost compared to existing approximations such as Taylor expansions. However, it is more efficient and accurate, and is suitable for Gaussian process regression scenarios.

\subsection{Privacy-preserving matrix inversion}\label{sec:ppmi}

In existing work \citep{knott2021crypten}, matrix inversion is approximated via \emph{Newton-Raphson} iteration, a local optimization method whose performance depends heavily on its initial value.
However, under SS, no information about the original input matrix is available, so it is difficult to find an initial matrix that satisfies the convergence condition.
%Later, we will empirically show that random selection of initial matrices usually fails to converge to the true matrix inversion result.
%, so this method is difficult to apply to our scenario.
% Inspired by the \emph{confusion-correction} idea used in Algorithm~\ref{alg-exp}, one may be tempted to design a \emph{privacy-preserving matrix inversion} (PP-MI) algorithm in a similar way, which has already been considered by \cite{xia2021STR}. In this work, they proposed an efficient privacy-preserving matrix inversion algorithm in the \emph{real number field}. To invert a matrix $\mathbf{U}$, the main idea of \cite{xia2021STR} is to randomly select a random matrix $\mathbf{R}$ from the real number field for masking $\mathbf{U}$, calculate the inverse matrix
%\footnote{$R \times U$ represents an element-wise multiplication between the two matrices.} $(\mathbf{U}\times \mathbf{R})^{-1}$ in plaintext, and then eliminate $\mathbf{R}^{-1}$ by the share of $\mathbf{R}$ to achieve the share of $\mathbf{U}^{-1}$. The security of the above steps requires $\mathbf{U}\times \mathbf{R}$ to completely cover up all the information of $\mathbf{U}$. However, in the integer ring $\mathcal{Z}_L$, if there are irreversible elements in $\mathbf{U}$, $\mathbf{U} \times \mathbf{R}$ is \emph{not} randomly and uniformly distributed in $\mathcal{Z}^{N \times N}_L$ (but $\mathbf{U}+\mathbf{R}$ is). In this case, the privacy of $\mathbf{U}$ will be leaked. 

Note that $\mathbf{K}+\sigma_n^2\mathbf{I}$ is a positive definite matrix whose inverse can be computed via the Cholesky decomposition $\mathbf{K}+\sigma_n^2\mathbf{I} = \mathbf{L D L}^\top$, where $\mathbf{L}$ is a lower triangular matrix and $\mathbf{D}$ is a diagonal matrix. To theoretically guarantee the security of a PP-MI algorithm, we open up the Cholesky decomposition algorithm and convert each of its operations to the corresponding SS-based version. Since the entire process of computing $\mathbf{L}$, $\mathbf{D}$, and their inverses involves only addition, multiplication, and division between matrix elements, we can exploit the existing SS-based addition, multiplication, and division, together with their composability \citep{canetti2001universally}, to construct a PP-MI algorithm.
%
%We observe that the covariance matrix in GPR is a symmetric positive definite matrix, so it can be decomposed by Cholesky decomposition, and its inversion process can be converted into addition, multiplication and division between matrix elements. By invoking the corresponding privacy-preserving algorithms of these operations properly, we can achieve the PPMI of the covariance matrix in GPR.
In PP-MI, each computing server $S_j$ for $j \in \{0, 1\}$ takes one additive share $[\mathbf{U}]_j$ of the matrix $\mathbf{U} \in \mathcal{Z}^{n \times n}_{L}$ as input and privately derives its share of $\mathbf{U}^{-1}$.
The detailed steps and pseudo-code of the PP-MI are shown in Appendix B. 
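
For reference, the textbook $\mathbf{L D L}^\top$ recurrences for a symmetric positive definite matrix $\mathbf{A} = \mathbf{K}+\sigma_n^2\mathbf{I}$ are sketched below. The entry-wise notation ($A_{ij}$, $L_{ij}$, $D_{jj}$) is introduced here for illustration, and the pseudo-code in Appendix B may organize the computation differently. For $j = 1,\dots,n$ and $i > j$, with $L_{jj} = 1$:
\begin{align*}
D_{jj} &= A_{jj} - \sum_{k<j} L_{jk}^2\, D_{kk}, \\
L_{ij} &= \frac{1}{D_{jj}}\Bigl(A_{ij} - \sum_{k<j} L_{ik}\,L_{jk}\, D_{kk}\Bigr), \\
\mathbf{A}^{-1} &= \mathbf{L}^{-\top}\,\mathbf{D}^{-1}\,\mathbf{L}^{-1}.
\end{align*}
Only additions, multiplications, and divisions by the pivots $D_{jj}$ appear, which is why SS-based addition, multiplication, and division suffice.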
%
%This work represents a pioneering implementation of SS-based PP-MI via Cholesky decomposition.

To the best of our knowledge, this is the first work that implements the SS-based PP-MI via Cholesky decomposition.
A rigorous analysis of its communication cost is detailed in Section~\ref{sec:comp}. Additionally, the performance of the proposed approach is empirically demonstrated and evaluated in Section~\ref{sec:experi-ops}. 

%Although converting the matrix inversion algorithm into PP-MI seems to be straightforward. The communication cost and precision loss due to SS are unknown. This is the first work that implements the SS-based Cholesky decomposition, formally analyzes its communication cost (Section~\ref{sec:comp}), and empirically shows its performance (Section~\ref{sec:experi-ops}).   

% \begin{theorem}
% 	\label{theorem-mi}
% 	Given a positive definite matrix $U \in Z^{n \times n}_{L}$, the PP-MI algorithm securely derives $([U^{-1}]_0, [U^{-1}]_1)$ from $([U]_0, [U]_1)$, satisfying $[U^{-1}]_0+[U^{-1}]_1= U^{-1}$.
% \end{theorem}

% The security of Theorem~\ref{theorem-exp}\&~\ref{theorem-mi} can be shown by adopting the \textit{simulation-based} method~\cite{Lindell2017simulate}.
% %That is, it is proved that there is a PPT simulator $Sim$ that can simulate the view of the computing server, making it impossible for a semi-honest adversary to distinguish the simulated view from the real view.
% See Appendix~\ref{app:proof} for the rigorous proofs.

\subsection{Communication complexity}\label{sec:comp}
In this section, we analyze the theoretical number of communication rounds and communication volume of the proposed PP-Exp and PP-MI.
%
We assume that the assistant server $T$ has generated enough random numbers for the PP-GPR calculation process in the offline stage
%(i.e.,Beaver-triples and $(r, e^{r})$)
and sent their shares to the corresponding computing servers in advance.

% PP-Exp
To execute PP-Exp on an $n$-dimensional vector $\mathbf{u}$, Algorithm~\ref{alg-exp} requires only $1$ round of communication between the computing servers (for the reconstruction in Line~\ref{line:rec})
%, which is used to reconstruct $\mathbf{d}$, 
and the amount of this communication is $2nl$. 

% PP-MI
In PP-MI, we decompose the matrix inversion into addition, multiplication, and division operations by exploiting the Cholesky decomposition.
%By leveraging this decomposition technique, the number of communication rounds required for the PP-MI algorithm can be determined as $22n - 6$ and the total communication volume of the PP-MI algorithm is $\mathcal{O}(n^3l)$. See Appendix C for detail.
%
A single-element SS-based multiplication requires $1$ round of bidirectional communication with a traffic volume of $2l$. A single-element division, implemented by invoking the privacy-preserving division provided in Crypten, takes $17$ rounds of communication and a communication volume of $\mathcal{O}(l)$.
%The division is implemented in Crypten\citep{knott2021crypten}.
%For the specific implementation process, please refer to \cite{knott2021crypten}.
%With the iteration number to be $5$, the SS-based division of a single element requires $17$ rounds of communication and the communication complexity is $\mathcal{O}(l)$. 
%Therefore, given an $n\times n$ positive definite matrix, the total number of communication rounds of PP-MI is $23n+9$, and the communication volume is $\mathcal{O}(n^{3}l)$.
The PP-MI of an $n\times n$ positive definite matrix involves $n$ sequential division steps, each taking $17$ rounds, and $5n - 6$ sequential multiplication steps, each taking $1$ round.
Thus, the total number of communication rounds for PP-MI is $17n+(5n-6) = 22n -6$.
PP-MI performs $\mathcal{O}(n^3)$ element-wise multiplications and $\mathcal{O}(n^2)$ element-wise divisions. Therefore, the total communication volume of the PP-MI algorithm is $\mathcal{O}(n^3l)$. See Appendix C for details.
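
As a worked instance of these counts, take $n = 400$ for PP-MI (a matrix size quoted in Section~\ref{sec:experi-ops}) and, for PP-Exp, a vector with $n = 10^6$ entries and $l = 64$. Interpreting $l$ as the bit length of a ring element is an assumption of this illustration:
\begin{align*}
\text{PP-MI rounds:}\quad & 22n - 6 = 22\times 400 - 6 = 8794,\\
\text{PP-Exp volume:}\quad & 2nl = 2\times 10^{6}\times 64 \\
& = 1.28\times 10^{8}\ \text{bits} = 16\ \text{MB}.
\end{align*}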


\section{Experiments and discussion}\label{sec:experi}

This section empirically evaluates the performance of the proposed privacy-preserving operations and PP-GPR. The PP-Exp, PP-MI, and PP-GPR are built upon the open-source privacy-preserving ML framework Crypten \citep{knott2021crypten}. We set $l = 64$ and $l_f = 26$.
%
The experiments are carried out on three servers, each with a 48-core Intel Xeon CPU running at 2.9\,GHz, connected by a local area network with a communication latency of 0.2\,ms and a bandwidth of 625\,MB/s.

%We need to thank Crypten\cite{Newton-Raphson iterations}, an open source framework for privacy-preserving machine learning, which provides currently commonly used privacy-preserving algorithms, allowing us to quickly implement our proposed algorithm on it.

% \begin{figure} [ht]
% \vskip 0in
% 	\centering
% 	\includegraphics[width=3.2in]{fig/Exp_loss_poly.png}
%  \\
%  \includegraphics[width=3.2in]{fig/Exp_loss_crypten.png} 
% 	\caption{Graphs of (a) comparison with polynomial approximation methods
% ; and (b)comparison with the iterative approximation method in Crypten.}
% 	\label{Fig.PP-Exp} 
%  \vskip -0.1in
% \end{figure}

\begin{figure}[t]
	\centering
	\begin{tabular}{cc}
		\hspace{-7mm}\includegraphics[scale=0.111]{fig/PP-Exp_polynomial.pdf} & \hspace{-1mm}\includegraphics[scale=0.111]{fig/PP-Exp_crypten.pdf} \\
		\hspace{-3mm}(a) PP-Exp vs. Polynomial
& (b) PP-Exp vs. Iterative\\
	\end{tabular}%\vspace{-1mm}
	%\includegraphics[scale=0.3]{ppexp.pdf}
	%\fbox{\rule[-.5cm]{0cm}{4cm} %\rule[-.5cm]{4cm}{0cm}}
	\caption{Graphs of PP-Exp vs. (a) polynomial approximation methods; and (b) the iterative approximation method.}%\vspace{-1mm}
	\label{Fig.PP-Exp}
\end{figure}

\subsection{Evaluation of PP-Exp and PP-MI}
\label{sec:experi-ops}
We first demonstrate the accuracy and computational efficiency of the proposed operations.


% We compare the performance and efficiency of the proposed PP-Exp against that of (a) \emph{Plaintext}: the conventional exponentiation operation; (b) \emph{Crypten}: approximation using the limit definition of the exponential function in the Crypten library \cite{knott2021crypten} with a degree of $8$; and (c) \emph{Poly}: The Taylor expansion approximation with a polynomial degree of $6$.  %We first compare the accuracy of PP-Exp with the algorithm using approximate calculation, and the comparison results are shown in Figure \ref{Fig.main2}.
% %
% As shown in Fig.~\ref{Fig.main2}a, the proposed PP-Exp achieves almost the same results as the plaintext and outperforms other tested algorithms especially when $u$ is large.

%Figure \ref{Fig.main2} has two sub figures, as shown in fig. \ref{Fig.sub.1}, the approximate calculation and Crypten provided algorithm give acceptable result of $e^x$ when $x$ is less than 7, but the loss of precision grows as x increases. Only PP-Exp still maintains accuracy with the growth of x.

\begin{table} \footnotesize
	\centering
	%\fontsize{8}{10}\selectfont    %{font size}{line spacing}
	\caption{The computational time (in seconds) incurred by the tested approaches in computing $e^\mathbf{U}$ for varying sizes of $\mathbf{U}$.} 
	%Time comparison of the PP-Exp. The time unit is the second. The number of polynomials is 10. The number of iterations for Crypten is 8.}
	\begin{tabular}{l|cccc}
		\toprule
		\diagbox [width=6em,trim=l] {Approach}{Size of U} & $1000^2$ & $3000^2$ & $5000^2$  & $10000^2$ \\
		\hline
		Plaintext & 0.008 & 0.014 & 0.024 & 0.064   \\
		Poly\_10 & 3.308 & 20.956 & 56.462 & 221.582  \\
		Crypten\_8 & 2.557 & 11.701 & 29.528 & 113.136  \\
		PP-Exp & $\mathbf{0.208}$ & $\mathbf{0.457}$ & $\mathbf{0.935}$ & $\mathbf{3.006}$\\
		\bottomrule
	\end{tabular}%\vspace{0cm}
	\label{tab:PP-Exp}
\end{table}

\textbf{Evaluation of PP-Exp:} 
We compare the performance of the proposed PP-Exp against that of (a) \emph{Plaintext}: the conventional exponential operation; (b) \emph{Poly}: a polynomial approximation based on Taylor expansions; and (c) \emph{Crypten}: an iterative approach based on the limit approximation of the exponential function. 
%
We evaluate the accuracy and efficiency of the PP-Exp algorithm separately. For accuracy, Poly with varying degrees of polynomials and Crypten with a varying number of iterations are tested.
%Specifically, when comparing polynomials, we used polynomials numbered 6, 8 and 10 as benchmarks to compare performance. For the Crypten comparison, we compared results with Crypten for 4, 6 and 8 iterations. The specific performance comparison results are shown in Fig.~\ref{Fig.PP-Exp}.
$Loss_{e_u}$ denotes the difference between the output of the tested algorithm and $e^u$ computed in plaintext. As shown in Fig.~\ref{Fig.PP-Exp}, the proposed PP-Exp achieves almost the same results as the plaintext computation and outperforms the other tested algorithms.
%especially when $u$ is small.
%
Next, we compare the efficiency of PP-Exp, Poly (with polynomial degree 10), and Crypten (with 8 iterations) on varying sizes of input variables (i.e., tested on $e^{\mathbf{U}}$ with varying sizes of $\mathbf{U}$).
%
The computational time of the tested algorithms is shown in Table~\ref{tab:PP-Exp}.
%running times on different dimensional inputs. The specific efficiency comparison results are shown in Table~\ref{tab:PP-Exp}. The results in Table~\ref{tab:PP-Exp} show that
As can be seen, PP-Exp incurs significantly less time than both the polynomial and iterative approaches. In particular, on the largest input ($10000^2$), PP-Exp is more than 70 times faster than Poly\_10 and about 38 times faster than Crypten\_8.
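
The speedups implied by Table~\ref{tab:PP-Exp} can be read off by dividing each baseline time by the PP-Exp time for the same input size (values rounded to one decimal place):
\begin{center}\footnotesize
\begin{tabular}{lcccc}
\toprule
Ratio & $1000^2$ & $3000^2$ & $5000^2$ & $10000^2$ \\
\midrule
Poly\_10 / PP-Exp & 15.9 & 45.9 & 60.4 & 73.7 \\
Crypten\_8 / PP-Exp & 12.3 & 25.6 & 31.6 & 37.6 \\
\bottomrule
\end{tabular}
\end{center}
The advantage of PP-Exp therefore grows with the input size for both baselines.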


\textbf{Evaluation of PP-MI:} The performance of PP-MI is tested on randomly generated covariance matrices and compared against that of (a) \emph{Plaintext-Cholesky}: matrix inversion via Cholesky decomposition; and (b) \emph{Plaintext-inv}: the \texttt{inv} function in the \texttt{torch.linalg} library. 
%
Specifically, we first randomly sample an input matrix $\mathbf{X} \in [-10, 10]^{n\times d}$ with $d = 2$ and then compute $\mathbf{K}+\sigma^2_n\mathbf{I}$ using \eqref{kernel} with $\sigma_s^2 = 1$, $\ell = 1$, and $\sigma_n^2 = 0.1$. Let $\mathbf{\Lambda}$ be the output of a matrix inversion algorithm. $Loss_\text{MI} \triangleq ||(\mathbf{K}+\sigma^2_n\mathbf{I})\mathbf{\Lambda} - \mathbf{I}||^2_2$
%the TODO distance between $(K_{XX}+\sigma^2_nI)\Lambda$ and the identity matrix $I$
is used as the inversion accuracy metric.
%
The $Loss_\text{MI}$ and wall-clock time of the tested algorithms averaged over 10 independent runs with varying $n$ are shown in Fig.~\ref{Fig.main2}. The error bars are computed in the form of standard deviation.
%
As can be seen, PP-MI incurs an acceptable level of accuracy loss (around $0.0001$ for $n = 400$) with acceptable computational cost.  %compared to the standard matrix inversion implementation. 
This loss comes from the approximation in the SS-based division and the fixed-point encoding steps, which cannot be avoided in most SS-based algorithms.
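To see what $Loss_{\text{MI}}$ measures, one can write (as an illustrative decomposition, with the error matrix $\mathbf{E}$ being our notation) $\mathbf{\Lambda} = (\mathbf{K}+\sigma^2_n\mathbf{I})^{-1} + \mathbf{E}$, which gives
\begin{equation*}
(\mathbf{K}+\sigma^2_n\mathbf{I})\mathbf{\Lambda} - \mathbf{I} = (\mathbf{K}+\sigma^2_n\mathbf{I})\mathbf{E}, \qquad Loss_{\text{MI}} = ||(\mathbf{K}+\sigma^2_n\mathbf{I})\mathbf{E}||^2_2.
\end{equation*}
Hence $Loss_{\text{MI}}$ is zero for an exact inverse and otherwise reflects the inversion error $\mathbf{E}$ scaled by the matrix being inverted.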


% \begin{figure} [ht]
% \vskip 0in
% 	\centering
% 	\includegraphics[width=3.2in]{fig/MI-d (1).pdf}
%  \\
%  \includegraphics[width=3.2in]{fig/MI-t (1).pdf} 
% 	\caption{Graphs of (a) Loss of different matrix inversion algorithms vs. dimension of matrix; (b) computational time of different matrix inversion algorithms vs. dimension of matrix.}
% 	\label{Fig.main2} 
%  \vskip -0.1in
% \end{figure}


\begin{figure}[t]
	\centering
	\begin{tabular}{cc}		\hspace{-3mm}\includegraphics[scale=0.100]{fig/PPMI-Loss.pdf} &
\hspace{-2mm}\includegraphics[scale=0.072]{fig/ppmi_time_multi.pdf}\\
\hspace{-3mm} (a) PP-MI Loss & \hspace{-3mm}(b) PP-MI Time\\
	\end{tabular} %\vspace{-1mm}
	\caption{Graphs of (a) Losses and (b) Computational time of different MI algorithms vs. dimension of matrix.} %\vspace{-1mm}
	\label{Fig.main2}
\end{figure}


\begin{table*} \footnotesize
\centering
\caption{Evaluation results of GPR and PP-GPR using RBF kernel with varying sizes of observations and test inputs.}
%The $std.$ represents standard deviation. }
\begin{tabular}{c|c|c|c|c|c|c}
\toprule
          & \multicolumn{2}{c|}{ Dataset Size} & $Loss_\mu$ & $Loss_{\sigma^2}$ & \multicolumn{2}{c}{Time (s)}  \\ 
 \cline{2-3}  \cline{6-7}  &&&&&\\[-0.8em]
          & $n$  & Test  &  $mean (std.)$                           & $mean (std.)$ & GPR & PP-GPR      \\ 
\hline &&&&&&\\[-0.8em]
          & 80        &  20             & 0.0005\%($\pm$7.5e-05)                & 0.0141\%($\pm$3.7e-03)         & 0.028 &7.068        \\
Traffic & 150     & 50      & 0.0027\%($\pm$1.0e-02)             & 0.0061\%($\pm$4.9e-04)         & 0.089    & 13.016       \\
          & 300      & 100      & 0.0057\%($\pm$3.3e-03)         & 0.0852\%($\pm$1.2e-02)        & 0.149   & 32.355      \\ 
\hline &&&&&&\\[-0.8em]
        & 80         & 20              &  0.0007\%($\pm$3.6e-04)             & 0.0095\%($\pm$2.2e-03)       & 0.025   & 7.024       \\
Diabetes & 150        & 50               & 0.0018\%($\pm$1.3e-03)              & 0.0059\%($\pm$1.2e-03)         & 0.104    & 13.901        \\
          & 300         & 142       & 0.0058\%($\pm$2.5e-03)                 & 0.0848\%($\pm$5.6e-03)        & 0.671    & 97.076        \\
\bottomrule
\end{tabular}\label{tab:real}
\end{table*}

\begin{table*} \footnotesize
\centering
\caption{Evaluation results of GPR and PP-GPR using Mat{\'e}rn kernel with varying sizes of observations and test inputs.}
%The $std.$ represents standard deviation. }
\begin{tabular}{c|c|c|c|c|c|c}
\toprule
          & \multicolumn{2}{c|}{ Dataset Size} & $Loss_\mu$ & $Loss_{\sigma^2}$ & \multicolumn{2}{c}{Time (s)}  \\ 
 \cline{2-3}  \cline{6-7}  &&&&&\\[-0.8em]
          & $n$  & Test  &  $mean (std.)$                           & $mean (std.)$ & GPR & PP-GPR      \\ 
\hline &&&&&&\\[-0.8em]
          & 80        &  20             &  
          0.2711\%($\pm$2.3e-02)             & 0.0221\%($\pm$5.5e-06)       & 0.032 & 9.732      \\
Traffic & 150     & 50      & 0.2665\%($\pm$6.1e-03)     & 0.0241\%($\pm$1.2e-06)         & 0.078       & 13.116  \\
          & 300      & 100      & 0.8288\%($\pm$1.5e-02)   & 0.0257\%($\pm$2.0e-06) &   0.153  & 35.33  \\ 
\hline &&&&&&\\[-0.8em]
        & 80         & 20              &   0.0548\%($\pm$ 1.0e-03)& 0.0193\%($\pm$4.0e-06) & 0.023  &  8.027        \\
Diabetes & 150        & 50  &              0.0424\%($\pm$6.2e-05) & 0.0236\%($\pm$ 7.6e-06)& 0.102   & 15.702      \\
          & 300         & 142       & 0.0545\%($\pm$4.9e-06)& 0.0343\%($\pm$ 3.0e-06)& 0.589   & 99.082 \\
\bottomrule
\end{tabular}\label{tab:matern}
\end{table*}


\subsection{Evaluation of PP-GPR}
This section empirically evaluates the performance of the proposed PP-GPR on two 
%This section describes our experiments to verify the accuracy and efficiency of privacy-preserving Gaussian process algorithms on two
real-world datasets: (a) \emph{Traffic} dataset \citep{chen2015gaussian} contains taxi demand information of 2506 regions in a city between 9:30 p.m. and 10 p.m. on August 2, 2010; and (b) \emph{Diabetes} dataset (under BSD License) \citep{efron2004least} contains diabetes progression of 442 diabetes patients with 10 input features.
We test the proposed PP-GPR with both the SE kernel \eqref{kernel} and the Mat{\'e}rn$_{3/2}$ kernel:
%
\begin{equation*}
k(\mathbf{x}, \mathbf{x}') \triangleq \sigma^2_s\left(1 + \sqrt{3d(\mathbf{x}, \mathbf{x}')}/\ell\right)\exp\left(-\sqrt{3d(\mathbf{x}, \mathbf{x}')}/\ell\right).
\end{equation*}
%
In the diabetes experiments, we use $\sigma^2_s = 0.8$, $\sigma^2_n = 0.1$, and $\ell = 0.23$ for the SE kernel and $\sigma^2_s = 0.1$, $\sigma^2_n = 0.1$, and $\ell = 1.0$ for the Mat{\'e}rn$_{3/2}$ kernel. Following the settings suggested for the traffic dataset, we set $\sigma^2_s = 0.1$, $\sigma^2_n = 0.1$, and $\ell = 1.0$ for all the traffic experiments. 
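As a quick numerical illustration of the Mat{\'e}rn$_{3/2}$ kernel (assuming $d(\mathbf{x}, \mathbf{x}')$ denotes the squared Euclidean distance, and using the setting $\sigma^2_s = 0.1$, $\ell = 1.0$), two inputs with $d(\mathbf{x}, \mathbf{x}') = 1$ give
\begin{equation*}
k(\mathbf{x}, \mathbf{x}') = 0.1\,(1 + \sqrt{3})\exp(-\sqrt{3}) \approx 0.1 \times 2.732 \times 0.177 \approx 0.048,
\end{equation*}
while $d(\mathbf{x}, \mathbf{x}') = 0$ recovers $k(\mathbf{x}, \mathbf{x}) = \sigma^2_s = 0.1$.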
%
%In both real-world scenarios, privacy-preserving technology is necessary to protect data privacy. In the traffic application, the trajectory and demand of taxis are usually commercial information that cannot be released directly. For medical scenarios, the personal information of diabetic patients is private data and needs to be strictly protected according to ethical and medical guidelines.

Let $\mathcal{X}_*$ be a set of test inputs, $\mu_{\mathbf{x}_*|\mathcal{D}}$ ($\sigma^2_{\mathbf{x}_*|\mathcal{D}}$) and $\Tilde{\mu}_{\mathbf{x}_*|\mathcal{D}}$ ($\Tilde{\sigma}^2_{\mathbf{x}_*|\mathcal{D}}$) be, respectively, the predictive mean (variance) of the GPR and PP-GPR. The relative difference between the predictive results of GPR and PP-GPR is used as the performance metric: $Loss_\mu \triangleq |\mathcal{X}_*|^{-1}\sum_{\mathbf{x}_* \in \mathcal{X}_*}(|\mu_{\mathbf{x}_*|\mathcal{D}} - \Tilde{\mu}_{\mathbf{x}_*|\mathcal{D}}|/\mu_{\mathbf{x}_*|\mathcal{D}})$. $Loss_{\sigma^2}$ is computed in a similar way. 
%
To test the performance of PP-GPR in different data scales, we randomly sample observations and test data from each dataset with varying $n$ and $|\mathcal{X}_*|$.
%
%\subsubsection{PP-GPR vs. Conventional GPR}
% For the SE kernel, we use $\sigma^2_s = 0.8 $, $\sigma^2_n = 0.1 $, and $\ell = 0.23$ for the diabetes dataset and, as the traffic dataset suggested, use $\sigma^2_s = 0.7969$, $\sigma^2_n = 0.1$, and $\ell = 0.6276$ for traffic experiments.
% For the Mat{\'e}rn$_{3/2}$ kernel, we set $\sigma^2_s = 0.1 $, $\sigma^2_n = 0.1 $, and $\ell = 1.0$ for the diabetes dataset and, as the traffic dataset suggested, use $\sigma^2_s = 0.7969$, $\sigma^2_n = 0.1$, and $\ell = 0.6276$ for traffic experiments.
The loss of the predictive results and the wall-clock execution time (including both computation and communication time) are shown in Table~\ref{tab:real} and Table~\ref{tab:matern}. All the results are averaged over 5 random runs.
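For concreteness, reading ``in a similar way'' as replacing the predictive means in $Loss_\mu$ with the corresponding predictive variances (our interpretation), the variance loss would be
\begin{equation*}
Loss_{\sigma^2} \triangleq |\mathcal{X}_*|^{-1}\sum_{\mathbf{x}_* \in \mathcal{X}_*}\frac{|\sigma^2_{\mathbf{x}_*|\mathcal{D}} - \Tilde{\sigma}^2_{\mathbf{x}_*|\mathcal{D}}|}{\sigma^2_{\mathbf{x}_*|\mathcal{D}}}.
\end{equation*}
For example, a single test input with $\sigma^2_{\mathbf{x}_*|\mathcal{D}} = 0.5$ and $\Tilde{\sigma}^2_{\mathbf{x}_*|\mathcal{D}} = 0.5001$ (hypothetical values) gives a loss of $0.0001/0.5 = 0.02\%$.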

It can be observed that the PP-GPR achieves a similar predictive mean and variance compared to conventional GPR. The losses are due to the approximation of some SS-based operations (e.g., division) and the fixed-point encoding step.
The computational errors of the Mat{\'e}rn kernel are slightly higher than those of the SE kernel but still remain at a low level. Further analysis revealed that the higher computational error of the Mat{\'e}rn kernel is due to the inclusion of the expression $\sqrt{d(\mathbf{x}, \mathbf{x}^\prime)}$. The square root is a non-linear operation that must be approximated using the Newton iterative approach in Crypten~\citep{knott2021crypten}, which results in the higher computational error. We plan to investigate this issue further in future work.
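As an illustration of why an approximated square root introduces additional error, a generic Newton iteration for $y = \sqrt{a}$ with $a > 0$ (a textbook form; the exact variant and number of iterations used by Crypten are not specified here) applies Newton's method to $f(y) = y^2 - a$:
\begin{equation*}
y_{t+1} = y_t - \frac{y_t^2 - a}{2y_t} = \frac{1}{2}\left(y_t + \frac{a}{y_t}\right).
\end{equation*}
Only finitely many iterations are run, so the result depends on the initial guess $y_0$ and carries a truncation error, which under SS is compounded by the approximated division and the fixed-point encoding.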

%
Furthermore, although PP-GPR takes longer than GPR, especially when $n$ is large, it finishes model construction and prediction in a reasonable time ($< 2$ mins) for a dataset with several hundred observations.

%To demonstrate the generality of our algorithm, we conducted additional experiments on the Matern kernel, as shown in Table~\ref{tab:matern}. We defined the distance function between two vectors $\mathbf{x}$ and $\mathbf{x}^\prime$ as $d(\mathbf{x}, \mathbf{x}^\prime) = ||\mathbf{x} - \mathbf{x}^\prime||^2_2$, and used the following Matern kernel:

In Appendix D, we perform an additional empirical comparison between our algorithm and DP-based GPR under the scenario that only the model outputs are sensitive.
%Due to space limit, the results are in Appendix D.
We believe that this comparison is fair, given that both methods can theoretically preserve privacy.
%
The other privacy-preserving GPR approaches (e.g., FHE-based and FL-based GPR) are not compared since, even when operating within the same scenario (i.e., HDS, VDS, or PDS), they may have fundamentally different security assumptions from those of PP-GPR, which ultimately makes them incomparable. See Section~\ref{sec:related} for detailed discussions.
 
%An essential reason why our proposed privacy-preserving GPR algorithm didn't compare with other privacy-preserving approaches mentioned above is as follows:
%Based on the security analysis above, we decided to perform an additional empirical comparison between our algorithm and DP-based GPR under the scenario that only the model outputs are sensitive. See Appendix D for the results. We believe that this comparison will be fair, given that both methods can theoretically preserve privacy.


\section{Related work}\label{sec:related}

% Decentralized/Parallel GP
% Federated GP
% Differentially private (federated) GP
% Secure MPC operators discussion

To the best of our knowledge, there is no existing PP-GPR work that is designed based on SMPC techniques. As has been mentioned in Section~\ref{sec:intro}, although some other privacy enhancement techniques have been applied to GPR, none of them is practical enough to protect the privacy of both the inputs and outputs of GPR for all the three data-sharing scenarios (i.e., HDS, VDS, and PDS).
To be specific, \citet{fenner2020privacy} consider only the PDS scenario and protect the input features of the test data using a \emph{fully homomorphic encryption} (FHE) algorithm. Since performing computation on homomorphically encrypted data incurs high computational costs, they carry out the PP-GPR prediction through interactive calculations between the user and the model constructor. Such an interactive method, however, cannot be generalized to the FHE-based PP-GPR model construction step since the covariance matrix inversion operation is not considered.

Another technique that is widely used to achieve PP-ML models is \emph{differential privacy} (DP). \citet{smith2018differentially} proposed the first DP-GPR algorithm, which can only protect the privacy of the model outputs~$\mathbf{y}$. \citet{kharkovskii2020private} proposed a DP method to protect the input features of the GPR model via random projection. However, this method requires all the observations used for GPR model construction to belong to a single curator and thus cannot be applied to either the HDS or the VDS scenario. In addition, DP-based methods may inject large noise into the original model when the privacy budget $\epsilon$ is small, which may significantly reduce the model performance \citep{dwork2014algorithmic}.
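To illustrate the last point with a standard example (the Laplace mechanism, which is not necessarily the mechanism used by the DP-based GPR works above), a query $f$ with $\ell_1$-sensitivity $\Delta f$ is released with $\epsilon$-DP as
\begin{equation*}
\tilde{f}(\mathcal{D}) = f(\mathcal{D}) + \eta, \qquad \eta \sim \mathrm{Lap}(\Delta f/\epsilon), \qquad \mathrm{Var}[\eta] = 2(\Delta f)^2/\epsilon^2,
\end{equation*}
so halving $\epsilon$ quadruples the noise variance.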

Some other works~\citep{dai2020federated,kontoudis2022fully,yue2021federated} consider protecting the privacy of the GPR observations via \emph{federated learning} (FL) or combine FL with DP to further protect the privacy of the model parameters~\citep{dai2021differentially}. To convert the GPR model construction into a distributed/federated manner, these works have to exploit some sparse approximations (e.g., random features) of the conventional GPR, which may reduce the model performance. Moreover, FL-based GPR works can only be applied to the HDS scenario.   

Recently, \citet{kelkar2021secure} developed a privacy-preserving exponentiation algorithm based on secret sharing techniques in a two-server setting. The communication overhead of this algorithm in the online phase is comparable to that of PP-Exp (i.e., one round of communication and transmission of two elements). However, the algorithm requires an expensive cryptographic primitive (i.e., homomorphic encryption) in the preprocessing phase to generate the random numbers needed in the online phase, resulting in excessive overhead. In addition, the algorithm has a nonzero probability of error due to its use of a secure ring change procedure. In the setting of this paper (i.e., $l=64, l_f =26$), the probability that the error occurs is $\frac{1}{4}$, which is unacceptable.

%An essential reason why our proposed privacy-preserving GPR algorithm didn't compare with other privacy-preserving approaches mentioned above is as follows:
Note that even when operating within the same scenario (i.e., HDS, VDS, or PDS), different privacy-preserving approaches may have fundamentally different security assumptions~\citep{yin2021comprehensive, zhang2022no}. %which may ultimately make them incomparable.
Specifically, the privacy of the FHE-based GPR algorithm~\citep{fenner2020privacy} may be at risk even in the PDS scenario due to the decryption steps designed for reducing the high computational cost of the exponential operation in FHE. In contrast, our work avoids such decryption steps by addressing the challenges posed by the SS-based exponentiation operation and, consequently, protects privacy across the entire PDS process.
The FL-based GPR~\citep{dai2020federated,kontoudis2022fully,yue2021federated} has no theoretical analysis of its privacy-preserving capabilities. This is because the intermediate results (e.g., local model parameters or gradients) generated during the algorithm need to be exchanged between the server and clients. Numerous studies~\citep{zhu2019deep, zhao2020idlg} have demonstrated that these intermediate results pose a potential risk of revealing private data.

%The DP-based GPR algorithm~\citep{smith2018differentially, kharkovskii2020private} provides protection against data leakage by selecting noise levels that satisfy a privacy budget. A certain amount of privacy is achieved by sacrificing the accuracy of the model. However, as has been mentioned above, unlike our proposed algorithm which achieves complete protection for both model inputs and outputs, DP-GPR can only protect the privacy of the model outputs $\mathbf{y}$. 

%Based on the security analysis above, we decided to perform an additional empirical comparison between our algorithm and DP-based GPR under the scenario that only the model outputs are sensitive. See Appendix D for the results. We believe that this comparison will be fair, given that both methods can theoretically preserve privacy. 


%Specifically, the FHE-GPR \cite{fenner2020privacy} and FL-GPR \cite{dai2020federated,dai2021differentially,kontoudis2022fully,yue2021federated} approaches only focus on PDS and HDS scheme, respectively. 
%The DP-GPR methods \cite{kharkovskii2020private,smith2018differentially} assume all the data belong to a single party and can only protect the privacy of either the input features \cite{kharkovskii2020private} or the outputs \cite{smith2018differentially}.

%At present, researchers try to use privacy enhancement techniques such as Differential-privacy(DP)\cite{DP} and Homomorphic Encryption(HE)\cite{HE} to solve the problem of privacy leakage in Gaussian processes. Smith et al. \cite{DP-GP} uses differential privacy technology to achieve the protection of Gaussian process data labels, but does not consider the privacy leakage problem in data features. In \cite{HE-GP}, Fenner et al. considers the scenario where the training data is held by a model constructor. The protection of predicted data features in GP is achieved by using fully homomorphic encryption algorithm\cite{BGV}. At the same time, the computational efficiency and performance of the algorithm are improved through the interactive calculation between the model user and the model constructor.

\section{Conclusion}\label{sec:conclusion}

This paper describes the first SS-based privacy-preserving GPR model which can be applied to both horizontal and vertical data-sharing scenarios. We provide a detailed workflow for implementing both the model construction and prediction steps of PP-GPR. Two additive SS-based operations (i.e., PP-Exp and PP-MI) are proposed such that they can be combined with existing SS-based operations for constructing a secure and efficient GPR model. We analyze the security and computational complexity of the proposed operations in theory.
%The accuracy loss in predictive results of the proposed PP-GPR is shown to be small compared to conventional GPR.
%\textbf{Limitations and broader impact.}
Although PP-GPR incurs more computational time due to additional communications between the two computing servers and some additional computing steps, it can perform GPR in an acceptable time with a security guarantee, which is a superior alternative to existing FL and DP-based privacy-preserving GPR approaches when the scale of observations is not large.
%
%In our future work, we will consider optimizing the hyperparameters in a privacy-preserving manner which requires an efficient SS-based protocol for maximizing the log likelihood.
%Moreover, how to further reduce the computational complexity and improve the accuracy of PP-MI is also worth to be explored.   

\section{Acknowledgements}\label{sec:Acknowledgements}
%We’d like to thank all the anonymous reviewers for their careful readings and valuable comments.
This research is partially supported by the National Key Research and Development Program of China (No. 2022ZD0115301), the National Natural Science
Foundation of China (No. 62206139), and a key program of fundamental research from Shenzhen Science and Technology Innovation Commission (No. JCYJ20200109113403826).
%and the Major Key Project of PCL (No. PCL2021A06).

%\newpage
% References
\bibliography{luo_732}
\end{document}
