% \documentclass{uai2023} % for initial submission
\documentclass[accepted]{uai2023} % after acceptance, for a revised
% version; also before submission to
% see how the non-anonymous paper
% would look like
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2023} % ptmx math instead of Computer
% Modern (has noticable issues)
% \documentclass[mathfont=newtx]{uai2023} % newtx fonts (improves upon
% ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example


%%%%%%%%%% Added packages (start) %%%%%%%%%%
\DeclareMathOperator*{\argmax}{arg\,max}
\DeclareMathOperator*{\argmin}{arg\,min}
\usepackage[normalem]{ulem}
\usepackage{algorithm}
\usepackage{algorithmic}
\newcommand{\Comment}[2][.5\linewidth]{\leavevmode\hfill\makebox[#1][l]{//~#2}}

\makeatletter
\DeclareRobustCommand\onedot{\futurelet\@let@token\@onedot}
\def\@onedot{\ifx\@let@token.\else.\null\fi\xspace}
\def\eg{\emph{e.g}\onedot} \def\Eg{\emph{E.g}\onedot}
\def\ie{\emph{i.e}\onedot} \def\Ie{\emph{I.e}\onedot}
\def\etc{\emph{etc}\onedot} \def\vs{\emph{vs}\onedot}
\def\wrt{w.r.t\onedot} \def\dof{d.o.f\onedot}
\def\etal{\emph{et al}\onedot}
\makeatother
\setcitestyle{citesep={;}}

\usepackage{xspace}
\usepackage{dsfont}
\usepackage{multirow}
\usepackage{array}
\newcolumntype{P}[1]{>{\centering\arraybackslash}p{#1}}
\newcolumntype{L}[1]{>{\raggedleft\arraybackslash}p{#1}}


%%%%%%%%%% Added packages (end) %%%%%%%%%%

\title{Noisy Adversarial Representation Learning for Effective and Efficient Image Obfuscation}

% The standard author block has changed for UAI 2023 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is authomatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author{Jonghu Jeong$^*$}
\author{Minyong Cho$^*$}
\author{Philipp Benz}
\author{Tae-hoon Kim}

\affil{%
    Deeping Source Inc.\\
    Seoul\\
    Republic of Korea
}

\begin{document}
\maketitle
\def\thefootnote{*}\footnotetext{Equal Contribution. Our code is available on the GitHub repository: https://github.com/DeepingSource/noisy-arl}

\begin{abstract}
Recent real-world applications of deep learning have led to the development of machine learning as a service (MLaaS).
However, the scenario of \textit{client-server inference} presents privacy concerns, where the server processes raw data sent from the user's client device. One solution to this issue is to provide an \textit{obfuscator} function to the client device using Adversarial Representation Learning (ARL). Prior works have primarily focused on the privacy-utility trade-off while overlooking the computational cost and memory burden on the client side. In this paper, we propose an effective and efficient ARL method that incorporates feature noise into the ARL pipeline. 
We evaluated our approach on various datasets, comparing it with state-of-the-art ARL techniques. Our experimental results indicate that our method achieves better accuracy, lower computation and memory overheads, and improved resistance to information leakage and reconstruction attacks.
\end{abstract}

\section{Introduction}
In recent years, machine learning as a service (MLaaS) has gained popularity mainly due to cloud computing and deep learning advances. Often raw data generated on an edge device is sent to the cloud, where machine learning algorithms process it. However, transferring raw data has the drawback of directly leaking privacy-related information to the cloud server, which might violate user privacy. For example, we can consider an edge device transmitting images to the cloud to perform person identification. While a person's picture can be used for identification, the image can further reveal the person's gender, emotional state, race, or location.
Ideally, before transmission to the cloud server, privacy-related information should be removed from the images while preserving task utility. Additionally, such a private data representation should be secure against attacks from adversarial actors who attempt to breach a user's privacy by retrieving private attributes from the data representation. It is important to note that the service provider might also be considered a possible adversarial actor. Hence, it is in the clients' interest to remove utility-unrelated information since the representation transmitted from the client is out of their control.
We refer to this scenario as \textit{client-server inference}:
(1) on the client side, privacy-related information is removed from the data, and (2) after the data is transmitted, the server side performs the remaining inference computation without violating the user's privacy.

Many works have focused on the training framework of ARL~\citep{roy2019mitigating,bertran2019adversarially,li2021deepobfuscator,edwards2015censoring,raval2017protecting,huang2017context,wu2018towards,pittaluga2019learning,xiao2020adversarial,ng2022ninjadesc,osia2018deep,mireshghallah2020shredder,mireshghallah2021not} to mitigate the leakage of sensitive attributes in the context of client-server inference.
Commonly, the ARL framework consists of three entities: (1) an obfuscator, which transforms input data into a representation that retains task utility while removing the correlation between image features and private attributes; (2) a task model, which performs the utility task on the new data representation; and (3) a proxy adversary, which attempts to extract sensitive attributes from the representation.
In the above scenario of MLaaS, the service providers train an obfuscator, a task model, and a proxy adversary. Then, they deploy the obfuscator to the user's client device. For the sake of the users' privacy, the obfuscator should effectively remove all information unrelated to the utility task while using as few client resources as possible and retaining high utility with the obfuscated representation.

Another critical aspect of the obfuscator is its client-side computational cost: the resource burden on the client should be as small as possible. To this end, we need a training scheme that yields a lightweight obfuscator.
Hence, in this work, we introduce a novel ARL approach that makes the obfuscator (1) robust against attacks by adversarial actors while retaining task utility (\textit{privacy-utility trade-off}) and (2) economical in its use of computational resources on the edge device (\textit{efficiency-performance trade-off}).
To solve these problems simultaneously, we extend the standard ARL training scheme and propose \textit{noisy adversarial training} and \textit{noisy inference}. The proposed method utilizes off-the-shelf convolutional neural network (CNN) models for mobile-friendly and privacy-preserving machine learning inference.
We demonstrate that our method (1) outperforms state-of-the-art ARL methods in terms of the privacy-utility trade-off, (2) improves the efficiency-performance trade-off, (3) is readily applicable to commonly used CNN architectures, and (4) is robust to privacy leakage and reconstruction attacks.



%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Related Works and Background}
\label{sec:related}
\paragraph{Data Privacy in Machine Learning}
Privacy attacks or preservation in machine learning is a vibrant research area with various approaches.
The most well-known threats to data privacy are membership inference attacks~\citep{shokri2017membership}, 
inversion attacks~\citep{fredrikson2014privacy},
and information leakage attacks~\citep{roy2019mitigating}.
The membership inference attack is not applicable in the context of client-server inference since its purpose is to determine whether a single data point was used for model training.
In the scenario of the client-server inference, attackers attempt to breach the transmitted representations. With the breached data, they can train their adversary models (a) to reconstruct the original data (\textit{inversion attack}) or (b) to retrieve private information that users might not want to reveal (\textit{information leakage attack}).

Recently, various methods have been proposed for privacy-preserving machine learning.
Federated learning~\citep{konevcny2016federated} and split 
learning~\citep{gupta2018distributed,vepakomma2018split} are methods to train a new machine learning model without the direct input of raw user data.
However, these methods focus on protecting privacy during the training phase, not at the inference stage.

In various contexts of deep learning, membership inference attacks have been defended with differential privacy~\citep{dwork2008differential} by adding noise to the representations~\citep{abadi2016deep,arachchige2019local,fan2019differential,fan2019practical,croft2021obfuscation,chen2021perceptual}.
However, differential privacy is designed to make two neighboring datasets or data points statistically indistinguishable, not to transform raw data into a new representation that is privacy-safe and usable for the intended tasks~\citep{zhao2020trade}.
Training deep neural networks (DNN) with cryptographic methods has also been explored, such as secure multiparty computation~\citep{cramer2015secure} and homomorphic encryption~\citep{nandakumar2019towards}. 
However, these are still difficult to deploy in practice due to the computational complexity of the involved operations.

\paragraph{Adversarial Representation Learning}
In the context of the client-server inference, the most suitable solution is to find a function that transforms data into a new representation that can be utilized for machine learning while being robust to privacy leakage attacks.
For this purpose, adversarial representation learning (ARL) tries to find a representation function by optimizing an information-theoretic formulation of privacy and utility~\citep{hsu2021survey}.
In the information-theoretic formulation, we represent the original data, the transformed representation, and the sensitive information from the original data as three random variables $X$, $Z$, and $Y$, respectively.
To protect privacy, the mutual information between the sensitive information and the transformed representation, $I(Y; Z)$, should be as small as possible. Meanwhile, to preserve utility, the mutual information between the original data and the transformed representation, $I(X; Z)$, should be as large as possible.
Thus, the objective of ARL is to find a probability distribution $P_{Z|X}$ that minimizes $ I(Y; Z)$ while retaining the utility of the representation to a certain degree:
\begin{equation}\label{eq:ARL}
    \argmin_{P_{Z|X}} I(Y; Z) \quad \text{s.t.}\; I(X; Z) \ge u,
\end{equation}
where $u$ is the desired utility level.
ARL approaches commonly set $P_{Z|X}$ as a deterministic function $O: X \rightarrow Z$, where $O$ stands for an \textit{obfuscator}. As the objective suggests, a trade-off between privacy and utility is inevitable since ARL optimizes two conflicting objectives. \citet{zhao2020trade} and \citet{wu2018towards} showed that the lower bound of this trade-off can be defined formally and confirmed it with experiments on various tasks.

Various prior works have tried to find $O$ in the image domain with deep neural networks~\citep{singh2021disco,roy2019mitigating,edwards2015censoring,raval2017protecting,pittaluga2019learning,xiao2020adversarial,mireshghallah2020shredder,osia2018deep}. They solved Equation~\ref{eq:ARL} by setting up two proxy models, $T: Z \rightarrow Y_t$ and $A: Z \rightarrow Y_p$, where $T$ stands for the \textit{utility task model} and $A$ stands for the \textit{proxy adversary model}. $Y_t$ stands for the information that needs to be inferred for utility, and $Y_p$ is the one that an adversary could leak.
It has been demonstrated that the theoretical bounds for privacy and utility can be empirically reproduced by optimizing three models, $O, T,$ and $A$, simultaneously~\citep{bertran2019adversarially,hsu2020obfuscation,xiao2020adversarial,osia2018deep}.
The optimization is mainly done with stochastic data-driven training of deep neural networks, (a) by cooperatively training $O$ and $T$ to retain utilizable information in $Z$, (b) while training $O$ and $A$ to adversarially learn to remove private information from $Z$.
After the models are fully trained,
$O$ and $T$ are deployed to the client device and the service provider's server, respectively.
Finally, $O$ transforms raw data $X$ to the privacy-safe representation $Z$, and $T$ processes the received $Z$ to infer $Y_t$ for the utility task.

Previous ARL methods differ in various aspects, such as loss function design, model architecture, and training scheme. For example, MaxEnt~\citep{roy2019mitigating} optimizes the entropy-based loss to make the ARL objective task-agnostic. DISCO~\citep{singh2021disco} selectively removes features via pruning filters in the latent space.
DeepObfuscator~\citep{li2021deepobfuscator} introduces a training scheme,
which incorporates an additional adversary model that reconstructs $X$ from $Z$.
Similar to our approach, Shredder~\citep{mireshghallah2020shredder} and DPFE~\citep{osia2018deep} utilize noisy representations. Their information-theoretic analysis has shown that privacy can be guaranteed by adding noise to encoded representations. Shredder learns a set of noise distributions with regard to a fixed neural net encoder, while our method trains an encoder to adapt to a fixed noise distribution. DPFE uses
an auto-encoder
during training and noise addition during testing to provide privacy, whereas our method uses only noise addition in both the training and testing phases.
While some methods try to reduce the client-side computation burden with ARL~\citep{mireshghallah2020shredder,osia2018deep}, we propose that privacy can be achieved by choosing a split point of a model and simply training it with a noisy adversarial representation learning scheme.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Problem Formulation}
\label{sec:formul}
As mentioned in Section~\ref{sec:related}, we define three models for our ARL task: an obfuscator $O: X \rightarrow Z$, a task model $T: Z \rightarrow Y_t$, and an adversary model $A: Z \rightarrow Y_p$. All three models are implemented as CNNs.
We consider $X$ to be in the RGB image domain, \ie~$x \sim X \subset \mathds{R}^{H \times W \times 3}$, where $H$ and $W$ denote the image height and width.
We define an obfuscator $O$, which aims to convert each data point $x$ into the obfuscated representation $z \in Z$.
We also set the utility task model $T$ and the adversarial attacker $A$, which are to infer utility attributes $y_t$ and private attributes $y_p$ from the obfuscated representation $z$, respectively, such that $T(z) = \hat{y}_t \simeq y_t$ and $A(z) = \hat{y}_p \simeq y_p$.
From the attacker's perspective, $\hat{y}_p$ should be similar to the private attributes $y_p$.
In terms of the user and the service provider, however, $\hat{y}_p$ should be as dissimilar as possible from $y_p$ while $\hat{y}_t$ should be similar to the utility attributes $y_t$.
Previous works~\citep{bertran2019adversarially, singh2021disco} have shown that the ARL training scheme effectively achieves the mutual information objective (Eq.~\ref{eq:ARL}).

As an example of a practical attack scenario, we assume an attacker who is in control of an edge device such as a CCTV camera or an IoT device.
The device holds an obfuscator model and transforms the raw data before data transmission. It allows attackers to generate their own datasets to train an adversary model, \eg~original input and obfuscated representation pairs. Further, we assume that the attackers are also aware of the original training dataset and the architecture of the service provider's models. Note that this constitutes a strong threat model, which makes it difficult to protect privacy for the service provider.
We show that our method protects privacy even under severe conditions.

We identify two possible attack scenarios: the \textit{information leakage attack} and the \textit{reconstruction attack}.
For the \textit{information leakage attack}, an attacker can attempt to train a model $A_{leak}: Z \rightarrow Y_p$ that directly leaks the representations' private information, \ie~$A_{leak}(z) = \hat{y}_p$.
In the \textit{reconstruction attack}, the attacker attempts to obtain a model $A_{recon}: Z \rightarrow X$ that retrieves the original image from the intermediate representation, \ie~$A_{recon}(z) = \hat{x}$; the private attributes can then be inferred from the reconstruction $\hat{x}$.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Methodology}\label{sec:method}
\begin{figure}[t]
\centering
    {\includegraphics[width=4.6cm,bb=0 0 802 538]{figures/training.png} }
    \qquad
    {\includegraphics[width=5.2cm,bb=0 0 913 559]{figures/inference.png} }
\caption{(Top) The training scheme of our method. (Bottom) Inference scenario with possible adversary attack.
}
\label{fig:main_figure}
\end{figure}

\subsection{Noisy Adversarial Representation Learning}
As shown in Figure~\ref{fig:main_figure}, we split an off-the-shelf CNN $M(x) = (M_2 \circ M_1)(x)$, such as ResNet~\citep{he2016deep}, and use the earlier layers $M_1$ as a client-side encoder and the remaining layers $M_2$ as the server-side task model $T$. We discuss the influence of the splitting point on the efficiency-performance trade-off in Section~\ref{section:effi-perf-trade-off-split}.
Noise $\eta \in \mathds{R}^{H_z \times W_z \times C_z}$, sampled from a Gaussian distribution $\mathcal{N}(0,\sigma^{2})$, is then added to the encoded feature $M_1(x)$, where $H_z$, $W_z$, and $C_z$ denote the height, width, and number of channels of $z$, respectively.
The obfuscated feature is $z = O(x) = M_1(x) + \eta$, which will be transmitted from the client-side to the server.
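Written element-wise, and assuming that the noise entries are drawn independently for every position (an assumption made here for illustration; the description above only states that $\eta$ is Gaussian), the obfuscation can be sketched as
\begin{equation*}
    z_{i,j,k} = [M_1(x)]_{i,j,k} + \eta_{i,j,k}, \qquad \eta_{i,j,k} \sim \mathcal{N}(0,\sigma^{2}),
\end{equation*}
for $1 \le i \le H_z$, $1 \le j \le W_z$, and $1 \le k \le C_z$. Under this assumption, and conditioned on the input $x$, each entry of $z$ is Gaussian with mean $[M_1(x)]_{i,j,k}$ and variance $\sigma^{2}$, so a larger $\sigma$ masks more of the encoded feature.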


\begin{algorithm}[tb]
\caption{Noisy ARL Training Algorithm}
\label{alg:noisy_arl}
\textbf{Input}: Dataset $\mathcal{D}: \mathcal{X} \times \mathcal{Y}_t \times \mathcal{Y}_p$, model initial parameters $\theta_{O=M_1}$, $\theta_{T=M_2}$, $\theta_A$, loss functions $\mathcal{L}_t$, $\mathcal{L}_{leak}$, noise standard deviation $\sigma$, loss balance parameter $\lambda$, mini-batch size $m$, number of iterations $I$\\
\textbf{Output}: Parameters $\hat\theta_O, \hat\theta_T$

\begin{algorithmic}[1] %[1] enables line numbers
\FOR{iteration $=1, \dots, I$}
\STATE $B \sim \mathcal{D}$: $|B| = m$ \Comment{Sample mini-batch}
\STATE $\eta \sim \mathcal{N}(0,\sigma^{2})$ \Comment{Sample noise}
\STATE $z \leftarrow M_1(x) + \eta$
\STATE $g_T \leftarrow \nabla_{\theta_T} \mathds{E}_{(x, y_t) \sim B} [ \mathcal{L}_t(T(z), y_t)]$\\
\Comment{Calculate gradient}
\STATE $g_A \leftarrow \nabla_{\theta_A} \mathds{E}_{(x, y_p) \sim B} [ \mathcal{L}_{leak}(A(z), y_p)]$\\
\Comment{Calculate gradient}
\STATE $g_O \leftarrow \nabla_{\theta_O} \mathds{E}_{(x, y_t ,y_p) \sim B} [ \mathcal{L}_t(T(z), y_t)$\\
$- \lambda * \mathcal{L}_{leak}(A(z), y_p)]$\\
\Comment{Calculate gradient}
\STATE {$\theta_O$, $\theta_T$, $\theta_A$} $\leftarrow$ \\ Optim($\theta_O$,$g_O$), Optim($\theta_T$,$g_T$), Optim($\theta_A$,$g_A$)\\ \Comment{Update parameters}
\ENDFOR
% \RETURN{$\hat\theta_O, \hat\theta_T, \hat\theta_A$}
\RETURN{$\hat\theta_O \leftarrow \theta_O$, $\hat\theta_T \leftarrow \theta_T$}
\end{algorithmic}
\end{algorithm}


During training (Figure~\ref{fig:main_figure} Top), we train the obfuscator model jointly with the task model and the proxy adversary.
We combine our noise module with the standard ARL training scheme. 
The obfuscator $O$, the task model $T$, and the proxy adversary network $A$ are parameterized through their respective weights $\theta_O$, $\theta_T$, and $\theta_A$. Since the objective of the task model is to perform well on the utility task, its loss is defined as $l_t =\mathds{E}_{x\sim \mathcal{X}}[\mathcal{L}_t(T(z), y_t)]$, where $\mathcal{L}_t$ indicates the task loss function. In case the information leakage attack is chosen as the proxy adversary, the adversary loss can be calculated with $l_{leak}=\mathds{E}_{x\sim \mathcal{X}}[\mathcal{L}_{leak}(A_{leak}(z), y_p)]$, where $\mathcal{L}_{leak}$ indicates the adversary task loss function.
We use simple cross-entropy for $\mathcal{L}_t$ and $\mathcal{L}_{leak}$.
Some ARL methods~\citep{singh2021disco, li2021deepobfuscator} use a proxy reconstruction loss, \ie, $l_{recon}=\mathds{E}_{x \sim \mathcal{X}}[\mathcal{L}_{recon}(A_{recon}(z), x)]$, in addition to $l_{leak}$ to defend against the reconstruction attack. In Section~\ref{section:recon_attack}, we show that our method is robust to the reconstruction attack without this additional loss.
Finally, the obfuscator model should incorporate two objectives: retaining useful task information while discarding privacy-related information.
Hence, the obfuscation loss $l_o = l_t - \lambda * l_{leak}$ is minimized by the obfuscator, where $\lambda$ is a hyper-parameter that balances the two loss terms. To sum up, the final objective is:  $\min_{\theta_O,\theta_T}\max_{\theta_A} l_o$.
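As a sketch that only restates the definitions of $l_t$, $l_{leak}$, and $z$ given above (with $A$ denoting the proxy leakage adversary), the objective can be expanded as
\begin{equation*}
    \min_{\theta_O,\theta_T}\,\max_{\theta_A}\; \mathds{E}_{x \sim \mathcal{X}}\big[\mathcal{L}_t(T(z), y_t) - \lambda\, \mathcal{L}_{leak}(A(z), y_p)\big], \qquad z = M_1(x) + \eta .
\end{equation*}
Since $l_t$ does not depend on $\theta_A$, the inner maximization over $\theta_A$ is, for $\lambda > 0$, equivalent to the adversary minimizing $l_{leak}$. Likewise, $l_{leak}$ does not depend on $\theta_T$, so the task model simply minimizes $l_t$. Only the obfuscator parameters $\theta_O$ are affected by both terms.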

Algorithm~\ref{alg:noisy_arl} describes the optimization process to solve the min-max objective. For each mini-batch, sampled noise is added to the intermediate feature produced by the encoder $M_1$. Then, the noise-added feature is fed into $T$ and $A$ to calculate the utility task loss and the proxy adversary loss. From these two losses, the gradients $g_T$ and $g_A$ are computed, while the gradient $g_O$ is computed from their weighted combination $l_t - \lambda * l_{leak}$. Finally, the models are updated with the gradients via an off-the-shelf optimizer. Note that $A$ is discarded after training since its only purpose is to simulate possible attacks and to provide supervision to $O$ and $T$.

During the inference phase (Figure~\ref{fig:main_figure} Bottom), the original images are first fed into the obfuscation module on the client side, which consists of the encoder $M_1$ and the noise module.
The obfuscated feature is then sent to the server, where the final task predictions are performed with the task network. 

\subsection{Evaluation Protocol}
To properly evaluate the performance of our approach, we outline our evaluation protocol in the following. 
\\
\textbf{Performance Bounds} Theoretically, the utility accuracy and the privacy adversary accuracy are upper-bounded by $100\%$ for a globally optimal model. However, this bound is not reached in practice. Hence, we provide a ``practical'' upper bound for utility and privacy through the performance of models trained on the original images for each task, respectively.
\\
\textbf{Information Leakage Attack}
It is common to measure the effectiveness of information leakage attacks with the \textit{privacy-utility trade-off}~\citep{singh2021disco}. The trade-off is measured as the difference ($\Delta$) between the accuracy of the utility task and the accuracy of the leakage attack. From the perspective of privacy protection, a higher utility task accuracy and a lower leakage attack accuracy are better; consequently, a higher $\Delta$ is better.
Specifically, given a fully trained fixed obfuscator, we calculate the accuracy with a separately trained utility task model and an adversary model.
The separate utility task model is trained to correctly predict target attributes, while the independent privacy leakage attack model is trained to infer private labels. Both models receive obfuscated features as input.
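In symbols, with $\mathrm{Acc}_t$ and $\mathrm{Acc}_p$ denoting the accuracies (in percent) of the separately trained utility task model and leakage attack model (notation introduced here for illustration only), the trade-off reads
\begin{equation*}
    \Delta = \mathrm{Acc}_t - \mathrm{Acc}_p .
\end{equation*}
For instance, the utility accuracy of $89.08\%$ and the leakage accuracy of $19.47\%$ reported for our RN18$_3$ split on FairFace in Table~\ref{tab:privacy-utility} give $\Delta = 89.08 - 19.47 = 69.61$.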
\\
\textbf{Reconstruction Attack} 
Additionally, we consider the reconstruction (inversion) attack.
We perform reconstruction attacks by training CNNs to recover original images from the corresponding obfuscated features.
For example, for face images, reconstructed images should (1) not reveal a person's identity and (2) not show private attributes.
\\
\textbf{Model Efficiency} One focus of this work is efficiency on the client side, since the client's computational capacity is often restricted. We evaluate efficiency with two metrics on the client device: Giga FLoating point OPerations (GFLOPs) and memory consumption. To calculate GFLOPs, we count all floating-point operations, such as additions, multiplications, and divisions, in a single inference pass. To measure memory usage, we consider all parameters and buffers of the model.
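As a rough illustration of such an operation count, consider a single convolution layer with kernel size $K_h \times K_w$, $C_{in}$ input channels, $C_{out}$ output channels, and an output feature map of size $H_{out} \times W_{out}$. Under conventions assumed here for illustration, which need not match the exact counting behind the reported numbers (one multiplication and one addition per multiply-accumulate, bias and normalization terms ignored), its cost is approximately
\begin{equation*}
    \mathrm{FLOPs}_{conv} \approx 2\, K_h K_w\, C_{in}\, C_{out}\, H_{out} W_{out} .
\end{equation*}
The client-side cost in GFLOPs is then the sum of such per-layer counts over all layers of the obfuscator, divided by $10^{9}$.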


\begin{table*}[t]
\caption{Comparison of the privacy-utility trade-off ($\Delta$). We compare our method with existing ARL approaches focusing on the privacy-utility trade-off. Regarding $\Delta$, our method outperforms all other methods while showing comparable utility accuracy with the performance bound. Comparison with `No Noise' shows the effectiveness of our
noisy adversarial training and inference.
}
\centering
\scalebox{0.8}{
\begin{tabular}{l | ccc | ccc | ccc }
\toprule
       & \multicolumn{3}{c}{FairFace (Race/Gender)} & \multicolumn{3}{c}{CelebA (Gender/Smiling)} & \multicolumn{3}{c}{CIFAR10 (Multi/Binary)} \\
Method & Privacy $\downarrow$ & Utility $\uparrow$ & $\Delta \uparrow$ & Privacy $\downarrow$ & Utility $\uparrow$ & $\Delta \uparrow$ & Privacy $\downarrow$ & Utility $\uparrow$ & $\Delta \uparrow$  \\
\midrule
RN18 & 63.57 & 92.11 & - & 98.14 & 93.48 & - & 94.51 & 98.79 & -  \\
\midrule
Image Noise & 42.61 & 74.33 & 31.72 & 91.71 & 85.38 & -6.33 & 54.37 & 87.77 & 33.40\\
No Noise (RN18$_3$) & 45.22 & 89.55 & 44.33 & 94.54 & 93.38 & -1.16 & 69.34 & 97.64 & 28.30 \\
No Noise (RN18$_4$) & 31.56 & 89.87 & 58.31 & 93.19 & 93.43 & 0.24 & 56.02 & 97.97 & 41.95 \\
\midrule
MaxEnt     & 24.56 & 90.52 & 65.96 & 59.28 & 93.43 & 34.15 & 24.61 & 97.74 & 73.13 \\
DISCO         & 19.00 & 81.50 & 62.50 & 61.20 & 91.00 & 29.80 & 22.30 & 91.98 & 69.68 \\
DeepObfs. & 50.83 & 89.64 & 38.81 & 97.63 & 91.92 & -5.71 &  73.79 & 92.86 & 19.07 \\
\midrule
Ours (RN18$_3$) & 19.47 & 89.08 & \textbf{69.61} & 57.77 & 93.07 & \textbf{35.30} & 21.71 & 96.92 & \textbf{75.21} \\
Ours (RN18$_4$) & 15.60 & 88.34 & \textbf{72.74} & 53.77 & 90.86 & \textbf{37.09} & 19.81 & 94.25 & \textbf{74.44} \\
\bottomrule
\end{tabular}
}
\label{tab:privacy-utility}
\end{table*}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Experimental Setup}
\subsection{Implementation Details}
\label{sec:impldetail}

\textbf{Models}
We choose the commonly used CNN ResNet18 (RN18)~\citep{he2016deep} as the base architecture to split.
RN18 consists of one convolution layer, four residual blocks, and a fully connected layer. We choose the splitting point after each of the four residual blocks. We indicate the different configurations as RN18$_{\{1,2,3,4\}}$, respectively, where the subscript indicates the block after which the network was split. 

Note that for the proxy adversary model, we only consider a proxy for information leakage attack, as discussed in Section~\ref{sec:method}, since we empirically show that our method is robust to reconstruction attacks without considering them during training. 
For the task and proxy adversary models we use the remaining part of the split architecture, \eg~for RN18$_4$, the remaining part would consist of the fully connected layer.
This setting is consistent with previous works~\citep{singh2021disco, roy2019mitigating,li2021deepobfuscator}.
The noise parameter $\sigma$ is chosen based on the privacy-utility trade-off of the dataset and model.
A separate Adam~\citep{kingma2014adam} optimizer is used for each of the three models with a learning rate of $10^{-3}$, and $\lambda=10^{-2}$ is used to balance the losses.
\\
\textbf{Settings for Information Leakage Attack}
First, we follow the experimental setting of the previous ARL method DISCO~\citep{singh2021disco}.
For CelebA~\citep{liu2015faceattributes}, we set ``Smiling'' as the utility attribute and ``Gender'' as the privacy attribute; for FairFace~\citep{karkkainen2021fairface}, we set ``Gender'' as the utility attribute and ``Race'' as the privacy attribute.
For CIFAR10~\citep{krizhevsky2009learning}, following the setting from MaxEnt~\citep{roy2019mitigating}, the utility task is defined as classifying living objects (\eg ``bird'', ``cat'', \etc) versus non-living objects (\eg ``airplane'', ``automobile'', \etc), and the privacy task as classifying the 10 separate classes. All datasets use the official train and validation splits.
Furthermore, results on more complex task settings, such as multi-class classification and facial landmark detection, are provided in the supplementary material.
\\
\textbf{Settings for the Reconstruction Attack}
The reconstruction attack is performed on the CelebA dataset with the decoder architecture from DeepObfs~\citep{li2021deepobfuscator}, which is trained with the Adam optimizer with a learning rate of $10^{-3}$ and an MSE loss between the original and the reconstructed image.
We depict the qualitative results and additionally provide a quantitative comparison of the visual dissimilarity between the original and reconstructed images.
Various visual metrics are reported, such as MSE, $L_1$, SSIM~\citep{zhou2004ssim}, MS-SSIM~\citep{wang2003multiscale-msssim}, PSNR~\citep{hore2010psnr}, and LPIPS~\citep{zhang2018unreasonable-lpips}, which are commonly considered as proxies for human vision.
Additionally, a qualitative user study is provided in Section~\ref{section:user-study}.
\subsection{Baselines \& Compared Methods} 
\textbf{ResNet18} We report the utility and privacy performance for ResNet18 (RN18) models trained on the respective task with original images to indicate the practical performance bounds.
\\
\textbf{Image Noise} 
Directly adding sufficient noise to the input image is a simple way to obfuscate it without any trainable parameters. We add Gaussian noise sampled from $\mathcal{N}(0, \sigma^{2})$ directly to the input image while keeping pixel values within the valid range $(0,1)$, where $\sigma=2$ is used for CelebA and FairFace and $\sigma=0.8$ for CIFAR10.
The value of $\sigma$ is chosen as the noise level that fully obfuscates the image for a human observer. We use the entire ResNet18 model for both the utility and privacy models.
\\
\textbf{No Noise} 
As an ablation experiment on our method, we conduct basic ARL training without a noise module. We report the performance of RN18$_3$ and RN18$_4$.
\\
\textbf{MaxEnt} We compare the ARL method MaxEnt, 
which uses the full ResNet18 as a client-side obfuscator. The obfuscator's final output is a vector of length $d$. We use $d=128$ for CIFAR10, which is the original setting from MaxEnt, and $d=256$ for FairFace and CelebA, considering the size of the input images.
\\
\textbf{DISCO} We report the privacy-utility trade-off numbers as in the original work. We reconfirm the reconstruction vulnerability of DISCO as reported in their work with their parameters.
\\
\textbf{DeepObfuscator} Since the authors did not open-source their code, we re-implemented DeepObfuscator based on the provided information in the paper.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Experimental Results}
\label{sec:results}
\subsection{Privacy-Utility Trade-off and Efficiency}
\begin{table}[t]
\caption{The efficiency of each client model. 
Performance is measured on an image of size ($178 \times 178 \times 3$).
Our split models (bottom) incur computational costs no higher than those of the baselines (top), with RN18$_1$ to RN18$_3$ strictly lower.
}
\centering
\scalebox{0.75}{
\begin{tabular}{l | cccc }
\toprule
Benchmark & DeepObfs. & DISCO & MaxEnt & RN18 \\
\midrule
Comp. Cost (GFLOPs) $\downarrow$ & 6.00 & 2.52 & 2.52 & 2.52 \\
Memory (MB) $\downarrow$ & 1.00 & 42.80 & 43.17 & 42.69 \\
\bottomrule
\toprule
Benchmark &  RN18$_1$ & RN18$_2$ & RN18$_3$ & RN18$_4$ \\
\midrule
Comp. Cost (GFLOPs) $\downarrow$ & 0.75 & 1.31 & 1.92 & 2.52 \\
Memory (MB) $\downarrow$ &  0.60 & 2.61 & 10.63 & 42.67 \\
\bottomrule
\end{tabular}
}
\label{tab:efficiency}
\end{table}

Table~\ref{tab:privacy-utility} compares baselines and state-of-the-art methods with our proposed approach regarding the privacy-utility trade-off ($\Delta$). 
First, we observe that `Image Noise' decreases the performance for both privacy and utility.
This is because the method obfuscates the images without taking the utility and privacy tasks into account.
For RN18$_{\{3,4\}}$ without noise, which can be considered an ablation of our method, the utility task accuracy is nearly retained compared to the performance upper bound. However, the adversary achieves high accuracy in the leakage attack compared to the other methods. This shows that it is hard for the obfuscator to learn to remove private information even with the adversary proxy loss. Another notable point is that training the obfuscator to remove private information becomes more challenging at lower layers, as confirmed by the privacy accuracy gap between RN18$_{3}$ and RN18$_{4}$ (\eg for FairFace, $45.22\%$ and $31.56\%$, respectively).
This phenomenon is discussed in Section~\ref{section:effi-perf-trade-off-split}.

Among the state-of-the-art methods, MaxEnt and DISCO successfully achieve high $\Delta$, with MaxEnt constituting the most robust technique. DeepObfs.\ shows limited privacy protection under our strong evaluation protocol.

Our methods (RN18$_{\{3,4\}}$) show the best $\Delta$ across all information leakage attack settings.
Their privacy accuracy is comparatively lower than that of the other methods, so the largest $\Delta$ is achieved even though their utility accuracy is not always the highest.
These results are notable since our models are efficient and lighter than the compared methods, as indicated in Table~\ref{tab:efficiency}.
Our RN18$_4$ is comparable to MaxEnt and DISCO in terms of memory and computational cost.
Hence, under similar efficiency, we observe that our noisy adversarial training and inference have a noticeable effect on the privacy-utility trade-off, outperforming all baselines and previous approaches by significant margins.
For example, our approach achieves a privacy-utility trade-off of $72.74\%$ for FairFace, an increase of about $7\%$p over MaxEnt.

Additionally, our proposed approach with an RN18 split after the third block (RN18$_3$) also achieves a higher privacy-utility trade-off than all compared methods. This is especially significant since RN18$_3$ imposes a noticeably smaller client-side burden, with memory reduced by a factor of approximately $4$ and a computational cost of only $76\%$ (1.92 GFLOPs compared to 2.52 GFLOPs) of that of MaxEnt and DISCO.
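
As a quick check, these ratios can be retraced from the RN18$_3$, MaxEnt, and DISCO entries of Table~\ref{tab:efficiency} (the arithmetic only restates the table values, rounded to two decimals):
\begin{align*}
\text{computational cost:}\quad & 1.92 / 2.52 \approx 0.76, \\
\text{memory vs.\ MaxEnt:}\quad & 43.17 / 10.63 \approx 4.06, \\
\text{memory vs.\ DISCO:}\quad & 42.80 / 10.63 \approx 4.03.
\end{align*}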

While DeepObfs.\ exhibits a comparatively small memory footprint, its low $\Delta$ and high computational cost indicate its inferiority to other approaches.

In summary, our noisy adversarial training and inference increase $\Delta$ while reducing client-side resource costs, outperforming the compared methods across the evaluated benchmarks.

\subsection{Reconstruction Attack}
\label{section:recon_attack}
Figure~\ref{fig:reconstruction_results} shows the visual evaluation of the compared methods under the reconstruction attack. First, the reconstructions for DeepObfs., DISCO, and `Image Noise' show a slightly different identity from the original. However, the private attribute, in this case `Gender', is still distinguishable. The `No Noise' method appears to have removed the identity and the background context, but the `Gender' attribute also remains distinguishable. Our method and MaxEnt are the only methods that successfully defend against the attack with respect to both the identity and the private attribute. Notably, the reconstructed images of MaxEnt overall show the same identity and the `Smile' task attribute, while nearly no facial features are distinguishable for our method.

\begin{figure}[t]
\centering
\includegraphics[width=0.9\linewidth,bb=0 0 898 1474]{figures/reconstructed_5columns.png}
\caption{
Reconstruction attack on CelebA. All methods except ours and MaxEnt fail to defend against the reconstruction attack. While a few methods (\eg `Image Noise', No Noise (RN18$_4$)) avoid revealing the exact identity of the person, they fail to remove the private attribute (`Gender').}
\label{fig:reconstruction_results}
\end{figure}

\begin{table}[t]
\caption{Quantitative results of the reconstruction attack on CelebA. We report visual dissimilarity scores between the original and reconstructed images. Our method outperforms the other methods on all metrics. }
\centering
\scalebox{0.62}{
\begin{tabular}{l | r | r | P{1.1cm} | P{1.7cm} | P{1.15cm} | P{1.2cm}}
\toprule
Method & MSE $\uparrow$ & $L_1$ $\uparrow$ & SSIM $\downarrow$ & MS-SSIM $\downarrow$ & PSNR $\downarrow$ & LPIPS $\uparrow$ \\
\midrule
Image Noise & 584.88 & 16.97 & 0.6017 & 0.7776 & 20.46 & 0.3710 \\
No Noise (RN18$_3$) & 1391.39 & 26.89 & 0.4666 & 0.6155 & 16.70 & 0.4882 \\
No Noise (RN18$_4$) & 1841.70 & 31.70 & 0.4558 & 0.5829 & 15.48 & 0.4857 \\
\midrule
MaxEnt & 4955.44 & 58.83 & 0.3893 & 0.4057 & 11.19 & 0.6619 \\
DeepObfs. & 182.63 & 9.47 & 0.7834 & 0.9298 & 25.52 & 0.1864 \\
DISCO & 567.17 & 15.94 & 0.5765 & 0.7611 & 20.60 & 0.4351  \\
\midrule
Ours (RN18$_3$) & 5437.02 & 63.22 & \textbf{0.3086} & 0.1682 & 10.78 & 0.8045  \\
Ours (RN18$_4$) & \textbf{5454.12} & \textbf{63.48} & 0.3301 & \textbf{0.1571} & \textbf{10.77} & \textbf{0.8197} \\
\bottomrule
\end{tabular}
}
\label{tab:sim_score}
\end{table}


Table~\ref{tab:sim_score} reaffirms our results quantitatively. DeepObfs., DISCO, and `Image Noise' have the lowest visual dissimilarity, which is consistent with the similar identity and private attribute they show in Figure~\ref{fig:reconstruction_results}.
`No Noise' and MaxEnt show higher dissimilarity, but our method still achieves the best scores since no distinguishable objects are present in its reconstructions.
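
As a reading aid for Table~\ref{tab:sim_score}, recall the standard relation between PSNR and MSE, $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^{2} / \mathrm{MSE})$. Assuming 8-bit pixel intensities, so that $\mathrm{MAX} = 255$ (an assumption that is not stated in the text but is consistent with the magnitude of the reported MSE values), two rows can be checked as follows:
\begin{align*}
\text{`Image Noise':}\quad & 10 \log_{10}\!\left(255^{2} / 584.88\right) \approx 20.46, \\
\text{No Noise (RN18$_3$):}\quad & 10 \log_{10}\!\left(255^{2} / 1391.39\right) \approx 16.70.
\end{align*}
Both agree with the reported PSNR values. Small deviations in other rows would be expected if PSNR is averaged per image rather than computed from the mean MSE.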

In summary, our method showed the best robustness to the reconstruction attack in terms of both qualitative and quantitative results. Note that our method is robust against the reconstruction attack even without incorporating it into our optimization process.
With our noisy adversarial training, the obfuscator successfully learns an obfuscated representation robust against the reconstruction attack, while retaining task utility and removing private information.

\section{User Study}
\label{section:user-study}
\begin{figure}[ht]
\centering
\includegraphics[width=0.8\linewidth,bb=0 0 585 372]{figures/user_study_gender.png}
\\
\vspace{0.1cm}
\includegraphics[width=0.8\linewidth,bb=0 0 581 369]{figures/user_study_smiling.png}

\caption{Results for user study on reconstructed images.}
\label{fig:user_study}
\end{figure}

In addition to the quantitative results of the reconstruction attack,
we present a user study to show that our method's robustness against reconstruction is aligned with human vision.
We first obfuscate the images using each technique shown in Figure~\ref{fig:user_study}.
We then attack the obfuscated images using the settings from Section~\ref{sec:impldetail}.
We randomly selected 30 participants and prepared 270 images produced by the reconstruction attackers for the survey.
We asked the participants to judge whether the person in the image is smiling for the ``Smiling'' utility task, and whether the person is male or female for the ``Gender'' privacy task.
We also provided the option to select ``cannot judge'' for instances in which the reconstructed image is too obfuscated for the person's gender or facial expression to be discernible.

As shown in Figure~\ref{fig:user_study},
users correctly identified the ``Gender'' attribute in the reconstructed images when the obfuscated representations came from DeepObfs., DISCO, and No Noise (RN18$_4$).
Only our method and MaxEnt provide protection against the reconstruction attack for the ``Gender'' classification task.
For the ``Smiling'' attribute, we note that the proportion of wrong answers on the original images is about 20\%.
This is because judging whether a person is smiling is subjective.
The results for the ``Smiling'' classification show that all methods other than ours failed to defend against the attack.
We highlight these results since concealing the ``Smiling'' attribute of the utility task is outside our objective.
We note that RN18$_1$ performs worse than RN18$_{\{3,4\}}$, as expected. However, it still outperforms the other compared methods by a considerable margin.
We conclude that our approach successfully hides sensitive information from humans even under the reconstruction attack and outperforms previous methods.
We report detailed settings for impartial results in the supplementary materials.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Ablation Experiments}

\subsection{Efficiency-Performance Trade-off for Different Split Points}
\label{section:effi-perf-trade-off-split}
Choosing the split point at the lower layers of the model compromises privacy, whereas choosing it at the higher layers is advantageous for preserving privacy~\citep{yosinski2014transferable}.
However, splitting at a higher layer increases the client-side burden due to higher memory usage and computational cost.
Thus, it is essential to select an appropriate split point by considering the trade-off between model efficiency and performance.

We report the privacy and utility accuracy for each variant of the ResNet18 model in Figure~\ref{fig:priv_util_each_block}.
The result confirms that better performance ($\Delta$) can be achieved with a split point at higher layers.
The privacy accuracy decreases as the split moves to higher layers, since the features become more specific to the utility task. The utility accuracy remains similar regardless of the split location, so $\Delta$ increases for the higher layers.
Note that our approach achieves comparable performance even with a lower-layer split. For example, RN18$_2$ achieves $88.10\%$ utility accuracy and $22.38\%$ privacy accuracy, which leads to a privacy-utility gap of $65.72\%$. This is $3.22\%$p better than DISCO and only $0.24\%$p worse than MaxEnt, while requiring only about half the computational cost and about $1/16$ of the memory.
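
The numbers in this example can be retraced as follows, taking $\Delta$ as the difference between utility and privacy accuracy and using the client-side costs from Table~\ref{tab:efficiency} (values are rounded; the last two lines are only the DISCO and MaxEnt values of $\Delta$ implied by the stated differences, not separately reported numbers):
\begin{align*}
\text{gap:}\quad & \Delta = 88.10 - 22.38 = 65.72, \\
\text{cost:}\quad & 1.31 / 2.52 \approx 0.52, \\
\text{memory (DISCO):}\quad & 2.61 / 42.80 \approx 1/16.4, \\
\text{memory (MaxEnt):}\quad & 2.61 / 43.17 \approx 1/16.5, \\
\text{implied $\Delta$ (DISCO):}\quad & 65.72 - 3.22 = 62.50, \\
\text{implied $\Delta$ (MaxEnt):}\quad & 65.72 + 0.24 = 65.96.
\end{align*}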


\begin{figure}[t]
\centering
\includegraphics[width=1.0\linewidth,bb=0 0 1189 434]{figures/priv_util_each_block.png}
\caption{Performance-efficiency trade-off. The result shows better performance when choosing the split point from the higher layers.
Nevertheless, our approach shows comparable performance even with the split at the lower layers.}
\label{fig:priv_util_each_block}
\end{figure}

\subsection{Other Network Architectures}
\label{section:other_network}
We further investigate whether our method applies to other network architectures. 
We use three commonly used CNN models: MobileNetV2~\citep{sandler2018mobilenetv2} split at the 16th convolution layer out of 19, AlexNet~\citep{krizhevsky2012imagenet_alexnet} split at the fifth convolution layer out of 8, and VGG11~\citep{simonyan2014very_vgg} split at the fourth convolution layer out of 5, with $\sigma=30,15,60$, respectively.
For each model, we followed the evaluation protocol for the FairFace dataset.
The models are trained with proxy adversarial loss (Adv loss) and our proposed method (ARL with noise).
Table~\ref{tab:model-ablation} confirms that our method also works with various generally adopted architectures. For all three models, the utility is reasonably retained while effectively protecting privacy.
This result shows that the training scheme with our noise module can be readily applied to off-the-shelf model architectures. The computational cost and memory usage are also compared between the original and split models. Notably, with the MobileNetV2 architecture, our method can reduce the computational burden on the client side even further.
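
To make the effect of the noise module in Table~\ref{tab:model-ablation} explicit, the gain in $\Delta$ from the `Adv loss' row to the `Noise + Adv loss' row can be decomposed into the drop in privacy accuracy minus the drop in utility accuracy (a restatement of the table values, with $\Delta$ taken as utility minus privacy accuracy):
\begin{align*}
\text{MNV2$_{16}$:}\quad & 16.01 - 0.41 = 15.60, \\
\text{AlexNet$_4$:}\quad & 19.50 - 1.63 = 17.87, \\
\text{VGG11$_5$:}\quad & 21.29 - 1.82 = 19.47.
\end{align*}
In each case the result equals the difference between the two reported $\Delta$ values ($67.85 - 52.25$, $55.20 - 37.33$, and $45.69 - 26.22$), so the gain comes almost entirely from the lower privacy accuracy.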

\begin{table}[t]
\caption{Model ablation. All models are trained on the FairFace dataset.}
\centering
\scalebox{0.6}{
\begin{tabular}{ l | c | L{1.05cm} | P{1.3cm} |P{1.2cm}|c }
\toprule
Method & GFLOPs & Memory (MB) & Privacy $\downarrow$ & Utility $\uparrow$ & $\Delta\uparrow$\\
\midrule
MobileNetV2 (MNV2)  Orig. & 0.4457 & 8.62 & 54.40 & 91.07 & 36.67 \\
\midrule
MNV2$_{16}$ + Adv loss & \multirow{2}{*}{0.3585} & \multirow{2}{*}{3.97} & 38.23 & 90.48 & 52.25 \\
MNV2$_{16}$ + Noise + Adv loss &  & & 22.22 & 90.07 & \textbf{67.85}\\
\midrule
AlexNet Orig. & 0.8952 & 217.47 & 61.60 & 88.47 & 26.87\\
\midrule
AlexNet$_4$ + Adv loss & \multirow{2}{*}{0.6679} & \multirow{2}{*}{7.17} & 51.09 & 88.42 & 37.33 \\
AlexNet$_4$ + Noise + Adv loss &  &  & 31.59 & 86.79 & \textbf{55.20} \\
\midrule
VGG11 Orig. & 9.5548 & 491.26 & 65.86 & 90.66 & 24.80\\
\midrule
VGG11$_5$ + Adv loss & \multirow{2}{*}{5.8864} & \multirow{2}{*}{8.19} & 63.40 & 89.62 & 26.22\\
VGG11$_5$ + Noise + Adv loss & &  & 42.11 & 87.80 & \textbf{45.69} \\
\bottomrule
\end{tabular}
}
\label{tab:model-ablation}
\end{table}

\section{Discussion}
In terms of adding noise to the representation, differential-privacy-based methods~\citep{wang2018not} for privacy protection in deep learning might seem similar to ours.
However, differential privacy is not designed to be robust against the information leakage attack.
Our method can account for possible information leakage attacks in advance by following the information-theoretic ARL formulation. Further efficiency improvements can be achieved via pruning~\citep{han2015deep}, quantization~\citep{jacob2018quantization}, and knowledge distillation~\citep{hinton2015distilling}, which are orthogonal to our proposed method.

\section{Conclusion}
We proposed a novel ARL method that incorporates feature noise during training and inference.
Compared to SOTA ARL methods,
our approach achieves a better privacy-utility trade-off, lower computation and memory overheads, and stronger resistance to information leakage and reconstruction attacks. In particular, we conducted a user study to validate the qualitative superiority of our method against reconstruction attacks.
Moreover, through thorough ablation experiments, we provided insight into choosing model split points and demonstrated the general applicability of our method to off-the-shelf CNNs.
Overall, our findings highlight the potential of feature noise in ARL as a promising direction for future research.
% References
\bibliography{jeong_184}
\end{document}
