% \documentclass{uai2022} % for initial submission
\documentclass[accepted]{uai2022} % after acceptance, for a revised
                                    % version; also before submission to
                                    % see how the non-anonymous paper
                                    % would look like
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2022} % ptmx math instead of Computer
                                         % Modern (has noticable issues)
% \documentclass[mathfont=newtx]{uai2022} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

% %% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\section*{References}}
% \usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
% \usepackage{booktabs} % commands to create good-looking tables
% \usepackage{tikz} % nice language for creating drawings and diagrams
% Use the postscript times font!
% \usepackage{times}
% \usepackage{soul}
% \usepackage{url}
% \usepackage[hidelinks]{hyperref}
% \usepackage[utf8]{inputenc}
% \usepackage[small]{caption}
% \usepackage{graphicx}
% \usepackage{amsmath}
% \usepackage{amsthm}
% \usepackage{booktabs}
% \usepackage{algorithm}
% \usepackage{algorithmic}
% \urlstyle{same}
% \usepackage{natbib}
% \usepackage{url}
% \usepackage{wrapfig}
% \usepackage{paralist}
% \usepackage{enumitem}
\usepackage{algorithm}
\usepackage{algorithmic}
% \usepackage{amsmath,amssymb}
\newcommand{\Real}{\mathds{R}}
\newcommand{\vect}[1]{\mathbf{#1}}
% \usepackage{subfigure}
\usepackage{subcaption}
\usepackage{wrapfig}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage{amsmath,amsthm}
\newtheorem{lemma}{Lemma}
\newtheorem{observation}{Observation}
\newtheorem{theorem}{Theorem}
\DeclareMathOperator*{\argmin}{argmin}
\DeclareMathOperator*{\argmax}{argmax}

\setcounter{secnumdepth}{2} %May be changed to 1 or 2 if section numbers are desired.

\newcommand{\hf}[1]{{\color{blue}  [\text{Haifeng:} #1]}}
\newcommand{\chenghan}[1]{{\color{magenta}  [\text{Chenghan:} #1]}}
\newcommand{\thanh}[1]{{\color{red}  [\text{Thanh:} #1]}}

\newtheorem{proposition}{Proposition}

%\title{Persuasion in Bayesian Security Games with Multiple Non-Coordinated Defenders}
\title{Information Design for Multiple Independent and Self-Interested Defenders: \\Work Less, Pay Off More}
\author[1]{Chenghan Zhou}
\author[2]{Andrew Spivey}
\author[3]{Haifeng Xu}
\author[2]{Thanh Hong Nguyen}
% Add affiliations after the authors
\affil[1]{%
    Computer Science Department, 
    University of Virginia,
    Charlottesville, Virginia, USA
}
\affil[2]{%
    Computer Science Department, 
    University of Oregon,
    Eugene, Oregon, USA
}
\affil[3]{%
    Department of Computer Science, 
    University of Chicago,
    Chicago, Illinois, USA
}
% \author{Thanh H. Nguyen}
\date{}



\begin{document}
\maketitle
\begin{abstract}
   This paper studies the problem of information design in a general security game setting in which multiple independent self-interested defenders attempt to provide protection simultaneously on the same set of important targets against an unknown attacker. A principal, who can be one of the defenders, has access to certain private information (i.e., attacker type) whereas other defenders do not. We investigate the  question of how that principal, with additional private information, can influence the decisions of the defenders by partially and strategically revealing her information. 
%   We focus on the algorithmic study of information design for private signaling in this game setting.
   In particular, we develop a polynomial-time ellipsoid algorithm to compute an optimal private signaling scheme. Our key finding is that the separation oracle in the ellipsoid approach can be carefully reduced to bipartite matching. Furthermore, we introduce a compact representation of any ex-ante persuasive signaling scheme by exploiting intrinsic security resource allocation structures, enabling us to compute an optimal scheme significantly faster. Our experimental results show that by strategically revealing private information, the principal can significantly enhance the protection effectiveness on the targets.  
\end{abstract}
\section{Introduction}
In many real-world security domains, multiple self-interested security teams often conduct patrols over the same set of important targets without coordinating with each other~\citep{jiang2013defender}. An important motivating domain of this paper is wildlife conservation, where patrol teams from various NGOs or provinces patrol within the same conservation area to protect wildlife from poaching. Different NGOs or provinces typically focus on different targeted species (e.g., the situation in Pakistan \citep{cite1}) and tend to operate separately. Similarly, multiple countries simultaneously plan their own anti-crime actions in international waters against illegal fishing~\citep{Klein.2017}. 

The study of multi-defender security games has attracted much recent attention. Unfortunately, most findings so far are relatively \emph{negative}. Specifically, \citet{lou2017multidefender} show that the lack of coordination among defenders may significantly lessen the overall protection effectiveness, leading to an \emph{unbounded} price of anarchy. In addition, \citet{Jiarui2018Stackelberg} recently showed that finding a Nash-Stackelberg equilibrium among the defenders, taking into account the strategic response of the attacker, is computationally \emph{NP-hard}. Given these negative results, this paper asks the following question: 
% \textbf{How to obtain protection effectiveness and computation efficiency in multi-defender security games?}
\begin{quote}
\it How can we obtain both defense effectiveness and computational efficiency in multi-defender security games?
\end{quote}

To answer the above question, we exploit the use of \emph{information} as a ``knob'' to coordinate strategic agents' decisions. Specifically, we study how a principal with privileged private information can influence the decisions of all defenders by strategically revealing her information, a task also known as \emph{information design} or \emph{persuasion} \citep{Dughmi2017}. %We are particularly interested in algorithmic aspects of information design, and its impact on security status at the targets.   
Concretely, we study information design in a Bayesian security game setting with multiple independent and self-interested defenders. These defenders attempt to protect important targets against an attacker whose \emph{type} is unknown to them. Nevertheless, all defenders share common knowledge of a prior distribution over the attacker types. In this setting, there is a principal who has additional information about the attacker type and wants to communicate with both the defenders and the attacker through a persuasive signaling mechanism in order to influence all of their decisions towards the principal's goal. In wildlife protection, for example, the principal may be the national park office. Since many poachers (i.e., attackers) are local villagers, park rangers can obtain private information through local informants about who (i.e., which attacker type) is conducting poaching~\citep{viollaz}.   
% \subsection{Summary of Contributions.} 


In summary, our results show  that information design not only significantly improves protection effectiveness but also leads to efficient computation. Concretely, %This paper focuses on designing optimal private signaling schemes in which the principal sends signals to each player privately. Our main results are a set of polynomial-time algorithms to find an optimal signaling scheme for the principal. In particular, 
assuming the principal can communicate with defenders privately (a.k.a.\ private signaling \citep{Dughmi2017algorithmic}), we develop an ellipsoid-based algorithm in which the separation oracle component can be decomposed into a polynomial number of sub-problems, and each sub-problem reduces to a bipartite matching problem. We remark that this is by no means an easy task, either conceptually or technically, since the outcomes of private signaling form the set of Bayes correlated equilibria \citep{Bergemann16Bayes}, and computing an optimal correlated equilibrium is a fundamental problem that is well known to be intractable \citep{papadimitriou2008computing}. Our proof is technical and crucially exploits the special structure of security games. In addition, we investigate the ex-ante private signaling scheme (a relaxation of private signaling in which the defenders and the attacker decide whether to follow the principal's signals before any signal is realized \citep{castiglioni2021signaling}). In this scenario, we develop a novel compact representation for the principal's signaling schemes by compactly characterizing jointly feasible marginals. This finding enables us to significantly reduce the cost of computing a signaling scheme. 
% Moreover, this result also strictly generalizes the  mixed strategy sampling approach in classic security games \citep{kiekintveld2009computing}, and may be of independent interest.  

Finally, we present extensive experimental results evaluating our proposed algorithms in various game settings. We evaluate two different principal objectives: (i) maximizing the defenders' social welfare; and (ii) maximizing her own utility. Our results show that through signaling schemes, the principal can significantly increase the social welfare of the defenders while substantially reducing the attacker's utility. 
% This means that the protection effectiveness on the targets are also enhanced significantly.  

\subsection{Comparison with Previous Work}
\textbf{Security Games.} Security games refer to a well-studied class of games which capture strategic interactions between defenders and attackers in security domains~\citep{tambe2011security}, with important real-world applications  in, e.g.,  airport security~\citep{pita2008deployed},  ferry protection~\citep{shieh2012protect}, and wildlife conservation~\citep{fang2016deploying}. %For a complete survey of security game literature, we refer readers to~\citep{sinha2018stackelberg}. 
%A majority of existing work on security games focuses on the single-defender (with multiple security resources) scenarios. 
Most relevant to our work is the recent study of multiple-defender security games. Several of these works consider defenders that have identical interests~\citep{jiang2013defender,basilico2017computing} or that have their own disjoint sets of targets~\citep{laszka2016multi,lou2017multidefender,lou2016decentralization,lou2015equilibrium,smith2014multidefender}. The game model in~\citep{Jiarui2018Stackelberg} is the most closely related to ours. This previous work investigates the existence and computation of a Nash-Stackelberg equilibrium among the defenders. To our knowledge, our work is the first to study information design in multi-defender security games. In contrast to previous negative results, our findings are much more encouraging. Our positive results even extend to more realistic game models with defender patrolling costs, which cannot be handled by existing work.   

\textbf{Information Design. } Information design, a.k.a.\ signaling, has attracted much interest in various domains such as public safety \citep{Xu15,Rabinovich15}, wildlife conservation \citep{bondi2019broken}, traffic routing \citep{vasserman2015implementing,castiglioni2021signaling}, and auctions \citep{li2019revenue,Emek12}. Most related to our work is \citet{xu2016signaling}, who study signaling in Bayesian Stackelberg games. All previous work assumes a single defender, whereas our paper tackles the complex \emph{multiple-defender} setup. This requires us to work with an exponentially large representation of signaling schemes and necessitates novel algorithmic techniques with compact representations. 


\textbf{Other Learning-based Solutions.} 
Recent research in multi-agent reinforcement learning (MARL) studies factors that influence agents' behavior in a shared environment. For example, \citet{Tian2020communicate} study how to convey private information through actions in a cooperative environment. \citet{jaques2019socialinfluence} use monetary reward (which they call causal influence reward) to influence opponents' actions. Unlike the tools studied in the previous MARL literature, our model takes advantage of information asymmetry to influence attackers' actions in an adversarial environment. Therefore, both our setup and our approach differ from these previous learning-based methods. 

% \section{Background}

 

% In addition to $|\mathbf{T}|$ targets, we create $|\mathbf{T}| - \mathbf{D}$
% We also add a dummy target $0$ to $\mathbf{T}$ which indicates that if the defender goes to target $t = 0$, it means the defender does not protect any actual targets. We set $P(d, 0) = R(d, 0) = P(\lambda, 0) = R(\lambda, 0) = 0$. 

% We study a persuasion scenario in which there is a principal who can sends signals to both sides. 
% \newpage

\section{Preliminary}
We consider a general security game setting in which there are multiple self-interested defenders, $\mathbf{D} = \{1,\dots, |\mathbf{D}|\}$, who have to protect important targets $\mathbf{T} = \{1,\dots, |\mathbf{T}|\}$ from an attacker. Each defender can protect at most one target.\footnote{This is w.l.o.g.\ since any defender who can cover multiple targets can be ``split'' into multiple defenders with the same utilities.} The defenders do not know the attacker's type, but share common prior knowledge about the distribution over possible attacker types, $\{q(\lambda)\}_{\lambda \in \Lambda}$ with $\Lambda = \{1,\dots, |\Lambda|\}$, where $q(\lambda)$ is the probability that the attacker has type $\lambda$. If a defender $d$ decides to go to a target $t$, he incurs a patrolling cost of $C^d(t) < 0$. If the attacker of type $\lambda$ successfully attacks a target $t$, it receives a reward $R^{\lambda}(t) \geq  0$ while each defender $d$ receives a penalty $P^{d}(t) \leq 0$. Conversely, if any of the defenders catches the attacker of type $\lambda$ at $t$, the attacker receives a penalty $P^{\lambda}(t) < 0$ while each defender $d$ obtains a reward $R^{d}(t) > 0$. Notably, one defender suffices to fully protect a target, whereas multiple defenders on the same target will \emph{not} be any more effective. This is the main source of inefficiency without coordination \citep{lou2017multidefender}.  
% This game setting is suitable for real-world domains such as wildlife protection in which there are often multiple independent ranger teams from either different NGOs or the national park office. These ranger teams (defenders) attempt to protect wild animals by conducting patrols within a conservation area. On the other hand, poachers (the attacker) aim at catching animals by setting trapping tools such as snares inside the conservation area. 

% This wildlife protection problem can be represented as a security game in which the conservation area is divided into a grid and each grid cell represents a target. 
\section{Optimal Private  Signaling}
We first study the design of private signaling schemes which help the principal to coordinate the defenders. The principal leverages her private information about the attacker type to influence the decisions of all players (including the attacker) by strategically revealing her information. We adopt the standard assumption of information design \citep{Kamenica2011}, and assume that the principal commits to a signaling scheme $\omega$ that is publicly known to all players. At a high level, a private signaling scheme generates a random variable called a \emph{signal profile} $\mathbf{s}$, which is correlated with $\lambda$, where $s(d)$ is the private signal sent to the defender $d$ and $s(a)$ is the signal sent to the attacker. Each defender $d$, upon receiving a certain private signal $s_0$, updates his belief on the attacker type using Bayes' rule as follows:
\begin{align*}
    & P(\lambda\mid s_0) = \frac{q(\lambda) \sum_{\mathbf{s}: s(d) = s_0}\omega(\mathbf{s}\mid \lambda)}{\sum_{\lambda'}q(\lambda')\sum_{\mathbf{s}: s(d) = s_0}\omega(\mathbf{s}\mid \lambda')}
\end{align*}
where $\omega(\mathbf{s}\mid \lambda)$ is the probability the signal profile $\mathbf{s}$ is generated given the attacker type is $\lambda$.
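
As a small numerical illustration of this update (the numbers are purely hypothetical and chosen only for readability), suppose there are two attacker types with prior $q(1) = 0.6$ and $q(2) = 0.4$, and that defender $d$ receives the signal $s_0$ with total probability $\sum_{\mathbf{s}: s(d) = s_0}\omega(\mathbf{s}\mid 1) = 0.5$ under type $1$ and $\sum_{\mathbf{s}: s(d) = s_0}\omega(\mathbf{s}\mid 2) = 0.25$ under type $2$. Then
\begin{align*}
    & P(1\mid s_0) = \frac{0.6\times 0.5}{0.6\times 0.5 + 0.4\times 0.25} = \frac{0.3}{0.4} = 0.75,
\end{align*}
so in this example receiving $s_0$ raises the defender's belief in type $1$ from $0.6$ to $0.75$.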

Any private signaling scheme induces a Bayesian game among players. According to \citet{Bergemann16Bayes}, the Bayes Nash equilibria that can possibly arise under any private signaling scheme form the set of \emph{Bayes correlated equilibria} (BCEs).  Similar to standard correlated equilibria, the signals of a private signaling scheme in a BCE can also be interpreted as \emph{obedient} action recommendations. Therefore, a private signal profile can be represented as $\mathbf{s} = (\{s(d)\}, s(a))$ where $s(d) \in \mathbf{T}$ is the suggested protection target for defender $d$ and $s(a)\in \mathbf{T}$ is the target suggested to the attacker. With slight abuse of notation, we use $s(-a)$ to represent the set of signals sent to the defenders and $s(-a, -d)$ to represent the set of signals sent to all defenders except defender $d$. 

\subsection{An Exponential-Size LP Formulation}
Like the typical formulation of an optimal correlated equilibrium, optimal private signaling can be formulated as an exponentially large linear program (LP). Specifically, the principal attempts to find an optimal signaling scheme $\Omega = \{\omega(\mathbf{s}\mid \lambda)\}$ to optimize her objective, which can be either her own utility (if she is a defender) or the social welfare of the defenders. We abstractly represent the principal's objective function w.r.t.\ a signal profile $\mathbf{s}$ as $U(\mathbf{s})$. Optimal private signaling can be formulated as the following LP:
\begin{align}\label{private.origin(1)}
    &\max\;  \sum\nolimits_{\lambda}q(\lambda)\sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda)U(\mathbf{s}) \text{ s.t. }\\\nonumber
    &\text{(Attacker obedience) }\forall \lambda, t, t': \\\label{private.origin(2)}
    &\sum_{\mathbf{s}:t = s(a)}\!\!\!\! \omega(\mathbf{s}\mid\lambda) U^{\lambda}(\mathbf{s}) \geq\!\! \sum_{\mathbf{s}: t = s(a)}\!\!\!\!\omega(\mathbf{s}\mid\lambda) U^{\lambda}(s(-a), t')\\\nonumber
    &\text{(Defender obedience) }\forall d, t, t':\\\label{private.origin(3)}
    & \sum\nolimits_{\lambda} q(\lambda)\sum\nolimits_{\mathbf{s}: s(d) = t} \omega(\mathbf{s}\mid\lambda) U^{d}(\mathbf{s}) \\\nonumber
    &\geq\sum\nolimits_{\lambda} q(\lambda)\sum\nolimits_{\mathbf{s}: s(d) = t} \omega(\mathbf{s}\mid\lambda) U^{d}( s(-a,-d), t', s(a))\\\label{private.origin(4)}
    & \sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda) = 1,\forall \lambda;\; \omega(\mathbf{s}\mid\lambda) \geq 0,\forall \mathbf{s}, \lambda
\end{align}
where (\ref{private.origin(2)}--\ref{private.origin(3)}) are obedience constraints which guarantee that the attacker of any type and all defenders will follow the principal's recommendation. The utilities of each defender $d$ and each attacker type $\lambda$ are determined as follows:
\begin{align*}
    & U^{d}(\mathbf{s}) = C^{d}(s(d)) + P^{d}(s(a)), \text{ if }\forall d': s(a) \neq s(d')\\
    & U^{d}(\mathbf{s}) = C^{d}(s(d)) + R^{d}(s(a)), \text{ if }\exists d': s(a) = s(d')\\
    & U^{\lambda}(\mathbf{s}) = R^{\lambda}(s(a)), \text{ if }\forall d': s(a) \neq s(d')\\
    & U^{\lambda}(\mathbf{s}) = P^{\lambda}(s(a)), \text{ if }\exists d': s(a) = s(d')
\end{align*}
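
As a hypothetical illustration of these definitions, consider two defenders and three targets, and take the signal profile with $s(1) = 1$, $s(2) = 2$, and $s(a) = 2$. The attacked target is covered (by defender $2$), so every defender receives his reward at that target and the attacker receives its penalty:
\begin{align*}
    & U^{1}(\mathbf{s}) = C^{1}(1) + R^{1}(2),\quad U^{2}(\mathbf{s}) = C^{2}(2) + R^{2}(2),\\
    & U^{\lambda}(\mathbf{s}) = P^{\lambda}(2).
\end{align*}
If instead $s(a) = 3$, no defender covers the attacked target, and we obtain $U^{1}(\mathbf{s}) = C^{1}(1) + P^{1}(3)$, $U^{2}(\mathbf{s}) = C^{2}(2) + P^{2}(3)$, and $U^{\lambda}(\mathbf{s}) = R^{\lambda}(3)$.
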
% In addition, if the principal's objective is the defender social welfare, then we have:
% \begin{align*}
%     & U(\mathbf{s}) = \sum\nolimits_d C^{d}(s(d)) + \sum\nolimits_{d} P^{d}(s(a)), \text{ if }\forall d: s(a) \neq s(d)\\
%     & U(\mathbf{s}) = \sum\nolimits_d C^{d}(s(d)) + \sum\nolimits_{d} R^{d}(s(a)), \text{ if }\exists d: s(a) = s(d)
% \end{align*}
% in which the first term is the total cost of the defenders to go to the recommended targets and the second term is the defenders' total payoff outcome as a result of the attack $s(a)$. 
Problem (\ref{private.origin(1)} -- \ref{private.origin(4)}) has an exponential number of variables $\{\omega(\mathbf{s}\mid\lambda)\}$ due to exponentially many possible defender allocations. This is also the common challenge in computing an optimal correlated equilibrium for succinctly represented games with many players (defenders in our case). Indeed, computing an optimal correlated equilibrium is NP-hard in many succinct games \citep{papadimitriou2008computing}. Perhaps surprisingly, we show next that LP (\ref{private.origin(1)} -- \ref{private.origin(4)}) can be solved in polynomial time in our case.  
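
To give a rough sense of scale (the instance sizes below are hypothetical), note that each defender signal $s(d)$ and the attacker signal $s(a)$ range over $\mathbf{T}$, so the LP has
\begin{align*}
    & |\Lambda|\cdot|\mathbf{T}|^{|\mathbf{D}|+1}
\end{align*}
variables, while the attacker and defender obedience constraints number only $|\Lambda|\cdot|\mathbf{T}|^2$ and $|\mathbf{D}|\cdot|\mathbf{T}|^2$, respectively. For example, with $|\mathbf{T}| = 10$, $|\mathbf{D}| = 5$, and $|\Lambda| = 3$, this gives $3\times 10^{6}$ variables but only $300 + 500 = 800$ obedience constraints, which suggests why working with the dual may be more convenient.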

\subsection{A Polynomial-Time Algorithm} 
We prove the following main positive result. 
\begin{theorem}\label{thm.0}
The optimal private signaling scheme can be computed in polynomial time.
\end{theorem}

The rest of this section is devoted to the proof of Theorem~\ref{thm.0}. We elaborate the proof for the principal objective of maximizing the defender social welfare, i.e., $U(\mathbf{s}) = \sum_d U^d(\mathbf{s})$. The proof is similar when the principal is one of the defenders. Our proof is divided into three major steps, and crucially exploits the structure of security games. 

% \paragraph{Step 1: Restricting to simplified pure strategy space. } 
\textbf{Step 1: } Restricting to simplified pure strategy space. 
One challenge in designing the signaling scheme arises when multiple defenders are recommended the same target, which significantly complicates the computation of marginal target protection. Therefore, our first step is to simplify the pure strategy space to include only those strategies in which all defenders cover different targets. To do so, we create $|\mathbf{D}|$ \emph{dummy} targets at which rewards, penalties, and costs are zero for both the defenders and the attacker. A player who chooses one of these dummy targets chooses to do nothing. As a result, we have $|\mathbf{T}| + |\mathbf{D}|$ targets in total, including these dummy targets. 
The creation of these dummy targets does not influence the actual outcome of any signaling scheme, but introduces a nice characteristic of the optimal signaling scheme (Lemma~\ref{lemma.1}). This characteristic of at most one defender at each target allows us to provide more efficient algorithms to find an optimal signaling scheme. 
% We can guarantee that the optimal signaling scheme will satisfy this condition by adding an option ($t^{\text{dummy}}$) for a defender or attacker to not go to any target. 


\begin{lemma}\label{lemma.1}
There is an optimal signaling scheme such that every signal profile $\mathbf{s}$ with positive probability (i.e., $\omega(\mathbf{s}\mid \lambda) > 0$ for some $\lambda$) satisfies $s(d) \neq s(d')$ for all $d\neq d'$.
\end{lemma}
\begin{proof}
Suppose that, in a signaling scheme, there is a signal profile in which multiple defenders are sent to the same target $t$. We revise this profile by sending only the defender $d$ with the lowest cost $C^d(t)$ to $t$ and sending the other defenders to dummy targets instead. First, the expected cost is reduced while the coverage probability at each non-dummy target remains the same. As a result, the principal's objective does not decrease.  Second, the attacker obedience constraints do not change. Third, the LHS of the defenders' obedience constraints increases while the RHS stays the same. This means no obedience constraint is violated. 
\end{proof} 

% \paragraph{Step 2: Working in the dual space.}
\textbf{Step 2: } Working in the dual space.
 Since LP (\ref{private.origin(1)} -- \ref{private.origin(4)}) has exponentially many variables, we first pass to its dual linear program (\ref{private.dual.(1)} -- \ref{private.dual.(3)}), which turns out to be more tractable to work with:  
\begin{align}\label{private.dual.(1)}
    &\min \; \sum\nolimits_{\lambda} \gamma(\lambda) \text{ s.t. }\\\label{private.dual.(2)}
     &\gamma(\lambda)  + \sum\nolimits_{t'} \left[U^{\lambda}( s(-a), t')  - U^{\lambda}(\mathbf{s})\right]\alpha^{\lambda}( s(a),t') \\\nonumber
    &+ \!q(\lambda)\!\!\sum\nolimits_{d, t'}\!\!\left[U^{d}\!(s(-a,-d), t',s(a)) \!-\! U^{d}(\mathbf{s})\right]\!\beta^{d}( s(d), t') \\\nonumber
    &\geq q(\lambda) U(\mathbf{s}),\forall (\mathbf{s},\lambda)\\\label{private.dual.(3)}
    & \alpha^{\lambda}(t,t'), \beta^{d}( t, t') \geq 0,\forall \lambda, d, t, t'.
\end{align}
where each constraint in (\ref{private.dual.(2)}) corresponds to the primal variable $\omega(\mathbf{s}\mid \lambda)$. The dual variables $\alpha^{\lambda}(t, t')$ correspond to attacker obedience constraints (\ref{private.origin(2)}). The dual variables $\beta^d(t, t')$ correspond to defender obedience constraints (\ref{private.origin(3)}). Finally, the variables $\gamma(\lambda)$ correspond to constraints (\ref{private.origin(4)}).

Problem (\ref{private.dual.(1)}--\ref{private.dual.(3)}) has an exponential number of constraints. We employ the ellipsoid method \citep{grotschel1981ellipsoid} by designing a polynomial-time separation oracle. Given a value of $(\alpha^{\lambda}( t,t'), \beta^{d}( t, t'), \gamma(\lambda))$, the oracle either establishes that this value is feasible for the problem or, if not, outputs a hyperplane separating this value from the feasible region. In the following, we focus on a particular type of oracle: one that generates violated constraints. The oracle solves the following optimization problems, each corresponding to a fixed $\lambda$ and a fixed $s(a)$ (set to some target $t_0$):
\begin{align}\label{private.oracle}
    &\min_{\mathbf{s}: s(a) = t_0} \sum\nolimits_{t'} \left[U^{\lambda}( s(-a), t') - U^{\lambda}( \mathbf{s})\right]\alpha^{\lambda}( t_0,t') \\\nonumber
    &+\!q(\lambda)\sum\nolimits_{d, t'}\left[U^{d}( s(-a,-d), t',t_0) \!-\! U^{d}(\mathbf{s})\right]\beta^{d}( s(d), t')\\\nonumber
    &-q(\lambda) U(\mathbf{s})
\end{align}
If the optimal objective of this problem is \emph{strictly} less than $-\gamma(\lambda)$ for some $(\lambda, t_0)$, we have found a violated constraint corresponding to $(\mathbf{s}^*, \lambda)$, where $\mathbf{s}^*$ is an optimal solution of (\ref{private.oracle}). We iterate over every $(\lambda, t_0)$ to find all such violated constraints and add them to the current constraint set.
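
To spell out why the threshold is $-\gamma(\lambda)$, write $F_{\lambda}(\mathbf{s})$ for the objective of (\ref{private.oracle}) evaluated at $\mathbf{s}$ (this shorthand is introduced here only for illustration). Moving $q(\lambda)U(\mathbf{s})$ to the left-hand side of constraint (\ref{private.dual.(2)}) for a given $(\mathbf{s}, \lambda)$ with $s(a) = t_0$ yields
\begin{align*}
    & \gamma(\lambda) + F_{\lambda}(\mathbf{s}) \geq 0.
\end{align*}
Hence all constraints associated with $(\lambda, t_0)$ hold if and only if $\min_{\mathbf{s}: s(a) = t_0} F_{\lambda}(\mathbf{s}) \geq -\gamma(\lambda)$, and any minimizer attaining a value strictly below $-\gamma(\lambda)$ identifies a violated constraint.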

% \paragraph{Step 3: Establishing an efficient separation oracle.} 
\textbf{Step 3: } Establishing an efficient separation oracle.
We now solve (\ref{private.oracle}) for any given $(\lambda, t_0)$. We further divide this problem into two sub-problems, each of which can be solved via bipartite matching (and hence in polynomial time). More specifically, we divide the signal set $\{\mathbf{s}: s(a) = t_0\}$ into two different subsets, as elaborated in the following.

% \paragraph{Case 1 of Step 3: Attacked target is not covered. }
\textbf{Case 1 of Step 3: } Attacked target is not covered. 
The first subset consists of all signals such that $t_0\notin s(-a)$, that is, none of the defenders is assigned to $t_0$. In this case, the attacker will receive a reward $R^{\lambda}(t_0)$ for attacking $t_0$ while every defender $d$ receives a penalty $P^d(t_0)$. Thus, each of the following elements in (\ref{private.oracle}) is straightforward to compute:
% \[ U^{\lambda}( s(-a), t') - U^{\lambda}( s) =
%   \begin{cases}
%     P^{\lambda}(t') - R^{\lambda}(t_0)        & \text{if } t'\in s(-a)\\
%     R^{\lambda}(t') - R^{\lambda}(t_0)  &  \text{if } t'\notin s(-a)
%   \end{cases}
%  \]
\begin{align*}
    & U^{\lambda}( s(-a), t') - U^{\lambda}( \mathbf{s}) =
  \begin{cases}
    P^{\lambda}(t') \!-\! R^{\lambda}(t_0)        & \text{if } t'\!\in\! s(-a)\\
    R^{\lambda}(t') \!-\! R^{\lambda}(t_0)  &  \text{if } t'\!\notin\! s(-a)
  \end{cases}\\
    &U^{d}( s(-a,-d), t',t_0) - U^{d}(\mathbf{s})\\\nonumber
    &=\begin{cases}
    C^{d}(t') - C^{d}(s(d))        & \text{if } t'\neq t_0\\
    R^d(t_0) +C^{d}(t_0) \!-\! P^d(t_0) \!-\! C^{d}(s(d))  &  \text{if } t'=t_0
  \end{cases}\\
  & U(\mathbf{s}) = \sum\nolimits_d \left[P^{d}(t_0) + C^{d}(s(d))\right]
\end{align*}
Given the above computation, we observe that the second and third components (in the second and third lines) of the objective (\ref{private.oracle}), which depend only on the defender utilities, consist of multiple terms, each of which depends only on the allocation $(d, s(d))$ of an individual defender. On the other hand, the first component (in the first line) of the objective, which depends on the attacker's utility, has terms that depend on targets not in the defender allocation.  Therefore, in order to create a corresponding bipartite matching problem, we introduce $|\mathbf{T}|$ new dummy defenders and the following weights between $|\mathbf{T}|+ |\mathbf{D}|$ defenders and $|\mathbf{T}| + |\mathbf{D}|$ targets: 
% \begin{align*}
%     &\eta(d, t) = q(\lambda)\!\!\sum\nolimits_{d, t'}\!\!\left[U^{d}\!( s(-a,-d), t',t_0) \!-\! U^{d}(\mathbf{s})\right]\beta^{d}( t, t')\\
%     &\quad-q(\lambda) C^d(t)\!+\! [P^{\lambda}(t) \!-\! R^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t), \forall t\neq t_0, d \leq |\mathbf{D}|\\
%     &\eta(d, t_0) = +\infty, \forall d \leq |\mathbf{D}|
% \end{align*}
\begin{align*}
    &\eta(d, t) = q(\lambda)\!\!\sum\nolimits_{ t'\neq t_0}\!\!\left[C^d(t')\! -\! C^d(t)\right]\beta^{d}( t, t')\\
    & + q(\lambda)\!\!\sum\nolimits_{t'= t_0}\!\!\left[R^d(t_0) \!+\! C^d(t_0) \!-\!P^d(t_0) \!-\!C^d(t)\right]\beta^{d}( t, t')\\
    &+[P^{\lambda}(t) \!-\! R^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t)\!-\!q(\lambda) C^d(t), \forall t\neq t_0, d \leq |\mathbf{D}|\\
    &\eta(d, t_0) = +\infty, \forall d \leq |\mathbf{D}|
\end{align*}
\begin{align*}
    % &\eta(d, t_0) = +\infty, \forall d \leq |\mathbf{D}|\\
    &\eta(d, t) = [R^{\lambda}(t) - R^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t), \forall t, \text{ dummy }d >  |\mathbf{D}|
\end{align*}
Weights associated with these dummy defenders correspond to the terms in (\ref{private.oracle}) that depend on targets not in the actual defender allocation. The weight $\eta(d,t_0) = +\infty$ ensures that no actual defender in $\mathbf{D}$ will be assigned to $t_0$.
% \begin{align*}
%     &\eta(d, t) \!=\! q(\lambda)[R^d(t_0) \!+\!C^{d}(t_0) \!-\! P^d(t_0) \!-\! C^{d}(t)]\beta^{d} (t, t_0)\\
%     & \!+\! q(\lambda)\!\!\sum_{t'\neq t_0} [C^{d}(t') \!-\! C^d(t)] \beta^{d}(t, t')\!+\! [P^{\lambda}(t) \!-\! R^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t)\\
%     &-q(\lambda) C^d(t), \forall t\neq t_0\\
%     &\eta^{d}(t_0) = \infty
% \end{align*}
% for all $d\in\mathbf{D}$.
% We also introduce $|\mathbf{T}|$ new dummy defenders: 
% \begin{align*}
%     \eta(d, t) = [R^{\lambda}(t) - R^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t), \forall |\mathbf{T}| + |\mathbf{D}| \geq d >  |\mathbf{D}|
% \end{align*}
% Finally, we obtain the following lemma.

We now present Lemma~\ref{lemma.3} (which can be proved via a few algebraic steps), showing that Problem (\ref{private.oracle}) becomes a minimum-weight bipartite matching between $|\mathbf{T}|+ |\mathbf{D}|$ defenders and $|\mathbf{T}| + |\mathbf{D}|$ targets.
\begin{lemma}\label{lemma.3}
Problem (\ref{private.oracle}) can be reduced to the following bipartite matching problem using $\eta(d, t)$: 
% \begin{align*}
%     & \min_{\mathbf{m}}- q(\lambda) \sum_{d\in \mathbf{D}} P^{d}(t_0) + \sum_{1\leq d\leq |\mathbf{T}| + |\mathbf{D}| } \eta(d, m(d)) 
% \end{align*}
\begin{align*}
    & \min_{\mathbf{m}} \sum\nolimits_{d } \eta(d, m(d)) 
\end{align*}
after removing the constant term $- q(\lambda) \sum_{d\in\mathbf{D}} P^{d}(t_0)$ in (\ref{private.oracle}). Here, $m(d)$ is the target matched to defender $d$. 
\end{lemma}
% \begin{proof}
% The proof can be done via some algebra computation steps. In particular, given that $t_0\notin s(-a)$, the objective in (\ref{private.oracle}) can be reformulated as follows:
% \begin{align*}
%     &\sum\nolimits_{t'\in s(-a)} [P^{\lambda}(t') - R^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t') \\
%     &+ \sum\nolimits_{t'\notin s(-a)} [R^{\lambda}(t') - R^{\lambda}(t_0)] \alpha^{\lambda}( t_0, t')\\
%     & + q(\lambda)\sum\nolimits_{d, t'\neq t_0} [C^{d}(t') - C^{d}(s(d))] \beta^{d}(s(d), t')\\
%     &+ q(\lambda)\sum\nolimits_d [R^d(t_0) +C^{d}(t_0) \!-\! P^d(t_0) \!-\! C^{d}(s(d))]\beta^{d}(s(d), t_0) \\
%     &- q(\lambda) [\sum\nolimits_d P^{d}(t_0) + C^{d}(s(d))]
% \end{align*}
% which can be decomposed into multiple terms as stated in the lemma, concluding our proof.
% \end{proof}
% In this case, we create a corresponding bipartite matching problem of which optimal solution is also the optimal solution of (\ref{private.oracle}). Specifically, 

% \paragraph{Case 2 of Step 3: Attacked target is covered.}
\textbf{Case 2 of Step 3: } Attacked target is covered.
On the other hand, the second subset consists of all signals in which $t_0$ is assigned to one of the defenders. In this case, we further divide this sub-problem into multiple smaller problems by fixing the defender who covers $t_0$, denoted by $d_0$. Similar to \textit{Sub-problem P1},
we introduce the following weights:
\begin{align*}
    &\eta(d, t) =   q(\lambda)\sum\nolimits_{t'} [C^{d}(t') \!-\! C^d(t)]\beta^{d}(t, t')\\
    % & - q(\lambda) C^d(t)\\
    & + [P^{\lambda}(t) \!-\! P^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t) - q(\lambda) C^d(t),\forall t, \forall d \in \mathbf{D}\setminus \{d_0\}\\
    &\eta(d, t) = [R^{\lambda}(t) - P^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t),\forall t,  \text{ dummy } d >  |\mathbf{D}|
\end{align*}
\begin{lemma}
Problem (\ref{private.oracle}) can now be reduced to the following bipartite matching problem using $\eta(d, t)$:
\begin{align*}
    \min_{\mathbf{m}} \sum\nolimits_{d\neq d_0} \eta(d, m(d))
\end{align*}
after removing the constant terms $q(\lambda)\sum_{t'\neq t_0} [P^{d_0}(t_0) + C^{d_0}(t') - R^{d_0}(t_0) - C^{d_0}( t_0)]\beta^{d_0}(t_0, t') - q(\lambda) \sum_d R^{d}(t_0)$. In addition, the pair $(d_0, t_0)$ is removed from the matching instance.
\end{lemma}
% \begin{proof}
% The proof can be done via some algebra computation steps. Given $s(d_0) = t_0$, the objective of the separation oracle problem (\ref{private.oracle}) is reformulated as follows:
% \begin{align*}
%     & \sum_{t'\in s(-a)} \!\!\!\![P^{\lambda}(t') - P^{\lambda}(t_0) ] \alpha^{\lambda}(t_0,t') + \!\!\!\!\!\!\sum_{t'\notin s(-a)} \!\!\!\![R^{}(t') - P^{\lambda}(t_0)] \alpha^{\lambda}( t_0, t')\\
%     & + q(\lambda)\sum\nolimits_{d\neq d_0} \sum\nolimits_{t'} [C^{d}(t') - C^{d}(s(d))]\beta^{d}(s(d), t')\\
%     & + q(\lambda)\!\!\sum_{t'\neq t_0} [P^{d_0}(t_0) + C^{d_0}(t') - R^{d_0}(t_0) - C^{d_0}( t_0)]\beta^{d_0}(t_0, t') \\
%     &- q(\lambda) [\sum\nolimits_d R^{d}(t_0) + C^{d}(s(d))]
% \end{align*}
% which can be decomposed into multiple terms as stated in the lemma, concluding our proof.
% \end{proof}
% As a result, we create the weights as follows: 
% The problem (\ref{private.oracle}) can be reduced to the following 
We now have a minimum-weight bipartite matching problem between $|\mathbf{T}|+ |\mathbf{D}| - 1$ defenders and $|\mathbf{T}| + |\mathbf{D}| - 1$ targets, which can be solved in polynomial time.
% \paragraph{Remark.} 


% A tricky part is that target $t^{\text{dummy}}$ can be assigned to multiple defenders. In this case, we can create $|\mathbf{D}|$ such dummy targets so that we can still do the bipartite matching.




% the first component (with respect to the attacker's utility) of this objective can be determined as follows: for all $t'\neq t =s(a)$
% \begin{align}\nonumber
% &\text{if }\nexists d': t = s(d'), \exists d': t'=s(d')\text{ then: }\\
%     & U^{\lambda}( \{s(d)\}, t') - U^{\lambda}( s) = R(\lambda, t') - P^{\lambda}(t) \\\nonumber
%     &\text{if }\nexists d': t = s(d'), \nexists d': t'=s(d')\text{ then: }\\
%     & U^{\lambda}( \{s(d)\}, t') - U^{\lambda}( s) = R(\lambda, t') - R^{\lambda}(t) \\\nonumber
%     &\text{if }\exists d': t = s(d'), \exists d': t'=s(d')\text{ then: }\\
%     & U^{\lambda}( \{s(d)\}, t') - U^{\lambda}( s) = P(\lambda, t') - R^{\lambda}(t) \\\nonumber
%     &\text{if }\exists d': t = s(d'), \nexists d': t'=s(d')\text{ then: }\\
%     & U^{\lambda}( \{s(d)\}, t') - U^{\lambda}( s) = P(\lambda, t') - R^{\lambda}(t) 
% \end{align}
% On the other hand, we can compute: for all $t'\neq s(d)$
% \begin{align}\nonumber
%     &\text{ if }t \in s(-d) \text{ then}\\
%     &U^{d}( s(-d), t',t) - U^{d}(s)=C(d, t') - C(d, s(d)) \\\nonumber
%     &\text{ if }t \notin s(-d) \text{ then }\\
%     &U^{d}( s(-d), t',t) - U^{d}(s) =C(d, t')- C(d, s(d))
% \end{align}
% Observing that the first component of the objective only depends on the attacked target and whether any defender protects that target or not. Each coefficient of the second component regarding a defender $d$ depends on to where $d$ is sent (in addition to the attacked target).
% Therefore, we 
% % can reformulate the above minimization problem as follows:
% % \begin{align}
% %     &\min_{s: s(a) = t} \sum_{t'}\delta U^{\lambda}( t, t')\cdots\alpha^{\lambda}( t,t') \\\nonumber
% %     &+ \sum_{d, t'}\left[U^{d}( s(-d), t',t) - U^{d}(s)\right]\beta^{d}( s(d), t') +  \gamma(\lambda)
% % \end{align}
% present the following polynomial-time algorithm to optimally solve the above minimization problem. 

% Essentially, 
\section{Optimal Ex Ante Private Signaling}
This section relaxes the private signaling requirement and assumes that players decide whether to follow signals before any signal is sent. 
Such ex ante private signaling has been studied recently in routing \citep{castiglioni2021signaling} and abstract games \citep{xu2020tractability}. However, both works use the ellipsoid algorithm to compute the optimal scheme. While the ellipsoid algorithm is theoretically efficient, it is quite slow in practice, as we will show in our experiments. We could have employed a similar technique in our setting. Instead, we go one step further and present a novel compact representation of signaling schemes under which the ``reduced'' signaling space has size polynomial in the number of targets. This result significantly improves the scalability of the computation. 
% It  also strictly generalizes the  mixed strategy sampling approach in security games \citep{kiekintveld2009computing}, and may be of independent interest.  

% In this section, we first describe the basic formulation of computing an optimal ex ante signaling scheme. We then present the ellipsoid method to solve the problem in a polynomial time. Finally, we elaborate on our compact representation result. 
\subsection{An Exponential-Size LP Formulation}
Overall, the problem of finding an optimal ex ante private signaling scheme can be formulated as the following LP which has an exponential number of variables $\{\omega(\mathbf{s}\mid\lambda)\}$:
\begin{align}\label{exante.primal.(1)}
    &\max\;  \sum\nolimits_{\lambda}q(\lambda)\sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda)U(\mathbf{s}) \text{ s.t. }\\\nonumber
    &\text{(Attacker obedience) }\forall \lambda,t': \\\label{exante.primal.(2)}
    &\sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda) U^{\lambda}(\mathbf{s})\! \geq\!\sum\nolimits_{\mathbf{s}}\omega(\mathbf{s}\mid\lambda) U^{\lambda}( s(-a), t')
    \end{align}
    \begin{align}\nonumber
    &\text{(Defender obedience) }\forall d, t':\\\label{exante.primal.(3)}
    & \sum\nolimits_{\lambda} q(\lambda)\sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda) U^{d}(\mathbf{s}) \\\nonumber
    &\geq\sum\nolimits_{\lambda} q(\lambda)\sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda) U^{d}(s(-d), t', s(a))\\\label{exante.primal.(4)}
    & \sum\nolimits_{\mathbf{s}} \omega(\mathbf{s}\mid\lambda) = 1,\forall \lambda, \omega(\mathbf{s}\mid\lambda) \geq 0,\forall \mathbf{s}, \lambda
\end{align}
Similar to private signaling, we show that the optimal ex ante signaling scheme can be computed in polynomial time (Theorem~\ref{thm.1}) by developing an ellipsoid algorithm. 
\begin{theorem}\label{thm.1}
The optimal ex ante private signaling scheme can be computed in polynomial time.
\end{theorem}
\subsection{Compact Signaling Representation}
As mentioned previously, while the ellipsoid algorithm is theoretically efficient, it runs slowly in practice. We therefore show that, in this scenario, signaling schemes admit a compact representation whose size is polynomial in the number of targets. This immediately leads to a polynomial-time algorithm for optimal ex ante private signaling by directly solving a polynomial-size linear program.  
Given any signaling scheme $\omega$, we introduce the new variable $\omega(a\rightarrow t, d\rightarrow t'\mid \lambda)$, which is the \emph{marginal} probability that the attacker is sent to target $t$ and defender $d$ is sent to target $t'$, given that the attacker type is $\lambda$. In addition, we introduce $\omega(a\rightarrow t\mid \lambda)$, which is the probability that the attacker is sent to $t$ given type $\lambda$. Reformulating (\ref{exante.primal.(1)}--\ref{exante.primal.(3)}) in terms of these new variables is straightforward. For example, the objective (\ref{exante.primal.(1)}) is reformulated as follows:
\begin{align*}\small
    &\sum\nolimits_{\lambda}q(\lambda) \sum\nolimits_{t, d} \omega(a\rightarrow t, d\rightarrow t \mid \lambda)R^{d}(t) \\
    & + \sum\nolimits_{\lambda}q(\lambda)\sum\nolimits_t \big[\omega(a\rightarrow t\mid\lambda) \\
    &- \sum\nolimits_d\omega(a\rightarrow t, d\rightarrow t\mid\lambda)\big] [\sum\nolimits_dP^{d}(t)] \\
    & + \sum\nolimits_{\lambda}q(\lambda) \sum\nolimits_{t',d} \big[\sum\nolimits_{t} \omega(a\rightarrow t , d\rightarrow t'\mid\lambda)\big] C^{d}(t')
\end{align*}
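
To make the size reduction concrete, the following back-of-the-envelope count may help. It assumes that every player can be sent to any of $N$ targets, where $N$ is a symbol introduced here only for illustration. The original LP has one variable $\omega(\mathbf{s}\mid\lambda)$ per attacker type and joint signal, while the compact representation has one variable per attacker type and (attacker target, defender, defender target) triple, plus the attacker marginals:
\begin{align*}
    &\text{original: at most } |\Lambda|\cdot N^{|\mathbf{D}|+1},\\
    &\text{compact: } |\Lambda|\cdot N\,(1 + |\mathbf{D}|\, N).
\end{align*}
For instance, with $|\Lambda| = 4$, $|\mathbf{D}| = 4$, and $N = 12$, this gives up to $4\cdot 12^{5} = 995{,}328$ variables versus $4\cdot 12\cdot 49 = 2{,}352$. Excluding joint signals that send two defenders to the same target lowers the first count, but it remains exponential in $|\mathbf{D}|$.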

% \chenghan{ The first term is the probability of one of the defenders catching the attacker: 
% \begin{align*}
%     &\sum_{\lambda}q(\lambda ) \sum_{t} \left[ \sum_{d'} \omega(a\rightarrow t , d' \rightarrow t\mid \lambda) \right] R^{d}(t) \\
%     & + \sum_{\lambda}q(\lambda)\sum_t \big[\omega(a\!\rightarrow\! t\mid\lambda) \!-\! \sum_{d'}\omega(a\!\rightarrow\! t, d' \!\rightarrow\! t\mid\lambda)\big] P^{d}(t) \\
%     & + \sum_{\lambda}q(\lambda) \sum_{t''} \big[\sum_{t} \omega(a\rightarrow t, d\rightarrow t'' \mid\lambda)\big] C^{d}(t'')\\
%     &\geq \sum_{\lambda} q(\lambda) \omega(a\rightarrow t'\mid\lambda) R^{d}(t')\\
%     & + \sum_{\lambda} q(\lambda)\sum_{t\neq t'} \big[ \sum_{d'\neq d} \omega(a\rightarrow t, d'\rightarrow t\mid\lambda)\big] R^{d}(t)\\
%     &+\sum_{\lambda }q(\lambda)\sum_{t\neq t'} \omega(a\rightarrow t, d\rightarrow t\mid\lambda) P^{d}(t)\\
%     & +\sum_{\lambda }\!q(\lambda)\!\sum_{t\neq t'} \big[\omega(a\!\rightarrow\! t\mid\lambda) \!-\! \sum_{d'} \omega(a\!\rightarrow\! t, d'\!\rightarrow\! t)\mid\lambda)\big] P^{d}(t)\\
%     & + C^{d}(t')
% \end{align*}
% }
The crux of this section is the following theorem, which fully characterizes the conditions under which a compact representation corresponds to a feasible ex ante signaling scheme. 
\begin{theorem}
A compact representation $(\omega(a\rightarrow t\mid \lambda), \omega(a\rightarrow t, d\rightarrow t'\mid \lambda))$ corresponds to a feasible ex ante signaling scheme if and only if the following conditions hold:
\begin{align}
    % &\sud_t \omega(I(m, i) = 1\mid\lambda) = 1\\
    % & \sum_m \omega(I(m, i) = 1\mid\lambda) \leq 1\\
    &\sum\nolimits_t \omega(a\rightarrow t\mid\lambda) = 1,\forall \lambda\\
    & \sum\nolimits_{t'} \omega(a\rightarrow t, d\rightarrow t'\mid\lambda) = \omega(a\rightarrow t\mid\lambda),\forall \lambda, t, d\\
    & \sum\nolimits_{d} \omega(a\rightarrow t, d\rightarrow t'\mid\lambda) \leq \omega(a\rightarrow t\mid\lambda),\forall \lambda, t, t'\\
    & \omega(a\!\rightarrow\! t\mid\lambda)\!\geq\! 0, \omega(a\!\rightarrow\! t, d\!\rightarrow\! t'\mid\lambda)\!\geq \!0,\forall \lambda, t,d,t'
\end{align}
\end{theorem}
\begin{proof}
These conditions are clearly necessary. Consider $\{\omega(a\rightarrow t\mid\lambda)\}$ and $\{\omega(a\rightarrow t,d\rightarrow t'\mid\lambda)\}$ satisfying these conditions. We will show that they correspond to a feasible signaling scheme. First, for every $t$ with $\omega(a\rightarrow t\mid\lambda) > 0$, we define:
\begin{align*}
    & \omega(d\rightarrow t'\mid a\rightarrow t,\lambda) = \frac{\omega(a\rightarrow t, d\rightarrow t'\mid\lambda)}{\omega(a\rightarrow t\mid \lambda)}
\end{align*}
which is the probability of assigning defender $d$ to target $t'$ given that the attacker is of type $\lambda$ and is assigned to target $t$. Fixing $\lambda$ and $a\rightarrow t_0$, we use $\omega(d\rightarrow t')$ as an abbreviation of $\omega(d\rightarrow t'\mid a\rightarrow t_0,\lambda)$ when the context is clear. We will prove that any $\{\omega(d\rightarrow t)\}$ satisfying the following conditions corresponds to a feasible signaling scheme:
\begin{align*}
    & \sum\nolimits_{t} \omega(d\rightarrow t) = 1, \forall d\\
    & \sum\nolimits_{d} \omega(d\rightarrow t) \leq 1, \forall t
\end{align*}
In order to do so, we introduce the following general lemma:
\begin{lemma}\label{lemma.2}
For any coverage vector $\{\omega(d, t)\}$ with $\omega(d, t) \geq 0$ such that: 
\begin{align}\label{res.1}
    & \sum\nolimits_t \omega(d, t) = r, \forall d\\\label{res.2}
    & \sum\nolimits_d \omega(d, t) \leq r, \forall t
\end{align}
given $0 < r \leq 1$,
there is an assignment of defenders to targets, denoted by $(d_1, t_1),\dots, (d_{|\mathbf{D}|}, t_{|\mathbf{D}|})$, such that:\footnote{Since we have $|\mathbf{T}| + |\mathbf{D}|$ targets in total while there are only $|\mathbf{D}|$ defenders, some targets will not be assigned to any defender.}
\begin{itemize}
    \item $\omega(d_i, t_i) >0$ for all $i \in \{1,\dots, |\mathbf{D}|\}$
    \item Every maximally-covered target $t$, i.e., $\sum_d \omega(d, t) = r$, is  assigned to a defender, that is, $t \in \{t_1,\dots, t_{|\mathbf{D}|}\}$. 
\end{itemize}
\end{lemma}
\begin{proof}
Let $\mathbf{D}(t) \!=\! \{d\!:\! \omega(d, t) \!>\! 0\}$ be the support defender set of target $t$. Similarly, we denote by $\mathbf{T}(d) \!=\! \{t\!:\! \omega(d,t) \!>\!0\}$ the support target set of defender $d$. We divide the set of targets into two groups: (i) the group of all maximally-covered targets $\mathbf{T}^{\text{high}} \!=\! \{t\!:\! \sum_d \omega(d, t) \!=\! r\}$; and (ii) the group of other targets $\mathbf{T}^{\text{low}} \!=\! \{t\!:\! \sum_d \omega(d, t)\!<\! r\} $. W.l.o.g., we write $\mathbf{T}^{\text{high}} \!=\! \{t_1,\dots, t_H\}$ and $\mathbf{T}^{\text{low}} \!=\! \{t_{H+1},\dots, t_{|\mathbf{T}|+|\mathbf{D}|}\}$, where $\{t_i\}$ is a permutation of the targets $\{1,\dots, |\mathbf{T}|+|\mathbf{D}|\}$.

% \paragraph{Inclusion of high-coverage target group $\mathbf{T}^{\text{high}}$. }
\textbf{Step 1: } Inclusion of high-coverage target group $\mathbf{T}^{\text{high}}$. 
We first prove that there is a partial allocation of defenders to targets in $\mathbf{T}^{\text{high}}$, denoted by $(d_1,\dots, d_{H})$, such that $d_i\in \mathbf{D}(t_i)$ for all $t_i\in \mathbf{T}^{\text{high}}$ and the defenders are pairwise distinct, i.e., $d_i\!\neq\! d_{j}$ for all $t_i\!\neq\! t_j\!\in\! \mathbf{T}^{\text{high}}$. We use induction on the number $t$ of targets of $\mathbf{T}^{\text{high}}$ assigned so far, identifying each target $t_i$ with its index $i$. 

In the base case $t= 1$, the statement holds. Assume that the statement is true for some $t < |\mathbf{T}^{\text{high}}|$; we prove that it is also true for $t + 1$. Denote by $(d_1, 1),\dots, (d_t, t)$ the current sequence of defender-to-target assignments. At target $t+1$, if there is $d\in\mathbf{D}(t+1)$ such that $d\neq d_j$ for all $j\leq t$, then we obtain a new satisfactory partial assignment $\{(d_1, 1),\dots, (d_t, t), (d, t+1)\}$. 

Otherwise, $\mathbf{D}(t+1)\subseteq \{d_1,\dots, d_t\}$ and, w.l.o.g., we assume $\mathbf{D}(t+1) \!=\! \{d_1,d_2,\dots, d_{t'}\}$ for some $t'\!\leq\! t$. We obtain:
\begin{observation}
There exists a target $t_0 \leq t$ and a defender $d_0\notin \{d_1,\dots, d_t\}$ such that $\omega(d_0\rightarrow t_0) >0$.
\end{observation}
% We observe that there must exist a target $t_0 \leq t$ and a defender $d_0\notin \{d_1,\dots, d_t\}$ such that $\omega(d_0\rightarrow t_0) >0$. \chenghan{The previous sentence might be highlighted as a lemma.} 
Indeed, if there is no such $(d_0, t_0)$, then all targets in $\{1,\dots, t+1\}$ can only be assigned to defenders in $\{d_1,\dots, d_t\}$. As a result, we have:
\begin{align}\nonumber
    r\!\times\!(t\!+\!1) \!=\!\! \sum_{j'= 1}^{t+1}\sum_{d\in\mathbf{D}(j')}\!\!\!\!\!\omega(d, j')&\!\leq\! \sum_{j' = 1}^t\sum_{j''\in\mathbf{T}(d_{j'})}\!\!\!\!\!\!\omega(d_{j'}, j'')\\\label{ineq.1}
    =r\times t &\text{ (contradiction)}
\end{align}
Now, if $t_0\leq t'$, then we obtain a new partial assignment $\{\dots, (d_0, t_0),\dots, (d_{t_0}, t+1)\}$ by assigning $d_0$ to target $t_0$ and reallocating $d_{t_0}$ to $t+1$ while keeping the other assignments the same. On the other hand, if $t'< t_0 \leq t$, then $\mathbf{D}(j)\subseteq \{d_1,\dots, d_t\}$ for all $j \leq t'$. 
W.l.o.g., assume that $t_0 = t' + 1$. We observe that there must exist a target $t_{00}\in \{1,\dots, t\}\setminus \{t'+1\}$ and a defender $d_{00}\notin \{d_1,\dots, d_t\}\setminus \{d_{t'+1}\}$ such that $\omega(d_{00}\rightarrow t_{00}) > 0$. Indeed, if there is no such $(d_{00}, t_{00})$, then all targets in $\{1, \dots, t+1\}\setminus \{t'+1\}$ can only be assigned to defenders in $\{d_1,\dots, d_t\}\setminus \{d_{t'+1}\}$. As a result, we have:
\begin{align*}
    r\times t &= \sum_{j' = 1, j'\neq t'+1}^{t+1}\sum_{d\in\mathbf{D}(j')} \omega(d, j') \\
    &\leq \sum_{j' = 1, j'\neq t'+1}^t\sum_{j''\!\in\mathbf{T}(d_{j'})}\!\!\!\!\omega(d_{j'}, j'') \\
    &=r\!\times\! (t\! -\!1) \text{ (contradiction)}
\end{align*}
Now, if $t_{00} \leq t'$ and $d_{00} = d_{t'+1}$, then we can do the swap $(d_{t'+1}, t_{00}), (d_{t_{00}}, t+1), (d_{0}, t'+1)$ while keeping the other assignments the same. If $t_{00}\leq t'$ and $d_{00}\neq d_{t'+1}$, then we can do a different swap $(d_{t_{00}}, t+1), (d_{00}, t_{00})$. Finally, if $t_{00} > t'+1$, w.l.o.g., we assume $t_{00} = t' + 2$. We repeat the same analysis until we either find a feasible assignment or reach the following situation:
\begin{itemize}
    \item $\exists d_0\notin \{d_1,\dots,d_t\}$ s.t. $\omega(d_0\rightarrow t'+1) >0$
    \item $\exists d_{00}\notin \{d_1,\dots,d_t\}\setminus \{d_{t'+1}\}$ s.t. $\omega(d_{00}\rightarrow t'+2) >0$
    \item $\dots$
    \item $\exists d_{\text{final}}\notin \{d_1,\dots, d_{t'}\}$  and $\exists t_{\text{final}}\in \{1,\dots, t'\}$ such that $\omega(d_{\text{final}}\rightarrow t_{\text{final}}) > 0$, where $\text{final} = [0]^{t - t'}$ denotes the subscript consisting of $t - t'$ zeros.
\end{itemize}
In this situation, we first swap $(d_{t_{\text{final}}}, t+1),(d_{\text{final}}, t_{\text{final}})$. There are two cases. If $d_{\text{final}}\notin \{d_1,\dots, d_t\}$, then we have found a solution. If $d_{\text{final}}$ is equal to $d_{t' + j}$ for some $j \leq t-t'$, we then reassign $(d_{[0]^{t'+j}}, t'+j)$. At this step, there are again two cases: either $d_{[0]^{t'+j}} \notin \{d_1,\dots, d_t\}$ or $d_{[0]^{t'+j}}$ is one of $\{d_{t'+1},\dots, d_{t'+j - 1}\}$. In the former case we have found a solution, while in the latter case we have to do the reassignment again for a target in $\{t'+1,\dots, t'+j -1\}$. Observe that every time we have to do a reassignment, the index of the target for the reassignment decreases. In the end, the process reaches target $t'+1$, to which we can reassign $d_0\notin \{d_1,\dots, d_t\}$ and obtain a feasible solution.

% \paragraph{Extension to include target group $\mathbf{T}^{\text{low}}$.}
\textbf{Step 2: } Extension to include target group $\mathbf{T}^{\text{low}}$.
We now prove that there is an assignment of the defenders $\mathbf{D}$ to $|\mathbf{D}|$ targets that includes all targets in $\mathbf{T}^{\text{high}}$. We apply induction with respect to the defender $d$. Note that we cannot apply induction with respect to the targets $t$ since this analysis includes the target group $\mathbf{T}^{\text{low}}$ and, as a result, the equality on the LHS of (\ref{ineq.1}) no longer holds.

In the base case, we start with the feasible assignment of the group $\mathbf{T}^{\text{high}}$. Then, at each induction step, we perform a defender-target swapping process similar to the one used for the high-coverage target group $\mathbf{T}^{\text{high}}$. 
The key point is that no swap removes a target that has already been assigned; a swap only changes the defender assigned to it. Hence, in the final assignment of the induction process, denoted by $(1, t_1),\dots, (|\mathbf{D}|, t_{|\mathbf{D}|})$, all targets in $\mathbf{T}^{\text{high}}$ are still included. 
\end{proof}




% \paragraph{Characteristics of high-coverage target group $\mathbf{T}^{\text{high}}$.} 
 

% \paragraph{First, $\sum_d \omega(d, t) = 1$ for all targets $t$.} This is the case when every target is protected all the time. We will prove that there exist a complete assignment $\{(d_t, t)\}$ where $t =1\dots |\mathbf{T}|$ such that $d_t\in \mathbf{D}(t)$ and $\{(d_t, t)\}$ are pair-wise different using induction with respect to $t$. In the base case when $t = 1$, this statement holds true. Let's assume that this statement holds true for some $t$. We are going to prove that it is true for $t+1$ as well. Indeed, let's denote $\{(d_1, 1),\dots, (d_t, t)\}$ the current sequence of defender-to-target assignment. At target $t+1$, if there is $d\in\mathbf{D}(t+1)$ such that $d\neq d_j$ for all $j\leq t$, then we obtain a new satisfactory partial assignment $\{(d_1, 1),\dots, (d_t, t), (d, t+1)\}$. 

% Conversely, if $\mathbf{D}(t+1)\subseteq \{d_1,\dots, d_t\}$, without loss of generality, we assume $\mathbf{D}(t+1) = \{d_1,d_2,\dots, d_{t'}\}$ for some $t'\leq t$. 
% We observe that there must exist a target $t_0 \leq t$ and a defender $d_0\notin \{d_1,\dots, d_t\}$ such that $\omega(d_0\rightarrow t_0) >0$. 
% Indeed, if there is no such $(d_0, t_0)$, it means all targets $\{1,\dots, t+1\}$ can be only assigned to one of the defenders in $\{d_1,\dots, d_t\}$. As a result, we will have:
% \begin{align*}
%     &t+1 = \sum_{j'= 1}^{t+1}\sum_{d\in\mathbf{D}(j')}\omega(d, j')\leq \sum_{j' = 1}^t\sum_{j''\in\mathbf{T}(d_{j'})}\omega(d_{j'}, j'')=t
% \end{align*}
% which is contradictory.

% Now, if that target $t_0\leq t'$, then we obtain a new partial assignment $\{\dots, (d_0, t_0),\dots, (d_{t_0}, t+1)\}$ by assigning $d_0$ to target $t_0$ and reallocating $d_{t_0}$ to $t+1$ while keeping other assignments the same. On the other hand, let's consider the case when $\mathbf{D}(j)\subseteq \{d_1,\dots, d_t\}$ for all $j \leq t'$. That means $t'< t_0 \leq t$.
% % We observe that there must exist a target $j > t', j \leq t$ and a defender $d\notin \{d_1,\dots, d_t\}$ such that $\omega(d\rightarrow j) >0$. Indeed, if there is no such $(d, j)$, it means all targets $\{1,\dots, t+1\}$ can be only assigned to one of the defenders in $\{d_1,\dots, d_t\}$. As a result, we will have:
% % \begin{align*}
% %     &t+1 = \sum_{j'= 1}^{t+1}\sum_{d\in\mathbf{D}(j')}\omega(d, j')\leq \sum_{j' = 1}^t\sum_{j''\in\mathbf{T}(d_{j'})}\omega(d_{j'}, j'')=j
% % \end{align*}
% % which is contradictory. 
% W.l.o.g, let's assume that target $t_0 = t' + 1$. We observe that there must exist a target $t_{00}\in \{1,\dots, t\}\setminus \{t'+1\}$ and a defender $d_{00}\notin \{d_1,\dots, d_t\}\setminus \{d_{t'+1}\}$ such that $\omega(d_{00}\rightarrow t_{00}) > 0$. Indeed, if there is no such $(d_{00}, t_{00})$, it means all targets $\{1, \dots, t+1\}\setminus \{t'+1\}$ can be only assigned to one of the defenders in $\{d_1,\dots, d_t\}\setminus \{d_{t'+1}\}$. As a result, we will have:
% \begin{align*}
%     &t = \sum_{j' = 1, j'\neq t'+1}^{t+1}\sum_{d\in\mathbf{D}_{j'}} \omega(d, j') \\
%     &\leq \sum_{j' = 1, j'\neq t'+1}^t\sum_{j''\in\mathbf{T}(d_{j'})}\omega(d_{j'}, j'') = t -1
% \end{align*}
% which is contradictory.

% Now, if that target $t_{00} \leq t'$ and $d_{00} = d_{t'+1}$, then we can do the swap $(d_{t'+1}, t_{00}), (d_{t_{00}}, t+1), (d_{0}, t'+1)$ while keeping other assignments the same. If that target $t_{00}\leq t'$ and $d_{00}\neq d_{t'+1}$, then we can do a different swap $(d_{t_{00}}, t+1), (d_{00}, t'+1)$. Finally, if $t_{00} > t'+1$, w.l.o.g, we assume $t_{00} = t' + 2$. We repeated the same above analysis process until we obtain a feasible complete assignment. 
% Specifically, at some point we already found a feasible assignment or would reach the following situation:
% \begin{itemize}
%     \item $\exists d_0\notin \{d_1,\dots,d_t\}$ s.t $\omega(d_0\rightarrow t'+1) >0$
%     \item $\exists d_{00}\notin \{d_1,\dots,d_t\}\setminus \{d_{t'+1}\}$ s.t. $\omega(d_{00}\rightarrow t'+2) >0$
%     \item $\dots$
%     \item $\exists d_{\text{final}}\notin \{d_1,\dots, d_{t'}\}$  and $\exists t_{\text{final}}\in \{1,\dots, t'\}$ such that $\omega(d_{\text{final}}\rightarrow t_{\text{final}}) > 0$ where $\text{final} = [0]^{t - t'}$.
% \end{itemize}
% In this situation, we first swap $(d_{t_{\text{final}}}, t+1)(d_{\text{final}}, t_{\text{final}})$. There are two cases. If $d_{\text{final}}\notin \{d_1,\dots, d_t\}$, then we found a solution. If $d_{\text{final}}$ is equal to some $d_{t' + j}$ for some $j \leq t-t'$, we then reassign $(d_{[0]^{t'+j}}, t'+j)$. At this step, there are two cases again. That is either $d_{[0]^{t'+j}} \notin \{d_1,\dots, d_t\}$ or  $d_{[0]^{t'+j}}$ is one of $\{d_{t'+1},\dots, d_{t'+j - 1}\}$. The former case means we found a solution while the latter case indicates we have to do the reassignment again for a target in $\{t'+1,\dots, t'+j -1\}$. Observe that, every time we have to do a reassignment, the index of the target for the reassignment is decreased. In the end, it will reach target $t'+1$ for which we can reassign $d_0\notin \{d_1,\dots, d_t\}$ and obtain a feasible solution. 

% there must exist a target $t_{\text{final}}\in \{1,\dots, t'\}$ and a defender $d_{\text{final}}\notin \{d_1,\dots, d_{t'}\}$ such that $\omega(d_{\text{final}}\rightarrow t_{\text{final}}) > 0$. 
% On the other hand, if $\mathbf{D}(j) \subseteq \{d_1, \dots, d_t\}\setminus \{d_{t'+1}\}$ for all targets $j \leq t'$, there must exist $j> t'+1$ and $j \leq t$ and a defender $d\notin\{d_1, d_2,\dots, d_t\}\setminus \{d_{t'+1}\}$ such that $\omega(d, j) > 0$. Indeed, if there is no such $(d, j)$, it means all targets $\{1, \dots, t+1\}\setminus \{t'+1\}$ can be only assigned to one of the defenders in $\{d_1,\dots, d_t\}\setminus \{d_{t'+1}\}$. As a result, we will have:
% \begin{align*}
%     &t = \sum_{j' = 1, j'\neq t'+1}^{t+1}\sum_{d\in\mathbf{D}_{j'}} \omega(d, j') \leq \sum_{j' = 1, j'\neq t'+1}^t\sum_{j''\in\mathbf{T}(d_{j'})}\omega(d_{j'}, j'') = t -1
% \end{align*}
% % $j = \sum_{j = 1, j\neq i'+1}^{i+1}\sum_{m\in\mathbf{D}_j} \omega(m, j) \leq \sum_{j = 1, j\neq i'+1}^i\sum_{j'\in\mathbf{T}(m_j)}\omega(m, j') = i -1$. 
% W.l.o.g we assume that $j = t'+2$. In this case we can do the swap or there exist $j \geq i'+2$ and $j \leq i$ and $m\notin \{m_1, m_2, \dots, d_t\}\setminus \{m_{i'+1}, m_{i'+2}\}$ such that $\omega(m, j) > 0$. We can continue doing this until we obtain the feasible sequence. More specifically, at some point we have
% \begin{itemize}
%     \item $\omega(m, i'+1) >0$ for some $m^1 \notin\{m_1,\dots, d_t\}$
%     \item $\omega(m, i'+2) >0$ for some $m^2 \notin\{m_1,\dots, d_t\}\setminus \{m_{i' + 1}\}$
%     \item $\omega(m, i'+3) >0$ for some $m^3 \notin\{m_1,\dots, d_t\}\setminus \{m_{i' + 1}, m_{i'+2}\}$
%     \item $\omega(m, i'+k) >0$ for some $m^k \notin\{m_1,\dots, d_t\}\setminus \{m_{i' + 1}, \dots, m_{i'+k-1}\}$
% \end{itemize}
% If there some $j\leq i'$ such that $\omega(j, m_{i'+k - 1}) > 0$, we can do the swap.
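Based on Lemma~\ref{lemma.2}, starting with $r = 1$, we allocate the following non-zero probability to the assignment $\{(d, t_d)\}$ given by the lemma:
\begin{align*}
    & p = \min\{\min_d\{\omega(d, t_d)\}, r-\max_{t\notin \{t_1,\dots, t_{|\mathbf{D}|}\}}\sum\nolimits_d \omega(d,t)\}
\end{align*}
Note that $p > 0$ since $\omega(d, t_d) > 0$ for all $d$ and every maximally-covered target is assigned. Given this assignment, we update $\omega(d,t_d) \leftarrow \omega(d,t_d) - p$ for all $d$. The resulting coverage vector $\{\omega(d,t)\}$ still satisfies the conditions (\ref{res.1}--\ref{res.2})
with the remaining $r \leftarrow r - p$: every defender and every assigned target loses exactly $p$, and every unassigned target has coverage at most $r - p$ by the choice of $p$. We repeat this probability allocation until we obtain a feasible signaling scheme (i.e., until $r$ reaches 0).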
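
To illustrate the allocation step above, consider a small hypothetical instance (chosen only for illustration, not taken from our experiments) with two defenders, three targets $1, 2, 3$, $r = 1$, and
\begin{align*}
    &\omega(1, \cdot) = (0.6, 0.4, 0), \quad \omega(2, \cdot) = (0, 0.3, 0.7),
\end{align*}
so that the target coverages are $(0.6, 0.7, 0.7)$. In the first round, no target is maximally covered; choosing the assignment that sends defender $1$ to target $1$ and defender $2$ to target $3$ gives $p = \min\{0.6, 0.7, 1 - 0.7\} = 0.3$, after which $r = 0.7$, $\omega(1,\cdot) = (0.3, 0.4, 0)$, $\omega(2,\cdot) = (0, 0.3, 0.4)$, and the coverages are $(0.3, 0.7, 0.4)$. In the second round, target $2$ is maximally covered and must be assigned; sending defender $1$ to target $2$ and defender $2$ to target $3$ gives $p = \min\{0.4, 0.4, 0.7 - 0.3\} = 0.4$, after which $r = 0.3$, $\omega(1,\cdot) = (0.3, 0, 0)$, and $\omega(2,\cdot) = (0, 0.3, 0)$. In the third round, targets $1$ and $2$ are maximally covered; sending defender $1$ to target $1$ and defender $2$ to target $2$ gives $p = \min\{0.3, 0.3, 0.3 - 0\} = 0.3$ and $r = 0$. The three assignments are thus played with probabilities $0.3$, $0.4$, and $0.3$, which sum to one and reproduce the original marginals; for example, defender $1$ is sent to target $1$ with probability $0.3 + 0.3 = 0.6$.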
% \paragraph{Second, $\sum_m \omega(m, i) < 1$ for all targets $i$.} Without loss of generality, we are also going to prove that there is such $(m_1, 1),\dots, (m_k, k)$ ($k$ is the number of defender resources). We choose $\epsilon =\min\{\omega(m_1, 1),\dots, \omega(m_k, k), 1 -\max_i\sum_m \omega(m, i)\}$. Indeed, let's assume we have $(m_1,1),\dots, (d_t, i)$ for $ i < k$ so far. Let's also consider for all $j > i$ we have $\mathbf{D}(j) \subseteq \{m_1,\dots, d_t\}$; otherwise, we could choose a $j > i$ and $m$ to add into the current sequence. We will do the swap process. First, there must exist $j\leq i$ and $m\notin \{m_1,\dots, d_t\}$ such that $\omega(m, j) > 0$; otherwise, $i = \sum_{j = 1}^i\sum_k \omega(m_j,k) = \sum_{j}\sum_{m} \omega(m, j) = k$ while $i < k$. W.l.o.g that $j = 1$. If there is a $j > i$ such that $m_1\in \mathbf{D}(j)$, then we can do the swap. Otherwise, if $m_1\notin \mathbf{D}_j$ for all $j > i$, there must exist $1 < j \leq i$ and $m \notin \{m_2,\dots, d_t\}$ such that $\omega(m, j) > 0$. Otherwise, if $\mathbf{D}(j) \subseteq \{m_2,\dots, d_t\}$ for all $j \neq 1$, we have: $i - 1= \sum_{j \neq 1} \sum_k \omega(m_j, k) \geq \sum_{j\neq 1}\sum_{m\in\mathbf{D}_j} \omega(m, j)\geq k - 1$ while $ i < k$. We keep continue doing this until we obtain a feasible sequence. 
% Finally, when neither the first case or second case happens, we divides targets into two groups: (i) High-coverage group $\mathcal{H} = \{i: \sum_m \omega(m, i) = 1\}$ and (ii) Low-coverage group $\mathcal{L} = \{i: \sum_m \omega(m, i) < 1\}$. This case is a combination of the above two cases. 
% In the end, we have the general inductive case:
% \begin{align*}
%     &\sum_t \omega(d, t) = r\\
%     & \sum_d \omega(d, t) \leq r
% \end{align*}
% After each induction step, $r$ is strictly decreased until we obtain a mixed strategy of the defender.
\end{proof}
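
The constructive argument in the proof can be summarized by the following sketch. It restates the proof steps and is not an optimized implementation; the subroutine name \textsc{FindAssignment} is a placeholder introduced here for the assignment whose existence is guaranteed by Lemma~\ref{lemma.2}, and $\mathcal{S}$ is an auxiliary symbol for the output distribution.

\begin{algorithmic}[1]
\REQUIRE coverage vector $\{\omega(d,t)\}$ satisfying (\ref{res.1}--\ref{res.2}) with $r = 1$
\ENSURE a distribution $\mathcal{S}$ over defender-to-target assignments with marginals $\{\omega(d,t)\}$
\STATE $\mathcal{S} \leftarrow \emptyset$
\WHILE{$r > 0$}
\STATE $\{(d, t_d)\}_{d} \leftarrow \textsc{FindAssignment}(\omega, r)$ \COMMENT{Lemma~\ref{lemma.2}}
\STATE $p \leftarrow \min\{\min_d \omega(d, t_d),\; r - \max_{t \notin \{t_d\}} \sum\nolimits_d \omega(d, t)\}$
\STATE add the pair $(\{(d, t_d)\}_d,\, p)$ to $\mathcal{S}$
\STATE $\omega(d, t_d) \leftarrow \omega(d, t_d) - p$ for all $d$
\STATE $r \leftarrow r - p$
\ENDWHILE
\RETURN $\mathcal{S}$
\end{algorithmic}

Each iteration either sets at least one entry $\omega(d, t_d)$ to zero or makes at least one additional target maximally covered. Zero entries stay zero and maximally-covered targets stay maximally covered, so the loop stops after at most as many iterations as there are entries of $\omega$ plus targets.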
\section{Experiments}
% \thanh{The x-axis of these figures should either the number of targets or number of defenders or number of attacker types.}

In our experiments, we evaluate both the solution quality and the runtime performance of our algorithms in various game settings. 
% with different number of targets, defenders, and attacker types, as well as different patrolling cost ranges
All LPs in our algorithms are solved with the CPLEX solver (version 20.1). We run our algorithms on a machine with an Intel i7-8550U CPU and 15.5\,GB of memory. The rewards and penalties of players are generated uniformly at random from $[0, 20]$ and $[-20, 0]$, respectively. All data points are averaged over $40$ random games, and the error bars represent the standard error.

We compare our private and ex ante signaling schemes with: (i) a \textit{baseline} method in which each defender optimizes his utility separately by solving for a Bayesian Stackelberg equilibrium between that defender and the attacker, without considering the strategies of other defenders; and (ii) the Nash Stackelberg equilibrium (NSE) among the defenders. We use the method provided in \citep{Jiarui2018Stackelberg} to approximate an NSE.
% We only consider the NSE solution when the defenders have no patrolling costs, i.e., $C^d(t)=0, \forall d, t$ since the method provided in \citep{Jiarui2018Stackelberg} to compute a NSE is only applicable for the no-patrolling-cost setting. 
% We take the expected value of defender utility under NSEs over all attacker types using the algorithm introduced in \citep{Jiarui2018Stackelberg}. 
We evaluate our signaling schemes in two scenarios corresponding to two different objectives of the principal: (i) maximizing the social welfare of the defenders (Figures~\ref{fig:def_util_cost10}--\ref{fig:nocost}); and (ii) maximizing her own defense utility (i.e., the principal is one of the self-interested defenders) (Figure~\ref{fig:opt_def_util_cost10}). Next, we highlight our important results. 

\begin{figure}[t!]
\captionsetup[subfigure]{format=hang, justification=centering}
\centering
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{def_util-cost_range_tar12.png}
  \caption{\small $|\mathbf{D}|=|\Lambda|=4, |\mathbf{T}|=12$} \label{fig:1a}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{def_util-tar_num-cost10.png}
  \caption{$|\mathbf{D}|=|\Lambda|=4$} 
  \label{fig:1b}
\end{subfigure}%
\vskip\baselineskip
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{def_util-def_num-cost10.png}
  \caption{\small$|\Lambda|=4, |\mathbf{T}|=12$} \label{fig:1c}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{def_util-att_num-cost10.png}
  \caption{$|\mathbf{D}|=4, |\mathbf{T}|=12$} \label{fig:1d}
\end{subfigure}%
\caption{Average Defender Social Welfare. The defenders' cost range is fixed to $[0, 10]$ in sub-figures (b), (c), (d).}
\label{fig:def_util_cost10}
\end{figure}

\begin{figure}[t!]
\captionsetup[subfigure]{format=hang, justification=centering}
\centering
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{att_util-cost_range_tar12.png}
  \caption{$|\mathbf{D}|=|\Lambda|=4, |\mathbf{T}|=12$} \label{fig:2a}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{att_util-tar_num-cost10.png}
  \caption{$|\mathbf{D}|=|\Lambda|=4$} 
  \label{fig:2b}
\end{subfigure}%
\vskip\baselineskip
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{att_util-def_num-cost10.png}
  \caption{$|\Lambda|=4, |\mathbf{T}|=12$} \label{fig:2c}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{att_util-att_num-cost10.png}
  \caption{$|\mathbf{D}|=4, |\mathbf{T}|=12$} \label{fig:2d}
\end{subfigure}%
\caption{Average Attacker Utility. The defenders' cost range is fixed to $[0, 10]$ in sub-figures (b), (c), (d).}
\label{fig:att_util_cost10}
\end{figure}
In Figures~\ref{fig:def_util_cost10} and~\ref{fig:att_util_cost10}, the x-axis is the defenders' cost range (the defense cost of each defender is randomly generated within this range), the number of targets, the number of defenders, or the number of attacker types. The y-axis is either the defender social welfare (Figure~\ref{fig:def_util_cost10}) or the average utility of the attacker (Figure~\ref{fig:att_util_cost10}). Note that these figures do not include the Nash Stackelberg equilibrium (NSE) among the defenders, because the method provided in \citep{Jiarui2018Stackelberg} to approximate an NSE is only applicable to the no-patrolling-cost setting. Figure~\ref{fig:def_util_cost10} shows that the signaling schemes ($\mathtt{Private}$ and $\mathtt{ExAnte}$) significantly increase the defender social welfare compared to the $\mathtt{Baseline}$ case. In addition, the defender social welfare in $\mathtt{ExAnte}$ is substantially higher than in the $\mathtt{Private}$ case. This is expected since the persuasion constraints in $\mathtt{ExAnte}$ are less restrictive. Moreover, the social welfare decreases roughly linearly in the cost range and in the number of targets, while it increases roughly linearly in the number of defenders. This is because the social welfare is a decreasing function of the defenders' coverage probability at each target, and the more defenders there are, the higher the coverage at each target. Conversely, we see the opposite trend in the attacker graphs (Figure~\ref{fig:att_util_cost10}). 

Furthermore, we include the NSE in our experiments with no defense cost. Figure~\ref{fig:nocost} shows that although $\mathtt{NashStackelberg}$ results in a higher social welfare for the defenders than $\mathtt{Baseline}$, in which each defender ignores the presence of the other defenders, the social welfare in $\mathtt{NashStackelberg}$ is still significantly lower than in $\mathtt{Private}$ and $\mathtt{ExAnte}$. The results in Figures~\ref{fig:def_util_cost10},~\ref{fig:att_util_cost10}, and~\ref{fig:nocost} clearly show that coordinating the defenders through the principal's signaling schemes significantly enhances the protection effectiveness on the targets. 

In Figure~\ref{fig:opt_def_util_cost10}, we examine the situation in which the principal attempts to maximize her own utility (given that she is one of the self-interested defenders). We again observe that the attacker suffers a significant loss in utility compared to $\mathtt{Baseline}$ (Figure~\ref{fig:opt_def_util_cost10}(a), $\mathtt{Private}$ and $\mathtt{ExAnte}$ versus $\mathtt{Baseline}$). Conversely, the principal benefits significantly from strategically revealing her private information through the signaling mechanisms (Figure~\ref{fig:opt_def_util_cost10}(b)).

% Figure \ref{fig:def_util_cost10} compares mean defender utility under different models. 

% Figure \ref{fig:att_util_cost10} compares the expected attacker utility under different models. 

% Figure \ref{fig:nocost} compares the performance of private and ex ante signaling schemes to Nash-Stackelberg equilibrium. 

% Figure \ref{fig:opt_def_util_cost10} considers the situation when one of the defenders tries to maximize his own utility and send signals to persuade other defenders. 
\begin{figure}[t!]
\captionsetup[subfigure]{format=hang, justification=centering}
\centering
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{def_util-tar_num-cost0.png}
  \caption{$|\mathbf{D}|=|\Lambda|=4, C^d(t)=0$} \label{fig:3a}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{att_util-tar_num-cost0.png}
  \caption{$|\mathbf{D}|=|\Lambda|=4, C^d(t)=0$}
\end{subfigure}%
\caption{All evaluated algorithms, no patrolling costs.}
\label{fig:nocost}
\end{figure}

\begin{figure}[t!]
\captionsetup[subfigure]{format=hang, justification=centering}
\centering
% \begin{subfigure}[t]{0.49\linewidth}
%   \includegraphics[width=\linewidth]{Figures/log_runtime-tar_num-cost10-optdef.png}
%   \caption{Logarithm runtime vs. number of targets}
% \end{subfigure}%
% ~
\begin{subfigure}[t]{0.45\linewidth}
  \includegraphics[width=\linewidth]{att_util-tar_num-cost10-optdef.png}
  \caption{Attacker utility} 
\end{subfigure}%
% \vskip\baselineskip
~
\begin{subfigure}[t]{0.50\linewidth}
  \includegraphics[width=\linewidth]{optdef_util-tar_num-cost10-optdef.png}
  \caption{Principal utility}
\end{subfigure}%
% ~
% \begin{subfigure}[t]{0.49\linewidth}
%   \includegraphics[width=\linewidth]{Figures/def_util-tar_num-cost10-optdef.png}
%   \caption{Defender social welfare}
% \end{subfigure}%
\caption{The principal optimizes her own utility when $|\mathbf{D}|\!=\!|\Lambda|\!=\!4$ and the defenders' cost range $C^d(t) \!\in\! [0, 10]$.}
\label{fig:opt_def_util_cost10}
\end{figure}

\begin{figure}[t!] 
\captionsetup[subfigure]{format=hang, justification=centering}
\centering
\begin{subfigure}[t]{0.50\linewidth}
  \includegraphics[width=\linewidth]{log_runtime-tar_num-cost10.png}
  \caption{$|\mathbf{D}|\!=\!|\Lambda|\!=\!4, C^d\!(t) \!\in\! [0, 10]$} \label{fig:4a}
\end{subfigure}%
~
\begin{subfigure}[t]{0.47\linewidth}
  \includegraphics[width=\linewidth]{log_runtime-tar_num-cost0.png}
  \caption{$|\mathbf{D}|\!=\!|\Lambda|\!=\!4, C^d(t)\!=\!0$}
  \label{fig:4b}
\end{subfigure}%
\caption{Log run time in seconds.}
\label{fig:log_runtime_cost10}
\end{figure}

Figure~\ref{fig:log_runtime_cost10} shows the logarithm of the runtime of our algorithms compared to $\mathtt{Baseline}$ and $\mathtt{NSE}$. We observe that our algorithms ($\mathtt{Private}$ and $\mathtt{ExAnte}$) are suitable for medium-sized games. In Figure~\ref{fig:log_runtime_cost10}(a), $\mathtt{Private}$ and $\mathtt{ExAnte}$ take approximately 23 minutes and 40 seconds, respectively, to solve 20-target games. Furthermore, our compact representation method ($\mathtt{ExAnteCompact}$) computes the signaling scheme significantly faster: $\mathtt{ExAnteCompact}$ takes only approximately 2.7 seconds to solve 20-target games. 
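
To put these numbers in perspective, a back-of-the-envelope comparison on 20-target games follows. It uses only the approximate runtimes quoted above, so the ratios are indicative rather than exact. Converting $23$ minutes to $23 \times 60 = 1380$ seconds,
\[
  \frac{\mathtt{Private}}{\mathtt{ExAnteCompact}} \approx \frac{1380}{2.7} \approx 511,
\]
\[
  \frac{\mathtt{ExAnte}}{\mathtt{ExAnteCompact}} \approx \frac{40}{2.7} \approx 14.8 .
\]
That is, on these instances the compact representation is roughly an order of magnitude faster than $\mathtt{ExAnte}$ and more than two orders of magnitude faster than $\mathtt{Private}$.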


% \begin{figure}[!htb]
% \captionsetup[subfigure]{format=hang, justification=centering}
% \centering
% \begin{subfigure}[t]{0.49\linewidth}
%   \includegraphics[width=\linewidth]{att_util-large_tar_num-cost10.png}
% \end{subfigure}%
% ~
% \begin{subfigure}[t]{0.49\linewidth}
%   \includegraphics[width=\linewidth]{def_util-large_tar_num-cost10.png}
% \end{subfigure}%
% \vskip\baselineskip
% \begin{subfigure}[t]{0.49\linewidth}
%   \includegraphics[width=\linewidth]{log_runtime-large_tar_num-cost10.png}
% \end{subfigure}%
% \caption{Scalability of target number in ex ante setting when $|\Lambda|=|D|=4$, and the defenders' cost range $C^d(t) \!\in\! [0, 10]$.}
% \label{fig:large_tar}
% \end{figure}


% \begin{figure}[!htb]
% \captionsetup[subfigure]{format=hang, justification=centering}
% \centering
% \begin{subfigure}[t]{0.49\linewidth}
%   \includegraphics[width=\linewidth]{log_runtime-large_att_num-cost10.png}
% \end{subfigure}%
% \caption{Scalability of attacker types in ex ante setting when $|D|=4$, $|\mathbf{T}|=20$, and the defenders' cost range $C^d(t) \!\in\! [0, 10]$.}
% \label{fig:large_att}
% \end{figure}


\begin{figure}[!htb]
\captionsetup[subfigure]{format=hang, justification=centering}
\centering
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{att_util-large_tar_num-cost10.png}
  \caption{$|\Lambda|=20$}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{def_util-large_tar_num-cost10.png}
  \caption{$|\Lambda|=20$}
\end{subfigure}%
\vskip\baselineskip
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{log_runtime-large_tar_num-cost10.png}
  \caption{$|\Lambda|=20$}
\end{subfigure}%
~
\begin{subfigure}[t]{0.49\linewidth}
  \includegraphics[width=\linewidth]{log_runtime-large_att_num-cost10.png}
  \caption{$|\mathbf{T}|=20$}
\end{subfigure}%
\caption{Scalability in the number of targets or attacker types in the ex ante setting when $|\mathbf{D}|=4$ and the defenders' cost range is $C^d(t) \!\in\! [0, 10]$.}
\label{fig:large_tar_att}
\end{figure}

Finally, we examine the performance in the ex ante case with a large number of targets or attacker types in Figure~\ref{fig:large_tar_att}. Our algorithms can easily scale to about 160 targets, which is a large improvement over previous work \citep{yin2012unified, nguyen2014stop}. We remark that it is typically impossible to test the running time for such complicated security games with more than 200 targets on a single machine (most real-world applications, such as conservation area protection or border protection, also have fewer than 200 targets). For a large number of attacker types, our experiments show that the running time of our algorithm grows linearly in the number of attacker types, which is highly efficient.


\section{Summary}
In this paper, we study information design in a Bayesian security game setting with multiple independent and self-interested defenders. Our results (both theoretical and empirical) show that information design not only significantly improves protection effectiveness but also leads to efficient computation. In particular, to compute an optimal private signaling scheme, we develop an ellipsoid-based algorithm in which the separation oracle can be decomposed into a polynomial number of sub-problems, each of which reduces to a bipartite matching problem. This is a non-trivial task, since the outcomes of private signaling form the set of Bayes correlated equilibria, and computing an optimal correlated equilibrium is a fundamental and well-known intractable problem. Our proof is technical and crucially exploits the special structure of security games. Furthermore, we investigate the \emph{ex ante} private signaling scheme. In this scenario, we develop a novel compact representation for the signaling schemes by compactly characterizing jointly feasible marginals. This finding enables us to significantly reduce the cost of computing the signaling scheme compared to the ellipsoid approach (which is efficient in theory but slow in practice).

\begin{acknowledgements} 
 Haifeng Xu is supported by NSF grant CCF-2132506; this work was done while Xu was at the University of Virginia. Thanh H. Nguyen is supported by ARO grant W911NF-20-1-0344 from the US Army Research Office. 
\end{acknowledgements}
% \newpage
\bibliography{zhou_196}
% \bibliographystyle{plainnat}
\newpage


\end{document}