% \documentclass{uai2025} % for initial submission
\documentclass[accepted]{uai2025} % after acceptance, for a revised version; 
% also before submission to see how the non-anonymous paper would look like 
                        
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2025} % ptmx math instead of Computer
                                         % Modern (has noticeable issues)
% \documentclass[mathfont=newtx]{uai2025} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{subfigure}
\usepackage{multirow}
\usepackage{makecell}
\usepackage{amssymb}
\usepackage{amsthm}

\newtheorem{definition}{Definition}
\newtheorem{corollary}{Corollary}
\newtheorem{theorem}{Theorem}
\newtheorem{proposition}{Proposition}
\newtheorem{lemma}{Lemma}

%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example

\newcommand{\thankstext}{\thanks{Corresponding author.}}
\title{Enhanced Equilibria-Solving via Private Information Pre-Branch Structure in Adversarial Team Games}

% The standard author block has changed for UAI 2025 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1]{Chen~Qiu}
\author[2]{Haobo~Fu}
\author[3,4]{Kai~Li}
\author[1]{Jiajia~Zhang}
\author[1]{Xuan~Wang\thanks{Corresponding author.}}
% Add affiliations after the authors
\affil[1]{%
    School of Computer Science and Technology\\
    Harbin Institute of Technology Shenzhen\\
    China
}
\affil[2]{%
    Tencent AI Lab\\
    China
  }
\affil[3]{%
    Institute of Automation\\
    Chinese Academy of Sciences\\
    China
}
\affil[4]{%
    School of Artificial Intelligence\\
    University of Chinese Academy of Sciences\\
    China
}

  
  \begin{document}
\maketitle

\begin{abstract}
In \emph{ex ante} coordinated adversarial team games (ATGs), a team competes against an adversary, and team members can only coordinate their strategies before the game starts. The team-maxmin equilibrium with correlation (TMECor) is a suitable solution concept for extensive-form sequential ATGs. One class of TMECor-solving methods transforms the problem into finding a Nash equilibrium (NE) in a two-player zero-sum game, leveraging well-established tools for the latter. However, existing methods are fundamentally action-based, resulting in poor generalizability and low solving efficiency due to the exponential growth in the size of the transformed game. To address these issues, we propose an efficient game transformation method based on private information, where all team members are represented by a single coordinator. We design a structure called \emph{private information pre-branch}, which makes decisions considering all possible private information from teammates. We prove that the size of the game transformed by our method is exponentially reduced compared to the current state-of-the-art. Moreover, we demonstrate equilibria equivalence. Experimentally, our method achieves a significant speedup of 182.89$\times$ to 694.44$\times$ in scenarios where the current state-of-the-art method can work, such as small-scale \emph{Kuhn poker} and \emph{Leduc poker}. Furthermore, our method is applicable to larger games and those with dynamically changing private information, such as \emph{Goofspiel}.
\end{abstract}

\section{Introduction}\label{intro}
Games have long served as critical testbeds for exploring how effectively machines can make sophisticated decisions since the early days of computing \citep{DBLP:journals/ai/BardFCBLSPDMHDM20, campbell2002deep, silver2017mastering}. Finding equilibrium in games has become a significant criterion for evaluating the level of artificial intelligence. In the real world, there have been systems that have achieved superhuman performance, such as \emph{AlphaGo} \citep{silver2016mastering}, \emph{Libratus} \citep{brown2018superhuman}, and \emph{DeepStack} \citep{moravvcik2017deepstack}. While many advances \citep{brown2019deep, Zhou2020, https://doi.org/10.1002/int.21857} have been made in 2-player zero-sum (2p0s) games based on Nash equilibrium (NE) \citep{nash1951non} in imperfect information environments, recent research has focused on more complex adversarial team games (ATGs). In ATGs, multiple players with the same utility function form a team to compete against a common adversary \citep{von1997team}. This results in a game where both cooperation and competition coexist.

In this paper, we focus on the ATGs with \emph{ex ante} coordination. More specifically, team members are allowed to coordinate and agree on a common strategy before the game starts. The solution concept for this setting is the team-maxmin equilibrium with correlation (TMECor), which can be thought of as an NE between the team and the adversary in an ATG \citep{Zhang_2021}. TMECor has properties similar to NE, such as exchangeability. However, finding a TMECor is proven to be FNP-hard \citep{hansen2008approximability}. 

Methods for computing TMECor can be roughly divided into four categories. The first involves using linear programming (LP). Hybrid column generation was the first algorithm to compute TMECor in ATGs \citep{celli2018computational}. Its core involves team members adopting joint normal-form strategies, while the adversary uses sequence-form strategies. 
However, it requires solving an integer LP and suffers from exponential growth in the number of joint actions as the game size increases, making it impractical for large-scale games.
The second category involves multi-agent deep reinforcement learning algorithms, such as SIMS \citep{DBLP:conf/atal/CacciamaniCC021}, which learns coordinated strategies from experience. It is limited to games with symmetric observations and requires perfect recall refinement, which is not feasible when team members have private information.
The third approach, based on the team belief directed acyclic graph (TB-DAG) \citep{pmlr-v202-zhang23j}, captures joint beliefs of team members and constructs a DAG for team strategies, providing a more compact representation of the team's strategy space. Unfortunately, applying LP directly to the TB-DAG is inefficient, and subsequent improvements \citep{pmlr-v235-zhang24b, NEURIPS2022_aa5f5e6e} suffer from limited interpretability. 
The last category of methods establishes a connection between ATGs and 2p0s games through game tree transformation. While CFR \citep{zinkevich2007regret} and its variants provide convergence guarantees for finding NE in 2p0s games, no such guarantees exist for multiplayer games \citep{doi:10.1126/science.aay2400}. We can leverage these established algorithms by transforming ATGs into equivalent 2p0s games. In this paper, we focus on the game tree transformation-based approach.


The state-of-the-art game tree transformation-based method, TPICA \citep{carminati2022marriage}, involves the concept of \emph{extensive-form game with visibility}. By introducing the coordinator into an ATG, they convert the task of finding a TMECor into finding an NE in a 2p0s game. However, TPICA suffers from low solving efficiency and limited types of solvable games due to its reliance on action-based transformation. In this method, the coordinator extracts an action from each distinguishable state in the original game to form different recommendations for the team players. Then, a specific action is designated from each recommendation as an available action for the coordinator. To analyze the game size complexity, we consider a setting where the opponent plays first, followed by team members in sequence. We assume that every player has the same number of available actions in every state. Since modifications are made only to team player nodes, we refer to the phase from any specified team player node, which acts first, to all possible next opponent (or terminal) nodes as an episode. For any episode, the size of the transformed game tree is $\mathcal{O}\big((\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}\big)$, where $\lvert A\rvert$, $\lvert \Omega\rvert$, $\lvert \mathcal{T}\rvert$ denote the number of available actions, the amount of private information, and the number of team players, respectively. Therefore, the size of the transformed game tree grows significantly with the increase in the number of available actions, team players, and private information. In particular, the size growth triggered by a single coordinator node is exponential. Additionally, TPICA cannot be applied to games where players' private information changes.



To address the above issues, we propose a multi-player transformation algorithm (MPTA) based on private information. To mitigate the exponential growth caused by a single coordinator, we designed a new structure called \emph{private information pre-branch} (PIPB), consisting of coordinator and dummy player nodes. Specifically, PIPB allows dummy players to provide the coordinator with all possible private information from teammates. Since the amount of potentially private information in an ATG is fixed, this structure significantly reduces the size of the transformed game tree compared to the previous state-of-the-art method. This leads to a substantial improvement in the efficiency of equilibrium computation. Furthermore, we demonstrate the equilibrium equivalence before and after the transformation. The private information-based transformation makes our method suitable for games with dynamically changing private information (e.g., \emph{Goofspiel}), expanding the types of solvable games. We show the superior performance of our method through extensive experiments in different game scenarios. Experimental results show that our method computes strategies closer to TMECor compared to the baseline algorithm in the same runtime and significantly reduces runtime within the same number of iterations.

Our contributions can be summarized as follows:
\begin{itemize}
        \item We propose MPTA, a transformation based on private information that significantly improves equilibria-solving efficiency. For any episode, the size of the transformed game decreases from $\mathcal{O}\big((\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}\big)$ for the previous state-of-the-art to $\mathcal{O}\big((\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert)^{\lvert \mathcal{T}\rvert}\big)$, where $\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}$ is the number of ways to arrange $\lvert \mathcal{T}\rvert-1$ elements chosen from $\lvert \Omega\rvert-1$ pieces of private information. Additionally, we demonstrate the equilibria equivalence between TMECor in the original game and NE in the transformed game.
        
        \item  The PIPB structure enhances the generalization capability of our method, allowing it to adapt to a wider variety of games. In particular, it enables our method to effectively address ATGs where the private information available to each player can change dynamically throughout the game.
        % It allows our method to be applied to ATGs where players' private information dynamically changes.
        
        \item We conducted experiments on 14 different configurations across three standard testbeds. The results show a significant improvement in solving efficiency using our method, achieving speedups ranging from 182.89$\times$ to 694.44$\times$ compared to the baseline. We also compared the sizes of the transformed game trees, showing that our method results in much smaller game trees than the baseline. Furthermore, we experimented with larger-scale games and other types of games that were not supported by the baseline.
    \end{itemize}
    All proofs in this paper can be found in Appendix~\ref{appendixA}.

\section{Preliminaries}\label{preli}
\subsection{Extensive-Form Games and Nash Equilibrium}
An extensive-form game (EFG) is the tree-form model of imperfect-information games with sequential interactions \citep{kuhn1950extensive, brown2017safe, https://doi.org/10.1002/int.22450}. The set $A=\cup_{i \in N} A_i$ denotes all the possible actions, where $A_i$ represents the set of available actions of player $i$. $\lvert A\rvert$ denotes the number of actions available to each player. $H$ is the set of nodes, and $h\in H$ represents the sequence of all actions from the root to node $h$. $ha\sqsubseteq h^{\prime}$ denotes that $h^{\prime}$ is reached from $h$ by playing action $a$. The set $Z \subseteq H$ contains all the terminal nodes. For each decision node $h \in H$, the function $\mathcal{A}(h)$ returns all available actions at node $h$. $\omega_{i}\in \Omega$ denotes player $i$'s private information (e.g., a card in a poker game), and $\lvert \Omega\rvert$ represents the amount of private information in a game. The function $P(h)$ returns the player who takes an action at node $h$. The utility function $u_{i}$ maps each terminal node $z \in Z$ to player $i$'s payoff in $\mathbb{R}$. An information set (infoset) $I_i$ represents imperfect information for player $i$; that is, any two nodes $h, h^{\prime}\in I_i$ are indistinguishable to $i$. The set of infosets for player $i$ is denoted by $\mathcal{I}_{i}$, and the set of all infosets is represented as $\mathcal{I}=\cup_{i\in N}\mathcal{I}_{i}$.

There are two fundamental paradigms for strategy representation \citep{carminati2022marriage,https://doi.org/10.1002/int.21950}. A behavioral strategy $\sigma_{i}$ of player $i\in N$ is a function that assigns a distribution over all the available actions $\mathcal{A}\left(I_i\right)$ to each $I_i$. Another strategy representation is based on the normal-form plan (also referred to as the pure strategy), $\pi_{i} \in \Pi_i = \times_{I\in \mathcal{I}_{i}} \mathcal{A}(I)$, which is a tuple that specifies one action for each of player $i$'s infosets. A normal-form strategy is a probability distribution over the normal-form plans of a player. A reduced normal-form strategy (a.k.a. mixed strategy) $\mu_{i}\in \Delta\left(\Pi_i\right)$ is obtained from a normal-form strategy by consolidating plans that differ only in actions taken at unreachable nodes. Henceforth, we focus on reduced normal-form strategies in this paper. For any player $i\in N$, $\mu_{i}[z]$ (or $\sigma_{i}[z]$) denotes the probability of reaching terminal node $z\in Z$ when $i$ follows strategy $\mu_{i}$ (or $\sigma_{i}$). We represent behavioral strategy profiles with $\boldsymbol{\sigma}$ and normal-form strategy profiles with $\boldsymbol{\mu}$. 
We define $\boldsymbol{\sigma}_{-i}$ as the strategies of all players except $i$. The expected payoff for player $i$ when he plays strategy $\sigma_{i}$ and all the other players follow strategy $\boldsymbol{\sigma}_{-i}$ is denoted by $u_{i}(\sigma_{i},\boldsymbol{\sigma}_{-i})$. Player $i$'s \emph{best response} to strategy $\boldsymbol{\sigma}_{-i}$, denoted as $BR_{i}(\boldsymbol{\sigma}_{-i})$, is a strategy that maximizes player $i$'s payoff against $\boldsymbol{\sigma}_{-i}$. Formally, $u_{i}(BR_{i}(\boldsymbol{\sigma}_{-i}), \boldsymbol{\sigma}_{-i})=\max_{\sigma_{i}^{\prime}}u_{i}(\sigma_{i}^{\prime},\boldsymbol{\sigma}_{-i})$. NE is a significant solution concept in 2p0s games in which no player can unilaterally change his strategy to obtain more payoff. It captures the idea of stability under rational behavior, where no player has an incentive to deviate from the current strategy. An NE $\boldsymbol{\sigma}$ is a strategy profile where all players play a \emph{best response}. Formally, $\boldsymbol{\sigma}$ is an NE if and only if $\forall i\in N, \sigma_{i}\in BR_{i}(\boldsymbol{\sigma}_{-i})$. The \emph{exploitability} $e(\sigma_i)$ of strategy $\sigma_i$ serves as our measurement metric, which measures how much worse $\sigma_i$ does versus $BR_{-i}(\sigma_i)$ (i.e., the \emph{best response} of all other players to $\sigma_{i}$) compared to how an equilibrium strategy $\sigma_i^{*}$ does against $BR_{-i}(\sigma_i^{*})$. This metric provides a quantitative measure of deviation from equilibrium and is widely used in evaluating strategy quality in extensive-form games.
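
The verbal definition of exploitability above can be written out explicitly. The following is a sketch of one common form, consistent with the description in the text and not necessarily the exact expression used in the experiments; $\sigma_i^{*}$ denotes an equilibrium strategy of player $i$:
\begin{equation*}
e(\sigma_i) = u_{i}\big(\sigma_i^{*}, BR_{-i}(\sigma_i^{*})\big) - u_{i}\big(\sigma_i, BR_{-i}(\sigma_i)\big).
\end{equation*}
Under this form, in a 2p0s game the first term equals the value of the game for player $i$, so $e(\sigma_i)\geq 0$, with $e(\sigma_i)=0$ exactly when $\sigma_i$ is itself an equilibrium strategy.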


\subsection{Adversarial Team Games and Team-Maxmin Equilibrium with Correlation}

An ATG is an EFG with a set of players $N$, where a team of players competes against an opponent. That is, $N=\mathcal{T}\cup \left\{o\right\}\cup \left\{c\right\}$, where $\mathcal{T}$ represents a team, and $o$ is an opponent. The chance player $c$ simulates exogenous randomness in the game, such as dealing a card from a deck. $\lvert \mathcal{T}\rvert$ denotes the number of team members. The team players share payoffs in ATGs. Formally, $\forall i,j\in \mathcal{T}, u_{i}(z)=u_{j}(z)$. Following the convention of the relevant literature \citep{celli2018computational, Zhang_2021, carminati2022marriage}, we assume \emph{perfect recall}, which means that each player remembers, at every infoset, the information acquired in earlier stages of the game.

In this work, we focus on the \emph{ex ante} coordinated setting, where TMECor is a significant solution concept. Specifically, a TMECor is an NE that maximizes the team's payoff when team players are allowed to correlate their strategies and agree on tactics before the game begins. A TMECor can be found via a bi-level optimization program formulated over the normal-form strategy profile of team members:
\begin{equation} \label{equ:TMECor}
\begin{split}
\max_{\mu_{\mathcal{T}}}  \min _{\mu_{o}} & \sum_{z \in Z} \mu_{\mathcal{T}}[z]\mu_{o}[z]u_{\mathcal{T}}(z) \\  \text{s.t.} \hspace{.5cm} & \mu_{\mathcal{T}} \in \Delta(\times_{i\in\mathcal{T}}\Pi_i) \\ &  \mu_{o} \in \Delta\left(\Pi_{o}\right)
\end{split}
\end{equation}


\begin{figure*}[t]
\centering
\includegraphics[width=1\textwidth]{TransformedMPTA.pdf}
\caption{Example of game transformation. ``\textbf{\dots}'' indicates omitted branches. The nodes of a player with the same number are in the same infoset. \textbf{Left:} Original ATG omitting the opponent nodes. \textbf{Right:} Result of transforming the game on the left using MPTA.}
\label{fig:TransformedMPTA}
\end{figure*}

\subsection{Team-Public-Information Representation for Extensive-Form Games}
Since this subsection involves some additional concepts, we provide a detailed example in Appendix~\ref{appendixB2} for a clearer explanation.
An action $a$ is classified as \emph{observable} or \emph{unobservable} depending on whether it can be seen by player $i$ when played by another player. If the actions observable by player $i$ at any pair of nodes are the same, these nodes belong to the same infoset. When all infosets are induced in this way, the game is an extensive-form game with visibility (vEFG), where every player has perfect recall. Extending action visibility to a set of players $\mathcal{P}$ (e.g., a team), an action $a$ is called \emph{public} if it is observable by all players in $\mathcal{P}$; \emph{private} if it can be observed by only some player(s) in $\mathcal{P}$; and \emph{hidden} if it is not observable by any player in $\mathcal{P}$ (in this case, $a$ is played by a player not belonging to $\mathcal{P}$). A public infoset for a set of players $\mathcal{P}$ is defined as a public state $S_{\mathcal{P}}\subset H$, where any two nodes of potentially different players in $\mathcal{P}$ belong to the same public state if the actions that are \emph{public} for $\mathcal{P}$ are the same at these nodes. Clearly, if one node of an infoset $I$ belongs to $S_{\mathcal{P}}$, then all the other nodes of $I$ also belong to $S_{\mathcal{P}}$. Let $\mathcal{S}$ denote the set of all public states. $\mathcal{S}_{\mathcal{P}}(h)$ represents the set of all infosets of players in $\mathcal{P}$ that are in the same public state at node $h$.



Similarly, this paper focuses on public-turn-taking games, where every player knows, at every infoset he plays, the sequence of actions taken by players from the root to that infoset. This indicates that the public states have a specific structure and consist of nodes with histories of the same length for a single player. \citet{carminati2022marriage} proved that any vEFG with perfect recall and timeability has a strategy-equivalent public-turn-taking vEFG. Completely inflated games refer to situations where every team player knows the exact action taken by another team player at any infoset. This can be achieved for a generic vEFG by modifying the visibility of team players' actions, allowing explicit representation of strategy sharing among teammates before the game starts. In the following, we focus on completely inflated vEFGs for the team.

In the game tree transformed by TPICA, every coordinator who represents the team $\mathcal{T}$ plays a \emph{prescription} among all the combinations of possible actions for each information state $I$ belonging to the public team state. In other words, for a public team state $S$, the coordinator issues different recommendations to players for every possible information set associated with $S$. Then, dummy players are used to extract a specific action for each infoset from prescriptions and pass it to the next player. Whenever it is the coordinator's turn to play, all the possible action combinations are listed. Therefore, during the transformation, every added dummy player node corresponds to another node. For any episode, the size of the game tree transformed by TPICA is $\mathcal{O}\big((\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}\big)$.
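
As a rough illustration of this bound (a sketch that reads the $\mathcal{O}(\cdot)$ expression as a count and ignores constant factors), consider a three-card deck $\{J,Q,K\}$ with $\lvert \Omega\rvert=3$ and a team of $\lvert \mathcal{T}\rvert=2$ players, and assume, purely for illustration, $\lvert A\rvert=2$ available actions per state. A prescription assigns one action to each possible piece of private information, so each coordinator node offers $\lvert A\rvert^{\lvert \Omega\rvert}$ prescriptions, and an episode spans $\lvert \mathcal{T}\rvert$ such nodes:
\begin{equation*}
\lvert A\rvert^{\lvert \Omega\rvert}=2^{3}=8, \qquad \big(\lvert A\rvert^{\lvert \Omega\rvert}\big)^{\lvert \mathcal{T}\rvert}=8^{2}=64 .
\end{equation*}
Adding a single card to the deck ($\lvert \Omega\rvert=4$) already raises the count to $(2^{4})^{2}=256$, which shows how quickly the action-based transformation grows with the amount of private information.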

\section{Method}\label{method}
In this section, we provide a comprehensive introduction to our method. We begin by designing a new structure called \emph{private information pre-branch} (PIPB), which provides the coordinator with all possible private information from teammates. This effectively reduces the size of the transformed game since the amount of potentially private information is fixed in a game. Then, we leverage this structure to propose a multi-player transformation algorithm (MPTA) based on private information. It effectively transforms an ATG into a strategy-equivalent 2p0s game and expands the types of solvable games. Additionally, we provide a proof of equilibrium equivalence. Finally, we demonstrate that the size of the game tree transformed by our method is smaller than that of previous state-of-the-art algorithms. 

\subsection{The Structure of \emph{Private Information Pre-Branch}}

In the game transformation process, we introduce two types of players: the coordinator, representing the team, and the dummy player, who conveys teammates' possible private information to the coordinator. We consider the general case where players' private information is interdependent. For example, when a team member is dealt $J$ from a deck of three cards $J,Q,K$, he can safely infer that his teammates' private information cannot include $J$. We define the PIPB structure as follows.
\begin{definition}\label{def:PIPB}
    Given a completely inflated vEFG $\mathcal{G}$ that satisfies the public-turn-taking property, let $h_{d}$ denote a dummy player node and $H_{t} \subseteq H$ denote a set of nodes of the same team member. $h_{d}$ and $H_{t}$ form a PIPB iff $\forall h\in H_{t}, \forall \omega\in \Omega\setminus \{\omega_{i}\}: h_{d}\omega\sqsubseteq h$, where $i= P(h)$.
\end{definition}

Intuitively, all team member nodes in a PIPB are connected to a single dummy player node, which serves as their common parent node. The dummy player's available actions represent all possible private information of the teammates. These actions are \emph{unobservable} to all players except those in the next layer. Here, the layer refers to the order of player decisions in the transformed game, which can be visualized in the game tree as hierarchical levels. In other words, only players who act immediately after the dummy player (i.e., those in the next layer) can observe the dummy player's actions, whereas all other players cannot. After the transformation, all team players are replaced by the coordinator. While the visibility of the dummy player's actions remains unchanged, the coordinator's public states will change.




\subsection{Multi-Player Transformation Algorithm}
We propose a multi-player transformation algorithm that utilizes the PIPB structure to transform an ATG into a 2p0s game. This method achieves the equivalent transformation by using potentially private information of teammates. The pseudocode is provided in Algorithm~\ref{algo:MPTA}. Our method relies on the tree-form structure, so we first construct a complete game tree for the original ATG and then traverse it in a depth-first pre-order manner. 

To illustrate more clearly, we provide an example of an ATG, ignoring the opponent, transformed via our method, as shown in Figure~\ref{fig:TransformedMPTA}. The result of the conversion by TPICA can be found in Appendix~\ref{appendixB1}. In this example, the chance player deals one card to each player from a deck of three cards ($J,Q,K$) as their private information, and team members' actions are \emph{public} (the opponent's actions are also \emph{public} if he is not ignored). The actions of the chance player are \emph{private} to team members, as each team player knows only their own card. This means every team member node in the original game tree is a separate infoset. As shown in the left part of Figure~\ref{fig:TransformedMPTA}, the set of all private information $\Omega$ is $\{J,Q,K\}$. When traversing to a team member node, a dummy player node is first added. This dummy player's actions represent all possible private information of the teammates, i.e., $\{Q,K\}$ or $\{J,K\}$. The private information of team members cannot be passed along since it is not publicly observable. The dummy player's actions are based on the private information of the specific team member, so they also cannot be passed along and can only be observed by team members at the next level. Then, we introduce coordinator nodes to make decisions in place of the team member nodes. Terminal nodes are copied according to the sequence of publicly observable actions in the original game. In particular, the payoff of a coordinator is the sum of all team members' payoffs.

Compared to TPICA, the PIPB structure in our method reduces the number of infosets in the transformed game. Coordinator nodes that share the same dummy player node belong to the same information set. Furthermore, the partition of the coordinator nodes' infosets is also based on actions observable to all team members (i.e., actions called \emph{public}). We divide the coordinator's public infosets according to the concept of public state, as shown in the right part of Figure~\ref{fig:TransformedMPTA}, where nodes of a player with the same number belong to the same public infoset. Moreover, since our method also models adversarial team games using vEFGs, as TPICA does, it remains compatible with abstraction and pruning techniques. Theorem~\ref{theorem:growth} further shows that the game tree transformed by our method is theoretically smaller in size than that converted by TPICA.
\begin{theorem}\label{theorem:growth}
    Given an ATG $G$ with visibility that satisfies the public-turn-taking property, let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. Then the size of any episode in $G^{\prime}$ is $\mathcal{O}\big((\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert)^{\lvert \mathcal{T}\rvert}\big)$.
\end{theorem}
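
To give a feel for the gap between the two bounds, the following worked instance reads both $\mathcal{O}(\cdot)$ expressions as counts and ignores constant factors; it is only an illustrative sketch, and the value $\lvert A\rvert=2$ is an assumption made for the example. For the three-card deck $\Omega=\{J,Q,K\}$ with $\lvert \mathcal{T}\rvert=2$:
\begin{equation*}
\Big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert\Big)^{\lvert \mathcal{T}\rvert}
=\Big(\frac{2!}{1!}\cdot 2\Big)^{2}=16,
\qquad
\big(\lvert A\rvert^{\lvert \Omega\rvert}\big)^{\lvert \mathcal{T}\rvert}=(2^{3})^{2}=64 .
\end{equation*}
For a hypothetical larger instance with $\lvert \Omega\rvert=6$, $\lvert A\rvert=3$, and $\lvert \mathcal{T}\rvert=2$, the same expressions give
\begin{equation*}
\Big(\frac{5!}{4!}\cdot 3\Big)^{2}=15^{2}=225
\qquad\text{versus}\qquad
(3^{6})^{2}=729^{2}=531441 .
\end{equation*}
The factor contributed by private information is polynomial in $\lvert \Omega\rvert$ for a fixed team size in the first expression, whereas $\lvert \Omega\rvert$ appears in the exponent in the second.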

\begin{algorithm}[tb]
\caption{Multi-Player Transformation Algorithm}
\label{algo:MPTA}
\begin{algorithmic}[1] %[1] enables line numbers
\STATE \textbf{Function} \emph{MPTA}($G$)
% \COMMENT{$G=(N, A, H, Z, \mathcal{A}, P, u, I)$ is original ATG}  
\STATE initialize $G^{\prime}$ 
% \COMMENT{ $\mathcal{G}$ is a new 2-player game}
\STATE $N \gets \mathcal{T}\cup \left\{o\right\}\cup \left\{c\right\}$
\STATE $N^{\prime} \gets \{t\} \cup \{o\} \cup \{c\}$
\STATE initialize $h$ with the chance player node
\STATE $h^{\prime} \gets$ \emph{ProcOfTrans}$(G, G^{\prime}, h)$ 
% \COMMENT{$origNode$ is a node in the original game tree $G$}
\RETURN{$G^{\prime}$}

\STATE \textbf{Function} \emph{ProcOfTrans}($G, G^{\prime}, h$)
\STATE $\Omega \gets$ Private$(G)$  
% \COMMENT{$Private()$ returns the number of different private information}
\IF{$P(h) = c$}
\STATE $h^{\prime} \gets h$
\STATE $\mathcal{A}^{\prime}(h^{\prime}) \gets \mathcal{A}(h)$
% \STATE \emph{ProcOfTrans}($G, G^{\prime}, ha^{\prime}$) \quad $\forall a\in \mathcal{A}^{\prime}(h^{\prime})$
\FOR{$a^{\prime}\in \mathcal{A}^{\prime}(h^{\prime})$}
\STATE \emph{ProcOfTrans}($G, G^{\prime}, ha^{\prime}$)
\ENDFOR
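\ELSIF{$P(h)=o$}
\STATE $h^{\prime} \gets h$
\STATE $\mathcal{A}^{\prime}(h^{\prime}) \gets \mathcal{A}(h)$
\FOR{$a^{\prime}\in \mathcal{A}^{\prime}(h^{\prime})$}
\STATE \emph{ProcOfTrans}($G, G^{\prime}, ha^{\prime}$)
\ENDFOR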
\ELSIF{$P(h)\in {\mathcal{T}}$}
\STATE add a dummy player node $h_d$ as $h$'s parent node
\STATE $\mathcal{A}^{\prime}(h_d) \gets \Omega\setminus \{\omega_{P(h)}\}$
\FOR{$a^{\prime} \in \mathcal{A}^{\prime}(h_d)$}
\STATE $h^{\prime}\gets h_{d}a^{\prime}$
\FOR{$a\in \mathcal{A}(h)$}
\STATE \emph{ProcOfTrans}($G,G^{\prime},h^{\prime}a$)
\ENDFOR
\ENDFOR
\ELSE
\STATE $z^{\prime} \gets h$
\STATE $u_{t}^{\prime}(z^{\prime}) \gets \sum_{i\in\mathcal{T}}u_{i}(h)$
\STATE $u_{o}^{\prime}(z^{\prime}) \gets -u_{t}^{\prime}(z^{\prime})$
\ENDIF
\RETURN $h^{\prime}$
\end{algorithmic}
\end{algorithm}


% Let the game transformed by TPICA be denoted as $\hat{G}$. In $\hat{G}$, each team member node, except for the team member node who first acts, corresponds to an additional dummy player node. We recognize that, according to the \emph{prescription} property in TPICA, every coordinator node except for the first one will be matched with a dummy player node. By applying a derivation process similar to Theorem 1, the size of any episode in $\hat{G}$ is $2\sum_{n=1}^{\lvert \mathcal{T}\rvert}(\lvert A\rvert^{\lvert \Omega\rvert})^{n}$. Using the geometric series sum formula, the above equation is equal to $2\lvert A\rvert^{\lvert \Omega\rvert}\frac{(\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}-1}{\lvert A\rvert^{\lvert \Omega\rvert}-1}$ (i.e., $\mathcal{O}\big((\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}\big)$). Then, we compare the bases of the two results (i.e., $\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert$ and $\lvert A\rvert^{\lvert \Omega\rvert}$). Clearly, the exponential growth rate of the latter far exceeds the growth rate of the former.



\subsection{Equilibrium Equivalence}
As an important theoretical guarantee for this work, we prove that a TMECor of the original game can be obtained by solving for an NE of the transformed game. We call two strategies realization-equivalent if, for every strategy of the other players, they induce the same probabilities of reaching each node. In other words, the NE strategies in the transformed game and the TMECor strategies in the original game are realization-equivalent. We state Lemma~\ref{lemma:strategy} on strategy equivalence as follows.

\setlength{\tabcolsep}{1mm}
\begin{table*}[ht]
  \centering
  \caption{Node counts and running times of TPICA and our method (MPTA) on game instances of different types and sizes. Blank cells indicate that the experiment could not be conducted.}
    \begin{tabular}{lrrrrrrrrrr}
    \toprule
    \multicolumn{1}{c}{\multirow{2}[2]{*}{\makecell[c]{Game \\ instances}}} & \multicolumn{3}{c}{Total nodes} & \multicolumn{2}{c}{Team nodes} & \multicolumn{2}{c}{Adversary nodes} & \multicolumn{2}{c}{Runtime} & \multicolumn{1}{c}{\multirow{2}[2]{*}{Improvement}} \\
         \multicolumn{1}{c|}{} & \multicolumn{1}{c}{Original} & \multicolumn{1}{c}{TPICA} & \multicolumn{1}{c|}{\textbf{MPTA}} & \multicolumn{1}{c}{TPICA} & \multicolumn{1}{c|}{\textbf{MPTA}} & \multicolumn{1}{c}{TPICA} & \multicolumn{1}{c|}{\textbf{MPTA}} & \multicolumn{1}{c}{TPICA} & \multicolumn{1}{c}{\textbf{MPTA}} &  \\
    \midrule
    12\textbf{K}3  & 151   & 5,395 & 583   & 300   & 144   & 294   & 72    & 139s  & \textbf{0.76s} & \textbf{182.89$\times$} \\
    12\textbf{K}4  & 601   & 1,337,051 & 3,097 & 3,888 & 768   & 4,632 & 384   & 1,560s & \textbf{9.26s} & \textbf{168.47$\times$} \\
    12\textbf{K}6  & 3,001 & 34,191,721 & 23,161 & 261,360 & 5,760 & 368,760 & 2,880 & $>$27h  & \textbf{144s} & \textbf{694.44$\times$} \\
    13\textbf{K}6  & 23,401 &       & 271,441 &       & 75,240 &       & 22,680 &       & \textbf{562s} &  \\
    13\textbf{K}8  & 109,201 &       & 1,713,601 &       & 475,440 &       & 142,800 &       & \textbf{5,093s} &  \\
    14\textbf{K}6  & 115,921 &       & 1,796,401 &       & 528,480 &       & 105,120 &       & \textbf{3,051s} &  \\
    12\textbf{L}33 & 13,183 & 10,777,963 & 57,799 & 614,172 & 14,664 & 475,566 & 6,864 & 56,156s & \textbf{240s} & \textbf{233.98$\times$} \\
    12\textbf{L}43 & 42,589 &       & 251,749 &       & 64,008 &       & 29,736 &       & \textbf{3,006s} &  \\
    12\textbf{L}63 & 218,011 &       & 1,954,351 &       & 497,940 &       & 229,620 &       & \textbf{9,024s} &  \\
    13\textbf{L}33 & 161,491 &       & 948,151 &       & 262,500 &       & 80,220 &       & \textbf{4,014s} &  \\
    13\textbf{L}43 & 738,241 &       & 5,994,241 &       & 1,661,760 &       & 504,000 &       & \textbf{137,817s} &  \\
    14\textbf{L}33 & 1,673,311 &       & 12,226,231 &       & 3,535,320 &       & 809,880 &       & \textbf{143,475s} &  \\
    12\textbf{G}   & 2,509 &       & 92,581 &       & 29,700 &       & 1,464 &       & \textbf{241s} &  \\
    13\textbf{G}   & 15,307 &       & 3,352,669 &       & 1,107,162 &       & 13,128 &       & \textbf{50,107s} &  \\
    \bottomrule
    \end{tabular}
  \label{tab_res}
\end{table*}

\begin{lemma}\label{lemma:strategy}
    Let $G$ be an ATG with visibility that satisfies the public-turn-taking property, and let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. Then any joint reduced pure strategy $\pi_{\mathcal{T}}$ in $G$ can be mapped to a corresponding strategy $\pi_{t}$ in $G^{\prime}$, and vice versa.
\end{lemma}

% \begin{proof}
%     We can prove Lemma~\ref{lemma:strategy} by recursively traversing $G$ and $G^{\prime}$ in a depth-first pre-order manner.
%     \paragraph{Case 1: from $\pi_{\mathcal{T}}$ to $\pi_{t}$.} Let $h$ and $h^{\prime}$ denote the current nodes reached in $G$ and $G^{\prime}$, respectively. $G$ satisfies the public-turn-taking, ensuring $h$ and $h^{\prime}$ either represent the same player or are both terminal nodes. Initializing $h$ with the chance player node. $\Omega$ is a set of all private information in $G$. When constructing $\pi_{t}$:


%     \begin{itemize}
%         \item[1)] For the chance node: $\pi_{c}=\pi_{c^{\prime}}$ always holds as our algorithm does not modify the chance node. This ensures that the actions specified by the chance player's strategy in the original game are the same as those in the transformed game.
        
%         \item[2)] For opponent nodes: The proof is identical to 1).
        
%         \item[3)] For team member nodes: Let $\pi_{\mathcal{T}}[I(h)]$ represent the joint reduced pure strategy of $\pi_{\mathcal{T}}$ at infoset $I(h)$. During the traversal, our algorithm expands $h$ based on all the possibly private information of teammates into $\lvert \Omega\rvert-1$ nodes. These nodes belong to the same infoset, denoted as $I^{\prime}$. Let $\pi_{t}[I^{\prime}]$ represent the reduced pure strategy of $\pi_{t}$ at $I^{\prime}$, where $I^{\prime}\in S_{t}(h)$. When $t$ is in a public state, $\pi_{\mathcal{T}}[I(h)]=\pi_{t}(I^{\prime})$.

%         \item[4)] For terminal nodes: When reaching the terminal node through the above process, our algorithm ensures the following holds: $u_t^{\prime}=\sum_{i\in \mathcal{T}}u_i(h)$ and $u_{o}^{\prime}(h^{\prime}) =u_o(h)=-\sum_{i\in \mathcal{T}}u_i(h)$.
%     \end{itemize}
%     \paragraph{Case 2: from $\pi_{t}$ to $\pi_{\mathcal{T}}$.} The proof follows the same points as the previous case.
% \end{proof}


Leveraging Lemma~\ref{lemma:strategy}, we present Theorem~\ref{theorem:payoff}, which demonstrates the equivalence of payoffs.



\begin{theorem} \label{theorem:payoff}
    Let $G$ be a public-turn-taking ATG with visibility, and let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. Then players following corresponding strategy profiles in $G$ and $G^{\prime}$ obtain the same payoffs.
\end{theorem}

% \begin{proof}
%     The proof directly relies on Lemma~\ref{lemma:strategy}. Specifically, for any strategy $\pi_{\mathcal{T}}$ in $G$, we can find a corresponding strategy $\pi_{t}$ in $G^{\prime}$ that yields the same payoff. Similarly, for any strategy $\pi_{t}$ in $G^{\prime}$, there exists a payoff-equivalent strategy $\pi_{\mathcal{T}}$ in $G$. This ensures that the payoff for the players remains unchanged whether they choose $\pi_{t}$ in $G^{\prime}$ or $\pi_{\mathcal{T}}$ in $G$.
% \end{proof}


For brevity, we use the notation `$\mapsto$' to denote the strategy mapping relationship. If we transform each pure plan and sum their probability masses, we obtain the corresponding mixed strategy. Formally, for any $\mu_{\mathcal{T}}$, the corresponding mixed strategy $\mu_{t}$ assigns to each $\pi_{t}$ the probability $\mu_{t}(\pi_{t})=\sum_{\pi_{\mathcal{T}}:\pi_{t} \mapsto \pi_{\mathcal{T}}}\mu_{\mathcal{T}}(\pi_{\mathcal{T}})$. Therefore, Lemma~\ref{lemma:strategy} also applies to mixed strategies. Specifically, any $\mu_{\mathcal{T}}$ in $G$ can be mapped to $\mu_{t}$ in $G^{\prime}$, and vice versa.

\begin{theorem} \label{theorem:equilibrium}
    Let $G$ be an ATG with visibility that satisfies the public-turn-taking property, and let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. If $(\mu_{t}^{*},\mu_{o}^{*})$ is an NE in $G^{\prime}$, then the strategy profile $(\mu_{\mathcal{T}}^{*},\mu_{o}^{*})$ with $\mu_{t}^{*}\mapsto \mu_{\mathcal{T}}^{*}$ is a TMECor in $G$.
\end{theorem}

% \begin{proof}
%     For brevity, let $u_{t}$ and $u_{\mathcal{T}}$ represent $u_{t}(\pi_{c},\pi_{o},\pi_{t})$ and $u_{\mathcal{T}}(\pi_c, \pi_o,\pi_{\mathcal{T}})$, respectively. If $\mu_{t}^{*}$ is an NE in $G^{\prime}$, then by the definition of NE, the following holds:
%     \begin{equation*}
%         \mu_{t}^{*} \in \arg\max_{\mu_{t}} \min_{\mu_{o}} \sum_{\substack{\pi_c \in \Pi_c \\ \pi_o\in \Pi_o\\ \pi_t \in \Pi_t}}\mu_{c}(\pi_{c})\mu_{o}(\pi_{o})\mu_{t}(\pi_{t})u_{t}.
%     \end{equation*}
%     If $\mu_{\mathcal{T}}^{*}$ is a TMECor, it satisfies:
%     \begin{equation*}
%         \mu_{\mathcal{T}}^{*} \in \arg\max_{\mu_{\mathcal{T}}}\min_{\mu_{o}}\sum_{\substack{\pi_{c}\in \Pi_{c} \\ \pi_{o}\in \Pi_{o} \\ \pi_{\mathcal{T} \in \Pi_{\mathcal{T} }}}} \mu_{c}(\pi_{c}) \mu_{o}(\pi_{o}) \mu_{\mathcal{T}}(\pi_{\mathcal{T}}) u_{\mathcal{T}}.
%     \end{equation*}
%     Let $\min_{TMECor}(\mu_{\mathcal{T}})$ and $\min_{NE}(\mu_t)$ denote the inner minimization problems for TMECor and NE, respectively.

%     Assume there exists a strategy $\mu_{\mathcal{T}}^{\prime}$ such that its value under the definition of TMECor is greater than that of $\mu_{\mathcal{T}}^{*}$. It means that $\min_{TMECor}(\mu_{\mathcal{T}}^{\prime}) > \min_{TMECor}(\mu_{\mathcal{T}}^{*})$.
%     According to Lemma~\ref{lemma:strategy}, there exists a strategy $\mu_{t}^{\prime}$ such that $\mu_{\mathcal{T}}^{\prime}\mapsto \mu_{t}^{\prime}$. From Theorem~\ref{theorem:payoff}, we then have:
%     \begin{equation*}
%         \min_{TMECor}(\mu_{\mathcal{T}}^{\prime})=\min_{NE}(\mu_{t}^{\prime}) > \min_{NE}(\mu_{t}^{*}).
%     \end{equation*}
%     This results in a contradiction, as it implies that $\mu_{t}^{*}$ is not an NE in $G^{\prime}$. Therefore, $\mu_{\mathcal{T}}^{*}$ must be a TMECor in $G$.
% \end{proof}

\section{Experimental Evaluation}\label{experiment}

\subsection{Experimental Setting}
\label{Exp_set}

We conduct experiments on the standard testbed for ATGs. More specifically, we use multi-player parametric versions of three games: \emph{Kuhn poker} \citep{farina2018ex, kuhn1950simplified}, \emph{Leduc poker} \citep{farina2018ex, DBLP:conf/uai/SoutheyBLPBBR05} and \emph{Goofspiel} \citep{farina2021connecting, ross1971goofspiel}, as they are commonly used for experimental evaluation \citep{farina2021connecting}. Unlike the other two games, \emph{Goofspiel} involves changes in the amount of players' private information during the game. The number of players in these games is parameterized for flexibility. The specific rules for these games are provided in Appendix~\ref{appendixC}.

We denote the number of opponents by $m$ and the number of team members by $n$. For brevity, we use the following symbols to describe the parameters of the experiments:
\begin{itemize}
    \item \textbf{$mn$K$r$}: \emph{Kuhn poker} with $r$ ranks;
    \item \textbf{$mn$L$rc$}: \emph{Leduc poker} with $r$ ranks and $c$ indistinguishable suits. The default maximum number of bets allowed per betting round is $1$;
    \item \textbf{$mn$G}: \emph{Goofspiel} with three ranks.
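\end{itemize}
For example, 12\textbf{K}3 denotes \emph{Kuhn poker} with one opponent, two team members, and three ranks.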





In this work, we adopt as our baseline the state-of-the-art method that can be combined with 2-player game algorithms. Specifically, we use the TPICA proposed by \citet{carminati2022marriage} as the benchmark for comparison. Since TPICA is not open-source, we reproduced it based on the descriptions and details provided in their paper. Note that in our experimental setup, the opponent acts first. To ensure a rigorous and fair comparison, we employ \emph{counterfactual regret minimization plus} (CFR+) \citep{Tammelin2015cfrplus}, a well-established algorithm for finding NE in 2p0s games, in both our method and the baseline. All experiments are run on a machine with an 18-core 2.7\,GHz CPU and 250\,GB of memory.



\subsection{Experimental Results}
\label{Exp_res}

\begin{figure*}[ht]
\centering
\subfigure[12\textbf{K}3]{
\label{res2_1}
\includegraphics[width=0.245\textwidth]{fig5.pdf}}
\subfigure[12\textbf{L}33]{
\label{res2_2}
\includegraphics[width=0.245\textwidth]{fig6.pdf}}
\subfigure[12\textbf{K}6]{
\label{res2_3}
\includegraphics[width=0.245\textwidth]{fig7.pdf}}
\subfigure[14\textbf{K}6]{
\label{res2_4}
\includegraphics[width=0.245\textwidth]{fig8.pdf}}
\subfigure[14\textbf{L}33]{
\label{res2_5}
\includegraphics[width=0.245\textwidth]{fig9.pdf}}
\subfigure[12\textbf{G} and 13\textbf{G}]{
\label{res2_6}
\includegraphics[width=0.245\textwidth]{fig10.pdf}}
\caption{Comparison of exploitability within the same running time. All experiments except 12\textbf{G} run for 100,000 seconds. TPICA runs out of memory on 14\textbf{K}6 and 14\textbf{L}33 and cannot be applied to \emph{Goofspiel} because private information changes during the game.}
\label{fig:time_expl}
\end{figure*}

\begin{figure}[ht]
\centering
\subfigure[12\textbf{K}3]{
\label{res1_1}
\includegraphics[width=0.22\textwidth]{fig1.pdf}}
\subfigure[12\textbf{K}4]{
\label{res1_2}
\includegraphics[width=0.22\textwidth]{fig2.pdf}}
\subfigure[12\textbf{K}6]{
\label{res1_3}
\includegraphics[width=0.22\textwidth]{fig3.pdf}}
\subfigure[12\textbf{L}33]{
\label{res1_4}
\includegraphics[width=0.22\textwidth]{fig4.pdf}}
\caption{Comparison of runtime for the same number of iterations. All instances except 12\textbf{K}6 are run for 20,000 iterations, as TPICA is too time-consuming to run more iterations on 12\textbf{K}6.}
\label{fig:it_time}
\end{figure}

In Table~\ref{tab_res}, we use the total number of nodes to represent the size of each game: the column `Original' gives the size of the original ATG, and the other columns give the size of the game after transformation by TPICA and by our method, respectively. We also report the numbers of team (coordinator) and adversary nodes. The running time is given in the column `Runtime'. TPICA and MPTA are run on the same machine configuration under identical experimental conditions. In the four cases where both MPTA and TPICA are applicable, namely 12\textbf{K}3, 12\textbf{K}4, 12\textbf{K}6 and 12\textbf{L}33, the total time required by our approach to compute a TMECor is $0.76$\,s, $9.26$\,s, $144$\,s, and $240$\,s, respectively, which is $182.89 \times$, $168.47 \times$, $694.44 \times$ and $233.98 \times$ faster than TPICA. These results show that our method effectively reduces the game size, improving solving speed by more than two orders of magnitude. It is worth noting that, in certain large-scale game scenarios where TPICA is unable to transform the original game tree, MPTA can still effectively support the computation of TMECor. In particular, 14\textbf{K}6 and 14\textbf{L}33 are 5-player cases that have never been used in experiments by previous algorithms due to their sheer size. We also observe from the node data in Table~\ref{tab_res} that the speed-up is mainly due to the PIPB structure, which significantly reduces the number of adversary nodes and temporary chance nodes. Furthermore, we analyze the solving efficiency and execution efficiency of the algorithms in detail.
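
As a quick check on the `Improvement' column of Table~\ref{tab_res}, each factor can be reproduced as the ratio of the TPICA runtime to the MPTA runtime (both in seconds):
\begin{equation*}
\begin{aligned}
    \text{12\textbf{K}3:}\quad & 139/0.76 \approx 182.89, \\
    \text{12\textbf{K}4:}\quad & 1560/9.26 \approx 168.47, \\
    \text{12\textbf{L}33:}\quad & 56156/240 \approx 233.98, \\
    \text{12\textbf{K}6:}\quad & 10^{5}/144 \approx 694.44.
\end{aligned}
\end{equation*}
For 12\textbf{K}6 the TPICA runtime is only reported as $>$27h, so the last line assumes that the TPICA run is counted as $10^{5}$ seconds. This reading is an assumption that happens to reproduce the tabulated value; under it, the factor for 12\textbf{K}6 is best read as a lower bound.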



\textbf{Solving efficiency.}
A smaller exploitability value indicates that the current strategy profile is closer to the TMECor. To evaluate the efficiency of the solving process, we track the change in exploitability over time within a runtime limit of $10^5$ seconds on seven game instances covering \emph{Kuhn poker}, \emph{Leduc poker} and \emph{Goofspiel}, as shown in Figure~\ref{fig:time_expl}. Figures~\ref{res2_1}, \ref{res2_2}, and~\ref{res2_3} show that MPTA consistently outperforms TPICA, that is, our method reaches a lower exploitability within the same runtime. As the complexity of the game instances increases, the performance gap between MPTA and TPICA becomes more pronounced, underscoring MPTA's greater capability in handling larger-scale scenarios. For example, TPICA runs out of memory on 14\textbf{K}6 and 14\textbf{L}33, as shown in Figures~\ref{res2_4} and~\ref{res2_5}. Although MPTA has not yet fully converged to an approximate equilibrium within the given runtime of $10^5$ seconds, it has the potential to do so with more computation time. Figure~\ref{res2_6} shows MPTA's performance in \emph{Goofspiel}, which involves dynamic changes in players' private information. TPICA's reliance on fixed \emph{prescriptions}, which specify an action for each infoset, makes it unsuitable for such games, which indicates that our approach generalizes to a broader class of games.




\textbf{Execution efficiency.}
The process of finding a TMECor is inherently iterative. To evaluate execution efficiency, we measure the time taken to complete the same number of iterations on four instances: 12\textbf{K}3, 12\textbf{K}4, 12\textbf{K}6, and 12\textbf{L}33. As illustrated in Figure~\ref{fig:it_time}, MPTA consistently requires less time than TPICA. The difference is evident in Figures~\ref{res1_1} and~\ref{res1_2}, where our method completes 20,000 iterations in 1,740 seconds on 12\textbf{K}3 and 2,293 seconds on 12\textbf{K}4, whereas TPICA requires 5,878 seconds and 89,337 seconds, respectively, under identical conditions. The corresponding speedups in the time to compute a TMECor, reported in Table~\ref{tab_res}, are $182.89\times$ and $168.47\times$.

The advantage of MPTA becomes more pronounced as the game scale increases, as seen in the larger instances 12\textbf{K}6 and 12\textbf{L}33. Figure~\ref{res1_3} shows that MPTA completes 300 iterations in 79 seconds on 12\textbf{K}6, while TPICA takes 99,600 seconds to complete only 184 iterations. Similarly, Figure~\ref{res1_4} shows that MPTA completes 15,600 iterations in 3,555 seconds on 12\textbf{L}33, whereas TPICA requires 1,327,652 seconds for the same number of iterations. The corresponding speedups reported in Table~\ref{tab_res} are $694.44\times$ and $233.98\times$. These results highlight the gain in execution efficiency provided by our method, particularly on instances with larger strategy spaces.






\section{Related Work}
Significant research has focused on finding suitable solutions for ATGs since the concept of team-maxmin equilibrium was introduced by \citet{von1997team}. \citet{celli2018computational} were the first to define three scenarios in extensive-form ATGs, distinguished by the communication capabilities of the team members, together with the corresponding equilibria.

\citet{basilico2017computing} proposed a modified version of the quasi-polynomial time algorithm and a novel anytime approximation algorithm named \emph{IteratedLP}, which maintains a current solution and can therefore return a policy for each team member at any time. \citet{farina2018ex} adopted a novel realization-form representation that maps the problem of finding an optimal ex-ante-coordinated policy for the team to the problem of finding NE, significantly simplifying the analysis of team coordination in adversarial settings. \citet{zhang2020computing} investigated the computational inefficiency resulting from the correlation between team members' strategies and proposed an associated recursive asynchronous multiparametric disaggregation technique to accelerate the computation of TMECor. They accomplished this by reducing the solution space of a mixed integer linear program using an association constraint. Subsequently, \citet{zhang2020converging, Zhang_2021, farina2021connecting, zhang2022subgame} proposed more efficient variants of the LP. Although \citet{zhang2022team} used a tree decomposition for constraints and described the team's strategy space by a polytope, finding TMECor still requires solving an LP, which remains computationally expensive in large-scale games. Some researchers have attempted to use multi-agent deep reinforcement learning to handle ATGs. For instance, \citet{DBLP:conf/atal/CacciamaniCC021} added a game-theoretic centralized training regimen and a buffer of past experiences. However, this method can only be applied to games where team members have symmetric observations of each other. It cannot be extended to general games with private information and public actions, such as poker.



The idea of using a coordinator to facilitate coordination between team members can be traced back to the seminal work on decentralized stochastic control \citep{nayyar2013decentralized}. The TPICA proposed by \citet{carminati2022marriage} is closely related to our method, providing strong theoretical guarantees for finding equilibrium strategies in ATGs. However, TPICA's action-based game transformation leads to exponential growth in game size, which severely limits its scalability in practical applications. In contrast, our method not only offers the same theoretical guarantees but also significantly reduces game size, greatly improving the efficiency of computing TMECor. Moreover, our method expands the types of solvable games, primarily due to the designed PIPB structure.

\section{Conclusion and Discussion}\label{conclusion}
In this paper, we propose a multi-player transformation algorithm that establishes a connection between 2p0s games and ATGs. Our method restricts the exponential growth of the transformed game's action space by leveraging the PIPB structure, which also allows it to handle situations where private information changes during the game. Furthermore, we prove the equilibrium equivalence between the original and transformed games, providing a solid theoretical foundation for the validity of our approach. Across 14 game instances from multiple standard testbeds, our method solves every instance and is substantially faster than TPICA wherever both are applicable, demonstrating its practical effectiveness and robustness.

Our approach has potential real-world applications, such as modeling team members' inability to communicate in environmental protection efforts across different regions. In future work, we plan to expand on our method by exploring equilibrium refinement techniques.

\begin{acknowledgements} % will be removed in pdf for initial submission,
						 % (without ‘accepted’ option in \documentclass)
                         % so you can already fill it to test with the
                         % ‘accepted’ class option
    We thank the anonymous reviewers for their valuable comments. This work was supported by the National Natural Science Foundation of China (62376073), the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies (2022B1212010005), the Natural Science Foundation Project of Guangdong Province (2024A1515030024), and the Shenzhen Science and Technology Program (KJZD20230923114213027).
\end{acknowledgements}









% References
\bibliography{uai2025-template}

\newpage

\onecolumn

\title{Enhanced Equilibria-Solving via Private Information Pre-Branch Structure in Adversarial Team Games\\(Appendix)}
% \title{Appendix}
\makeatletter
\renewcommand{\@thanks}{}
\makeatother

\maketitle



% This Supplementary Material should be submitted together with the main paper.

\appendix
% \section*{Appendix}
\section{Proofs}
\label{appendixA}
\subsection{The Proof of Theorem 1}
\setcounter{theorem}{0}  % reset the theorem counter to 0
\setcounter{lemma}{0}

\begin{theorem}\label{theorem:growth}
    Let $G$ be an ATG with visibility that satisfies the public-turn-taking property, and let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. Then the size of any episode in $G^{\prime}$ is $\mathcal{O}\big((\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert)^{\lvert \mathcal{T}\rvert}\big)$.
\end{theorem}
\begin{proof}
    We assume that each player has $\lvert A\rvert$ available actions at every state in the original game.
    During the traversal of the original game tree, the dummy player nodes will provide all possible private information from the teammates. Therefore, the number of available actions at the dummy player nodes is $\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}$. The coordinator inherits the team members' actions since they are \emph{public}. When the team in the game consists of two players, the size of any episode in $G^{\prime}$ is given by:
    \begin{equation*}
    \begin{aligned}
        &\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!} +\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert+\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{2}\lvert A\rvert \\ &+ \big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{2}\lvert A\rvert^{2}.
    \end{aligned}
    \end{equation*}
    When the team in the game consists of three players, the size of any episode in $G^{\prime}$ is given by:
    \begin{equation*}
    \begin{aligned}
        &\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!} +\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert+\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{2}\lvert A\rvert \\ &+ \big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{2}\lvert A\rvert^{2}+\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{3}\lvert A\rvert^{2} \\ &+ \big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{3}\lvert A\rvert^{3}.
    \end{aligned}
    \end{equation*}
    Thus, extending to the general case where the team consists of $\lvert \mathcal{T}\rvert$ players, the size of any episode in $G^{\prime}$ is given by:
    \begin{equation*}
        \sum_{n=1}^{\lvert \mathcal{T}\rvert}\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{n}(\lvert A\rvert^{n-1}+\lvert A\rvert^{n}).
    \end{equation*}
    Let $S_{1}=\sum_{n=1}^{\lvert \mathcal{T}\rvert}(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!})^{n}\lvert A\rvert^{n-1}$, and $S_{2}=\sum_{n=1}^{\lvert \mathcal{T}\rvert}(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!})^{n}\lvert A\rvert^{n}$. Then, we have:
    \begin{equation*}
        \sum_{n=1}^{\lvert \mathcal{T}\rvert}\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{n}(\lvert A\rvert^{n-1}+\lvert A\rvert^{n})=S_{1}+S_{2}.
    \end{equation*}
    First, consider $S_{1}$:
    \begin{equation*}
    \begin{aligned}
        S_{1} = &\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}+\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{2}\lvert A\rvert+\dots \\ &+\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{\lvert \mathcal{T}\rvert}\lvert A\rvert^{\lvert \mathcal{T}\rvert-1}.
        \end{aligned}
    \end{equation*}
    $S_{1}$ meets the criteria for a finite geometric series. Using the geometric series sum formula, we have:
    \begin{equation*}
        S_{1}=\frac{(\lvert A\rvert\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!})^{\lvert \mathcal{T}\rvert}-1}{\lvert A\rvert\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}-1}\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}
    \end{equation*}
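    To make this step explicit (the shorthands $x$ and $r$ are introduced here only for readability), let $x=\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}$ and $r=x\lvert A\rvert$. Assuming $r\neq 1$, which holds whenever $\lvert A\rvert\geq 2$, factoring out $x$ gives
    \begin{equation*}
        S_{1}=x\sum_{k=0}^{\lvert \mathcal{T}\rvert-1}r^{k}=x\,\frac{r^{\lvert \mathcal{T}\rvert}-1}{r-1},
    \end{equation*}
    which is the expression above written in terms of $x$ and $r$.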
    Similarly, consider $S_{2}$:
    \begin{equation*}
    \begin{aligned}
        S_{2}= &\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert + \big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{2}\lvert A\rvert^{2}+\dots \\ & +\big(\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\big)^{\lvert \mathcal{T}\rvert}\lvert A\rvert^{\lvert \mathcal{T}\rvert}.
        \end{aligned}
    \end{equation*}
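    Each term of $S_{2}$ is $\lvert A\rvert$ times the corresponding term of $S_{1}$, so $S_{2}=\lvert A\rvert S_{1}$. Thus,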
    \begin{equation*}
        S_{2}=\frac{(\lvert A\rvert\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!})^{\lvert \mathcal{T}\rvert}-1}{\lvert A\rvert\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}-1} \frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert.
    \end{equation*}
    Adding $S_{1}$ and $S_{2}$, we have:
    \begin{equation*}
        S_{1}+S_{2}=(1+\lvert A\rvert)\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\frac{(\lvert A\rvert\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!})^{\lvert \mathcal{T}\rvert}-1}{\lvert A\rvert\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}-1}.
    \end{equation*}
    Therefore, the size of any episode in $G^{\prime}$ is $\mathcal{O}\big((\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert)^{\lvert \mathcal{T}\rvert}\big)$.
    
    This concludes the proof.
\end{proof}

Let the game transformed by TPICA be denoted as $\hat{G}$. In $\hat{G}$, each team member node, except for that of the team member who acts first, corresponds to an additional dummy player node. Moreover, by the \emph{prescription} property of TPICA, every coordinator node except the first is matched with a dummy player node. By a derivation similar to that of Theorem 1, the size of any episode in $\hat{G}$ is $2\sum_{n=1}^{\lvert \mathcal{T}\rvert}(\lvert A\rvert^{\lvert \Omega\rvert})^{n}$. Using the geometric series sum formula, this expression equals $2\lvert A\rvert^{\lvert \Omega\rvert}\frac{(\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}-1}{\lvert A\rvert^{\lvert \Omega\rvert}-1}$ (i.e., $\mathcal{O}\big((\lvert A\rvert^{\lvert \Omega\rvert})^{\lvert \mathcal{T}\rvert}\big)$). We then compare the bases of the two results, i.e., $\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}\lvert A\rvert$ and $\lvert A\rvert^{\lvert \Omega\rvert}$. For a fixed team size $\lvert \mathcal{T}\rvert$, the former grows polynomially in $\lvert \Omega\rvert$, whereas the latter grows exponentially in $\lvert \Omega\rvert$ and therefore far exceeds it.
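
As an illustrative sanity check of the two bounds, take $\lvert \Omega\rvert=3$, $\lvert \mathcal{T}\rvert=2$, and $\lvert A\rvert=2$. These values are chosen here purely for illustration and are not taken from the experiments. Then $\frac{(\lvert \Omega\rvert-1)!}{(\lvert \Omega\rvert-\lvert \mathcal{T}\rvert)!}=\frac{2!}{1!}=2$, and the two episode sizes evaluate to
\begin{equation*}
\begin{aligned}
    \text{in } G^{\prime}:\quad &\sum_{n=1}^{2}2^{n}\,(2^{n-1}+2^{n})=2\cdot 3+4\cdot 6=30, \\
    \text{in } \hat{G}:\quad &2\sum_{n=1}^{2}(2^{3})^{n}=2\,(8+64)=144.
\end{aligned}
\end{equation*}
The closed forms give the same values: $(1+2)\cdot 2\cdot\frac{4^{2}-1}{4-1}=30$ and $2\cdot 8\cdot\frac{8^{2}-1}{8-1}=144$. Keeping $\lvert \mathcal{T}\rvert=2$ and $\lvert A\rvert=2$ but increasing $\lvert \Omega\rvert$ to $13$, the two bases become $12\cdot 2=24$ and $2^{13}=8192$, respectively, which illustrates how quickly the gap widens.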



\subsection{The Proof of Lemma 1}
\begin{lemma}\label{lemma:strategy}
    Let $G$ be an ATG with visibility that satisfies the public-turn-taking property, and let $G^{\prime}=\emph{MPTA}(G)$ be the transformed game. Any joint reduced pure strategy $\pi_{\mathcal{T}}$ in $G$ can be mapped to a corresponding strategy $\pi_{t}$ in $G^{\prime}$, and vice versa.
\end{lemma}

\begin{proof}
    We can prove Lemma~\ref{lemma:strategy} by recursively traversing $G$ and $G^{\prime}$ in a depth-first pre-order manner.
    \paragraph{Case 1: from $\pi_{\mathcal{T}}$ to $\pi_{t}$.} Let $h$ and $h^{\prime}$ denote the current nodes reached in $G$ and $G^{\prime}$, respectively. Because $G$ satisfies the public-turn-taking property, $h$ and $h^{\prime}$ either correspond to the same player or are both terminal nodes. We initialize $h$ with the chance player node, and let $\Omega$ denote the set of all private information in $G$. When constructing $\pi_{t}$:


    \begin{itemize}
        \item[1)] For the chance node: $\pi_{c}=\pi_{c^{\prime}}$ always holds as our algorithm does not modify the chance node. This ensures that the actions specified by the chance player's strategy in the original game are the same as those in the transformed game.
        
        \item[2)] For opponent nodes: The proof is identical to 1).
        
        \item[3)] For team member nodes: Let $\pi_{\mathcal{T}}[I(h)]$ represent the joint reduced pure strategy $\pi_{\mathcal{T}}$ at infoset $I(h)$. During the traversal, our algorithm expands $h$ into $\lvert \Omega\rvert-1$ nodes, one for each possible piece of private information of the teammates. These nodes belong to the same infoset, denoted as $I^{\prime}$. Let $\pi_{t}[I^{\prime}]$ represent the reduced pure strategy $\pi_{t}$ at $I^{\prime}$, where $I^{\prime}\in S_{t}(h)$. When $t$ is in a public state, $\pi_{\mathcal{T}}[I(h)]=\pi_{t}[I^{\prime}]$.

        \item[4)] For terminal nodes: When a terminal node is reached through the above process, our algorithm ensures that $u_{t}^{\prime}(h^{\prime})=\sum_{i\in \mathcal{T}}u_i(h)$ and $u_{o}^{\prime}(h^{\prime}) =u_o(h)=-\sum_{i\in \mathcal{T}}u_i(h)$.
    \end{itemize}
    \paragraph{Case 2: from $\pi_{t}$ to $\pi_{\mathcal{T}}$.} The proof follows the same points as the previous case.
\end{proof}

\subsection{The Proof of Theorem 2}
\begin{theorem} \label{theorem:payoff}
    Let $G$ be a public-turn-taking ATG with visibility, and let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. Then $G$ and $G^{\prime}$ have equivalent payoffs.
\end{theorem}

\begin{proof}
    The proof directly relies on Lemma~\ref{lemma:strategy}. Specifically, for any strategy $\pi_{\mathcal{T}}$ in $G$, we can find a corresponding strategy $\pi_{t}$ in $G^{\prime}$ that yields the same payoff. Similarly, for any strategy $\pi_{t}$ in $G^{\prime}$, there exists a payoff-equivalent strategy $\pi_{\mathcal{T}}$ in $G$. This ensures that the payoff for the players remains unchanged whether they choose $\pi_{t}$ in $G^{\prime}$ or $\pi_{\mathcal{T}}$ in $G$.
\end{proof}

\subsection{The Proof of Theorem 3}
\begin{theorem} \label{theorem:equilibrium}
    Let $G$ be an ATG with visibility that satisfies the public-turn-taking property, and let $G^{\prime}=\emph{MPTA}(G)$ be its transformed game. If $(\mu_{t}^{*},\mu_{o}^{*})$ is an NE in $G^{\prime}$, then the strategy profile $(\mu_{\mathcal{T}}^{*},\mu_{o}^{*})$, where $\mu_{t}^{*}\mapsto \mu_{\mathcal{T}}^{*}$, is a TMECor in $G$.
\end{theorem}

\begin{proof}
    For brevity, let $u_{t}$ and $u_{\mathcal{T}}$ represent $u_{t}(\pi_{c},\pi_{o},\pi_{t})$ and $u_{\mathcal{T}}(\pi_c, \pi_o,\pi_{\mathcal{T}})$, respectively. If $(\mu_{t}^{*},\mu_{o}^{*})$ is an NE in $G^{\prime}$, then, because $G^{\prime}$ is a two-player zero-sum game between the coordinator and the opponent, $\mu_{t}^{*}$ is a maxmin strategy, i.e., the following holds:
    \begin{equation*}
        \mu_{t}^{*} \in \arg\max_{\mu_{t}} \min_{\mu_{o}} \sum_{\substack{\pi_c \in \Pi_c \\ \pi_o\in \Pi_o\\ \pi_t \in \Pi_t}}\mu_{c}(\pi_{c})\mu_{o}(\pi_{o})\mu_{t}(\pi_{t})u_{t}.
    \end{equation*}
    If $\mu_{\mathcal{T}}^{*}$ is a TMECor, it satisfies:
    \begin{equation*}
        \mu_{\mathcal{T}}^{*} \in \arg\max_{\mu_{\mathcal{T}}}\min_{\mu_{o}}\sum_{\substack{\pi_{c}\in \Pi_{c} \\ \pi_{o}\in \Pi_{o} \\ \pi_{\mathcal{T}}\in \Pi_{\mathcal{T}}}} \mu_{c}(\pi_{c}) \mu_{o}(\pi_{o}) \mu_{\mathcal{T}}(\pi_{\mathcal{T}}) u_{\mathcal{T}}.
    \end{equation*}
    Let $\min_{TMECor}(\mu_{\mathcal{T}})$ and $\min_{NE}(\mu_t)$ denote the inner minimization problems for TMECor and NE, respectively.

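    Assume, for the sake of contradiction, that $\mu_{\mathcal{T}}^{*}$ is not a TMECor, i.e., there exists a strategy $\mu_{\mathcal{T}}^{\prime}$ whose value under the definition of TMECor is greater than that of $\mu_{\mathcal{T}}^{*}$. That is, $\min_{TMECor}(\mu_{\mathcal{T}}^{\prime}) > \min_{TMECor}(\mu_{\mathcal{T}}^{*})$.
    According to Lemma~\ref{lemma:strategy}, there exists a strategy $\mu_{t}^{\prime}$ such that $\mu_{\mathcal{T}}^{\prime}\mapsto \mu_{t}^{\prime}$. By Theorem~\ref{theorem:payoff}, corresponding strategies yield equal payoffs, so $\min_{TMECor}(\mu_{\mathcal{T}}^{\prime})=\min_{NE}(\mu_{t}^{\prime})$ and $\min_{TMECor}(\mu_{\mathcal{T}}^{*})=\min_{NE}(\mu_{t}^{*})$. We then have: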
    \begin{equation*}
        \min_{TMECor}(\mu_{\mathcal{T}}^{\prime})=\min_{NE}(\mu_{t}^{\prime}) > \min_{NE}(\mu_{t}^{*}).
    \end{equation*}
    This is a contradiction: $\mu_{t}^{\prime}$ would achieve a strictly larger worst-case value than $\mu_{t}^{*}$, which implies that $\mu_{t}^{*}$ is not part of an NE in $G^{\prime}$. Therefore, $\mu_{\mathcal{T}}^{*}$ must be a TMECor in $G$.
\end{proof}

\section{More Details of Conceptual Explanation and The Converted Result}
\subsection{Converted Result by TPICA}
\label{appendixB1}
Figure~\ref{fig:TransformedTPICA} shows the result of the TPICA transformation. Due to the large number of nodes, we display only a portion of the transformed game tree.

\begin{figure*}[t]
\centering
\includegraphics[width=1\textwidth]{TransformedTPICA.pdf}
\caption{Example of game transformation. ``\textbf{\dots}'' indicates omitted branches. The nodes of a player with the same number are in the same infoset. \textbf{Left:} Original ATG omitting the opponent. \textbf{Right:} Result of transforming the game on the left using TPICA.}
\label{fig:TransformedTPICA}
\end{figure*}

\subsection{Conceptual Explanation of Team-Public-Information Representation}
\label{appendixB2}
The left side of Figure~\ref{fig:TransformedTPICA} shows the original ATG tree, ignoring the opponent nodes, with all nodes forming the set $H$. Since every player can observe only the card dealt to her by the chance player, the actions of the chance player are \emph{private} to the coordinator. The actions of all other players are observable, so their actions are \emph{public} to the coordinator. On the left side of Figure~\ref{fig:TransformedTPICA}, the two nodes belonging to team member 1 are in two infosets due to the different private information at each node. For team member 2, the actions observed under the same hand are different, making each of the four nodes belonging to team member 2 a separate infoset as well. The set of private information in this game is $\Omega=\{J,Q,K\}$. All leaf nodes form the set of terminal nodes $Z$.

The right side of Figure~\ref{fig:TransformedTPICA} shows the result of the game transformation using TPICA. The coordinator represents a team consisting of team members 1 and 2. The coordinator's public states are distinguished only by \emph{public} actions. Therefore, the two nodes of team member 1 in the original game tree belong to the same public state. Although the branches of the chance nodes are omitted, as marked by ``\textbf{\dots}'', the actual possible deals are: $\left[1:J, 2:Q\right]$, $\left[1:J, 2:K\right]$, $\left[1:Q, 2:J\right]$, $\left[1:Q, 2:K\right]$, $\left[1:K, 2:J\right]$, and $\left[1:K,2:Q\right]$. Nodes with the same private information are in the same infoset. That is, there are three infosets for team member 1 at this level, each consisting of two nodes. According to the concept of \emph{prescription}, every \emph{prescription} selects one action for each of the three infosets to form a recommendation. Thus, there are $2^3$ \emph{prescriptions}, i.e., $aaa$, $aab$, $aba$, $abb$, $baa$, $bab$, $bba$, and $bbb$. The dummy player will select a specific action from these recommendations as the coordinator's available action. For instance, in the left subtree, the dummy player chooses the first action from each \emph{prescription}; in the right subtree, the dummy player chooses the last action from each \emph{prescription}. The same process applies to the coordinator when traversing to team member 2.
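
The counts in this example can be checked directly. Taking $\lvert A\rvert=2$ for the two actions $a$ and $b$ and $\lvert \Omega\rvert=3$ for $\Omega=\{J,Q,K\}$ (the shorthands $N_{\text{deal}}$ and $N_{\text{presc}}$ are ours and are used only here), the number of ordered deals to the two team members and the number of \emph{prescriptions} at one public state are
\begin{equation*}
    N_{\text{deal}}=\lvert \Omega\rvert\,(\lvert \Omega\rvert-1)=3\cdot 2=6,
    \qquad
    N_{\text{presc}}=\lvert A\rvert^{\lvert \Omega\rvert}=2^{3}=8.
\end{equation*}
The second quantity is the base $\lvert A\rvert^{\lvert \Omega\rvert}$ that appears in the complexity of the TPICA-transformed game.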

\section{Rules of Various Game Instances}
\label{appendixC}
In our work, the number of players in each game scenario is parameterized for flexibility. To clearly articulate each instance's rules, we will illustrate using the 3-player version as an example.

\begin{itemize}
    \item \textbf{The rule of \emph{Kuhn poker}:}
    In 3-player Kuhn poker, there are three players and $k$ possible cards. Players take turns acting in sequence. Before the game starts, each player pays one chip to the pot and is dealt a private card. The game proceeds with the following steps:
\begin{itemize}
    \item [1)] 
    Player 1 can choose to check or bet. If checking, the betting round continues with step 2); otherwise, the betting round proceeds to step 3).

    \item [2)]
    Player 2 can choose to check or bet. Note that if Player 2 chooses to bet, then Player 1 must decide between folding and calling after Player 3's action. If Player 2 also chooses to check, the betting round continues with step 4).

    \item [3)]
    Player 2 can choose to fold or call.

    \item [4)]
    Player 3 chooses to check or bet. When Player 3 checks, the betting round ends; otherwise, Player 1 and Player 2 must each decide between folding and calling.

    \item [5)]
    Player 3 chooses to fold or call. The betting round concludes after her decision.
\end{itemize}

We assume that Player 1 is the adversary, while Player 2 and Player 3 are team members. In the event of the opponent's victory, Players 2 and 3 share the loss. If the team wins, Player 2 and Player 3 share the team's rewards. The $n$-player Kuhn poker adopted in our work is an extended version based on the 3-player Kuhn poker.
    \item \textbf{The rule of \emph{Leduc poker}:} 
    In the 3-player version of adversarial team Leduc poker, the deck contains three suits and $k \geq 3$ card ranks. Each player starts by contributing one chip to the pot and receiving a private card. There are two betting rounds in total. After the first betting round, the community card is revealed. Then, players who have not folded proceed to the second betting round. After the second betting round concludes, the players remaining in the game reveal their private cards.

If a player's single private card forms a pair with the community card, she will win the pot. Otherwise, the player with the highest private card wins. We assume that Player 1 is the opponent, while Player 2 and Player 3 are team members. In the adversarial team games, there are some modifications to the payoff structure. If Player 1 wins, she takes all the chips from the pot. If Player 2 or Player 3 wins, the chips contributed by the team members are returned to them, and the chips bet by Player 1 are evenly distributed among each team member.

    \item \textbf{The rule of \emph{Goofspiel}:}
\emph{Goofspiel} is a bidding game. We adopt a variant with three cards. Every player has a hand of cards with values $\{1, 2, 3\}$. A separate stack of cards, also with values $\{1, 2, 3\}$, is shuffled and placed on the table. At the onset of each round, a neutral referee places a card on the table as the reward for that round. Players bid by selecting a card from their hand, and the player with the highest bid claims the reward. In the event of a tie, the reward is shared equally among the tying players. After three rounds, all rewards are distributed among the players, contributing to their scores. Assuming that Player 2 and Player 3 form a team, the final score of each team member is calculated by summing the rewards obtained by Player 2 and Player 3 and averaging them.
\end{itemize}
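
As a small numerical illustration of the team payoff rules above (the symbols $c_{1}$, $r_{2}$, and $r_{3}$ and all numbers are our own shorthand for this example, not values from the experiments), suppose that in Leduc poker Player 1 has put $c_{1}=4$ chips into the pot and a team member wins. Each team member then recovers her own chips and additionally receives
\begin{equation*}
    \frac{c_{1}}{2}=\frac{4}{2}=2
\end{equation*}
chips. In Goofspiel, reading ``summing and averaging'' as the arithmetic mean over the two team members (our interpretation), if Player 2 and Player 3 collect rewards $r_{2}=3$ and $r_{3}=1$, each team member's final score is
\begin{equation*}
    \frac{r_{2}+r_{3}}{2}=\frac{3+1}{2}=2.
\end{equation*}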

\end{document}
