\begin{paperonly}
\begin{abstract}
We present algorithms of two flavors, one rooted in constraint satisfaction problems (CSPs) and the other in learning dynamics, to compute a pure-strategy Nash equilibrium (PSNE) in $k$-dimensional congestion games ($k$-DCGs) and their variants. %In $k$-DCGs, the weights or demands of the players are $k$-dimensional vectors. 
The two algorithmic approaches are driven by whether or not a PSNE is guaranteed to exist.
We first show that deciding the existence of a PSNE in a $k$-DCG is NP-complete even when players have binary and unit demand vectors. %We then focus on computing PSNE for $k$-DCGs and their variants. 
%with general, linear, and exponential cost functions. 
For general cost functions (potentially non-monotonic), we devise a new CSP-inspired algorithmic framework for PSNE computation, leading to 
%, which leads to the first configuration-space algorithm 
%to compute a PSNE if one exists. For this most general case, our 
algorithms that run in polynomial time %in various model parameters 
%the number of players, maximum number of strategies, and maximum demand for a resource 
%when the number of resources and $k$ are bounded 
under certain assumptions while offering exponential savings over standard CSP algorithms. We further refine these algorithms for variants of $k$-DCGs. 
Our experiments demonstrate the effectiveness of this new CSP framework for hard, non-monotonic $k$-DCGs.
We then provide learning dynamics-based PSNE computation algorithms for linear and exponential cost functions. These algorithms run in polynomial time under certain assumptions. For general cost, we give a learning dynamics algorithm for 
%constructive proof of existence for 
an $(\alpha, \beta)$-approximate PSNE (for certain $\alpha$ and $\beta$). 
%, where $\alpha$ and $\beta$ are multiplicative and additive approximation factors, respectively. 
Lastly, we also devise polynomial-time algorithms for structured demands and cost functions. 
%, giving polynomial-time algorithms to compute PSNE for various cases. 
%Finally, we empirically demonstrate that the configuration-space framework can be effective for hard, non-monotonic instances of $k$-DCGs.
\end{abstract}
\end{paperonly}

\begin{paperonly}
\section{Introduction}
In non-cooperative games, a player's payoff depends on their own choice of action and the choices of actions by the other players. 
In general, the payoff may change depending on \emph{who} chose a particular action. %That is, the identities of the players matter in general. 
%For example, in a 2-player game with binary actions $\{1, 2\}$, players 1 and 2 choosing actions 1 and 2, respectively is not the same as players 1 and 2 choosing actions 2 and 1, respectively.
In a seminal paper, Rosenthal presented a special class of games, later to become famously known as \emph{congestion games}, in which only the number of players choosing an action is relevant, not their identities \citep{rosenthal_class_1973}.
%with two main features: (1) each player's payoff depends on \emph{how many players}, not necessarily \emph{who}, chose a particular action, and (2) the payoff function is identical for all players \citep{rosenthal_class_1973}. Later on, this class of games became famously known as \emph{congestion games}. 
In a congestion game, there is a set of resources (e.g., edges in a road network). Each player has a set of strategies, where each strategy is a subset of resources (e.g., paths in a network). A strategy profile consists of a strategy for each player. The cost of a resource (e.g., edge) is a function of the number of players using that resource. Given a strategy profile, a player's cost is the sum of the costs of the resources used by the player. A strategy profile is a pure-strategy Nash equilibrium (PSNE) if no player has an incentive to deviate unilaterally.
%can unilaterally decrease their cost by choosing a different strategy.

%XXXXXX Leaving this history bit-- good for journal
\iffalse
\cite{rosenthal_class_1973} proved the existence of a PSNE in any congestion game using a mathematical program. A solution to the program (which always exists) corresponds to a PSNE. %However, a small example shows that a PSNE does not always correspond to a solution to the program. 
While Rosenthal was mainly interested in the existence of PSNE in what we now call congestion games, the vast implication of his work would only be realized much later.
In particular, \cite{potential_games} defined a class of games called \emph{potential games} where each potential game has a global (i.e., not player specific) potential function, which can ``track'' the fluctuation in any player's payoff caused by that player's deviations.
%In one variant, e.g., the difference in a player's payoff due to a unilateral deviation is exactly the same as the difference in the potential function for that deviation. 
Using the objective function of \citeauthor{rosenthal_class_1973}'s program, they showed that every congestion game is a potential game. More intriguing still, every finite potential game can be mapped to a congestion game. % \citep{potential_games}.

In another seminal paper, \cite{potential_games} defined a class of games called \emph{potential games} and showed that every congestion game is a potential game, and more interestingly, every finite potential game is a congestion game.
The two seminal papers \citep{rosenthal_class_1973,potential_games} spurred an incredible amount of interest in congestion and potential games from various disciplines, including computer science, mathematics, operations research, economics, and telecommunication, to name a few.
%\citep{yamamoto2015comprehensive,gonzalez2016survey}. %In fact, research on this topic has been growing so rapidly that it would not be possible to mention all related papers here. Instead, we next review 
%Due to the vast literature, we only review the most relevant works so that we can concisely situate our work. 
\fi 
\end{paperonly}

\begin{paperonly}
%\noindent \textbf{Related Work}\\
%\subsubsection*{Related Work}
The congestion games literature can be divided into three main frontiers: unweighted, weighted one-dimensional, and weighted multidimensional. Unweighted congestion games are the classical ones \citep{rosenthal_class_1973}, where the guaranteed existence of a PSNE naturally leads to computational questions. In their seminal work, \cite{potential_games} showed that any unweighted congestion game is a \emph{potential game}, which is appealing for learning dynamics. For unweighted congestion games on networks, if the game is \emph{symmetric} (i.e., all players share the same source-destination pair), then a PSNE can be found in polynomial time via a network-flow algorithm; otherwise, computing a PSNE is PLS-complete \citep{fabrikant_complexity_2004}, even for linear cost functions \citep{ackermann2008impact}. %On the other hand, polynomial-time algorithms for correlated equilibria are known \citep{papadimitriou_correlated_2008}.

\begin{table*}[htp]
%\caption{Results for Determining the Existence of and Computing a PSNE for Variants of $k$-Dimensional Congestion Games ($k$-DCGs)}
\caption{Our main results on $k$-dimensional congestion games ($k$-DCGs), $k$-class congestion games ($k$-CCGs), and variants.
Notation: NPC $\equiv$ NP-Complete, $n$ = \# players, $m$ = \# resources, $p$ = max \# strategies, $\mathbf{d}_i$ = player $i$'s demand vector, $\mathbf{d}_N = \sum_i \mathbf{d}_i$, $w_{\max} = \max_j \mathbf{d}_{N_j}$, $\check{n} =$ max \# players selecting a resource in 
%$ \max_{j} $ \{\# of players $i$ with $d_{i_j} = 1$\} in 
a binary $k$-DCG, or max \# players of a type in a $k$-DCG with player types, $l(i)$ = nonzero-element index in $\mathbf{d}_i$ for $k$-CCG,  $a_{\max}$, $b_{\max}$, and $\mathbf{z}$ are cost parameters.\\ 
$\dagger$ We give approximation algorithms for $(\alpha,\beta)$-PSNE, which always exists. \hspace{2mm} $\ddagger$ \cite{klimm_equilibria_2022}.}
\begin{center}
\begin{tabular}{|p{0mm}p{2mm}|c|c|c|}  \hline
& & Problem & PSNE & Time Complexity to Determine or Compute PSNE \\ \hline
\multirow{4}{*}{\begin{turn}{90}CSP \hspace{0.5mm} \end{turn}}& \multirow{4}{*}{\begin{turn}{90} framework \hspace{0.5mm} \end{turn}} & General Cost $k$-DCG & NPC$\dagger$  & $\mathcal{O}\left((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km})\right)$ \\  
& & Subclass: Binary $k$-DCG & NPC & $\mathcal{O}\left(\check{n}^{km}(nkp^2m^2 + \min \{ nkmp\check{n}^{km}, n^{km+1} p \})\right)$ \\
& & Subclass: $k$-CCG & NPC & $\mathcal{O}\left((w_{\max})^{km}(np^2m^2 + n k p m (w_{\max})^{m})\right)$ \\ 
& & Subclass: $k$-DCG with player types & NPC & $\mathcal{O}\left((\check{n})^{\tau m}(np^2m^2 + n \tau p m (\check{n})^{m}) + \tau n k\right)$ \\ \hline
\multirow{4}{*}{\begin{turn}{90} Learning \hspace{5mm} \end{turn}}& \multirow{4}{*}{\begin{turn}{90} dynamics \hspace{4mm} \end{turn}} & Linear Cost $k$-DCG & Always$\ddagger$ & $\mathcal{O}\left(nkpm^2 \times n^2 m (a_{\max} + b_{\max}) \frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}\right)$ \\
& & Linear Subclass: Binary $k$-DCG & Always & $\mathcal{O}\left(nkpm^2 \times n^2 m (a_{\max} + b_{\max}) \big( k \max_j z_j \big)^2\right)$ \\ 
& & Linear Subclass: $k$-CCG & Always & $\mathcal{O}\left(nkpm^2 \times n^2 m (a_{\max} + b_{\max}) \frac{\max_j z_j^2}{\min_j z_j} \frac{\max_i d_{i,l(i)}^2} {\min_i d_{i, l(i)} } \right)$ \\
& & Exponential Cost $k$-DCG & Always$\ddagger$ & $\mathcal{O}\Big(nkpm^2 \times \frac{e}{e-1} \big( m \exp(\mathbf{z} \cdot \mathbf{d}_N )  a_{\max}  + nm  b_{\max} \big)\Big)$\\ \hline 
\multirow{3}{*}{\begin{turn}{90}Struct-\end{turn}}
& \multirow{4}{*}{\begin{turn}{90} \hspace{2mm} tured  \end{turn}}  & Ordered $\mathbf{d}_i$'s, nondec. cost, singleton strt. & Always & $\mathcal{O}(n \log n + nmk)$\\
& & Ordered $\mathbf{d}_i$'s, nondec. cost, shared strt. & Always & $\mathcal{O}(n \log n + npmk)$\\
& & Structured cost, singleton strt. & Always & $\mathcal{O}(n \log n + nmk)$\\ \hline
%\\
%& & \\ \hline 
%Binary $k$-DCGs & NP-Complete &  \\ 
%& &  \\ \hline 
%$k$-Class DCGs & NP-Complete & \\ 
%\\ 
\end{tabular}
\end{center}
\mylabel{tab:summary}
\end{table*}%


%Unweighted congestion games being very much a settled matter, much attention has been given to weighted (one-dimensional) congestion games. Here, 
In a weighted congestion game, each player has a weight or demand, and the cost of a resource is a function of the sum of the demands of the players using that resource. Unlike unweighted congestion games, a PSNE is not guaranteed to exist in weighted congestion games   %. 
%For example, in a symmetric network congestion game, when the cost function of some edges is linear and that of other edges is 2-wise linear, a PSNE may not exist 
\citep{libman_atomic_1997,fotakis_selfish_2005}. \cite{dunkel_complexity_2008} went one step further and showed that PSNE existence in weighted congestion games is strongly NP-complete, even for a constant number of players. %the symmetric case -- strongly NP-complete  

On the positive side, a PSNE is guaranteed to exist in a weighted congestion game when the cost function is linear \citep{fotakis_selfish_2005} %,panagopoulou_algorithms_2007} 
%. It is also guaranteed to exist for exponential cost functions 
or exponential
\citep{harks_existence_2012}. \cite{harks2011characterizing} characterized the existence of potential functions. 
%Prior to these general results, 
Special cases involving parallel edges have also received attention \citep{milchtaich_congestion_1996,fotakis2002structure,gairing2004computing,mavronicolas_congestion_2007}. 
%Approximate equilibrium computation has also gained traction lately \citep{caragiannis2021approximate,christodoulou2023existence}.

%\mohammad{Delete this paragraph?} The hardness of PSNE in weighted congestion games has led to lower and upper bounding the existence of $\alpha$-PSNE, which denotes that no player can improve their payoff by more than an $\alpha$ factor by unilateral deviation. There are non-existence results for $\alpha \approx 1.153$ \citep{hansknecht_et_al:LIPIcs:2014:4700} and more recently, $\alpha =  \Tilde{\Theta}(\sqrt{d})$ for polynomial cost functions of degree $d$ \citep{christodoulou2023existence}. We also have existence results for $\alpha = d$ \citep{caragiannis2021approximate} for degree $d$ and $\alpha = n$ for general cost functions that are monotonic \citep{christodoulou2023existence}.

%A recent frontier %in the study of congestion games 
%is 
Multidimensional congestion games are a recent frontier, which we investigate here. Introduced by \cite{klimm_congestion_2014}, this class of games generalizes weighted congestion games by letting the demand of each player be a $k$-dimensional vector. More recently, \cite{klimm_equilibria_2022} have 
%characterized the existence of PSNE in multidimensional congestion games. In particular, they show that the sets of certain affine functions and the sets of certain exponential functions are the only sets of cost functions 
shown that certain affine and exponential cost functions are the only ones 
for which 
%a multidimensional congestion game is guaranteed to have 
a PSNE is guaranteed to exist. Their characterization leads to the following \emph{computational} questions investigated here:
%\begin{center}
\emph{How can we compute a PSNE (if it exists) in multidimensional congestion games and their variants? How hard is this computation?}
%\end{center}

These questions are motivated by many real-world applications. % in a variety of multiagent systems. %and the multiagent systems community's interest in solving computationally hard problems. 
Advances in multidimensional congestion games may contribute to richer traffic models that account for the heterogeneity in vehicles (e.g., weight, length, etc.). Such multidimensional models were envisioned by transportation researchers many decades ago \citep{dafermos1972traffic} and are now topics of active investigation \citep{van2008fastlane,pi2019general,wang2019multiclass}. Computational advances 
%in multidimensional congestion games 
may also contribute to various other application areas---wireless networks \citep{yamamoto2015comprehensive}, distributed systems \citep{k_dim_real_world_example_1,k_dim_real_world_example_2}, telecommunication \citep{altman2006survey}, and smart grids \citep{fadlullah2011survey}, to name just a few. 

%A series of papers investigated the existence of PSNE \citep{harks2011characterizing,harks_existence_2012,klimm_congestion_2014,schutz_congestion_2016,klimm_equilibria_2022} showed that pure NE always exist if the set of cost functions are a certain affine or exponential function.
%We are unaware of any algorithms specifically for multi-dimensional congestion games, although algorithms do exist for the more general class of aggregate games.

%Note that some papers unrelated to this work (i.e., not cited here), refer to a differing model as ``multidimensional congestion games''.
%In this other model players are partitioned into clusters, unlike our paper where players have multi-dimensional weights.

%We briefly note that there is minor branch of congestion games known as non-atomic congestion games, where the players individually do not have much impact on the cost function.

\iffalse
\noindent\textbf{Aggregate Games} (also called aggregative games and summarization games) are a broad set of games that generalize multi-dimensional congestion games.
Much of the work comes from the economics literature \citep{jensen2010aggregative,acemoglu2013aggregate,corchon1994comparative}, but there are some papers with a computational focus \citep{cummings_privacy_2015,kearns_bounded_influence,koshal2016distributed}.
Theorem 3 in Cummings \textit{et. al.} is in some regards similar to our theorem \ref{thm:exact_enum}, but differs in a few ways.
Notably they assume bounded influence and compute approximate NE, where we have no such assumption and compute exact NE.
This is indicative of differences between this paper and aggregative games more generally, where the latter not infrequently considers bounded influence and approximate NE (e.g., \citep{kearns_bounded_influence}).
\fi 
%
%nonatomic-brief mention
%
% Applications
%is hard to find a comprehensive survey, except a few domain-specific surveys; e.g., smart grids \citep{fadlullah2011survey}, wireless networks \citep{yamamoto2015comprehensive}, and 
\end{paperonly}

\begin{paperonly}
\subsubsection*{Our Contributions}
Driven by whether or not a PSNE is guaranteed to exist, we take two fundamentally different computational approaches inspired by CSPs and learning dynamics. The CSP approach can handle \emph{any} $k$-DCG (including those for which a PSNE may not exist), whereas learning dynamics can handle certain $k$-DCGs in which a PSNE is guaranteed to exist.
%Table \ref{tab:summary} summarizes our main results. %, where we exploit the structure of $k$-DCGs and their variants.  
We exploit the structure of multidimensional congestion games to give new computational insights into $k$-dimensional congestion games ($k$-DCGs) and their variants, as summarized in Table \ref{tab:summary}. 

For general $k$-DCGs, we devise a CSP %constraint satisfaction problem (CSP) 
whose dual decouples the players' strategies. %cost functions, for which a PSNE may not exist, 
We give algorithms that utilize this decoupling and run in polynomial time under certain assumptions (Section \ref{sec:general}).
\footnote{
Polynomial-time algorithms are unlikely due to PLS-completeness results for 
%Even computing PSNE for unweighted, linear-cost network congestion games is PLS-complete %of 
unweighted network congestion games with linear cost functions
\citep{ackermann2008impact}.} 
\emph{To our knowledge, this CSP framework is new within the rich congestion games literature spanning over five decades.}
The significance of our CSP framework lies not only in the exponential savings it offers compared to well-known CSP algorithms but also in its applicability beyond congestion games. %Our experiments on it are also very encouraging.

%A second computational approach we take is learning dynamics (Section \ref{sec:learning}). 
For linear and exponential cost, we give iterative learning dynamics algorithms for $k$-DCGs and their variants by deriving and bounding weighted potential functions based on the structure of the game (Section \ref{sec:learning}). 
For general cost, we show that for certain $\alpha$ and $\beta$, there is always an $(\alpha, \beta)$-approximate PSNE that can be computed via learning dynamics. We also give polynomial-time algorithms for structured costs and demands (Section \ref{sec:structured}).

The significance of our computational results can be best understood against the backdrop of hardness results (Section \ref{sec:complexity}). We show that deciding the existence of a PSNE in a $k$-DCG is NP-complete %for very special cases.
even for binary demand vectors (and other special cases). %Also, deciding the existence of a PSNE in a $k$-CCG where the binary demand vectors are unit vectors is NP-complete. %In contrast, there is a PSNE in every unweighted congestion game. 
Put together, this paper addresses computational questions while giving new insights for provably hard problems on congestion games.


%The variants include $k$-DCGs with player types---a natural setting that did not get much attention before, $k$-DCGs with binary demand vectors, and $k$-class congestion games ($k$-CCGs)---a special case of $k$-DCGs. %where each player's demand vector has exactly one positive element, the rest being zeros.



%We devise computational schemes for $k$-DCGs and  variants with general, linear, and exponential cost functions. Their running times, as shown in Table~\ref{tab:summary}, are polynomial under certain assumptions.

\iffalse
For general cost (potentially non-monotonic) $k$-DCGs, we give an algorithmic framework that explores the configuration space (i.e., the space of aggregated demand vectors) 
%xxxxxx  , which can be considered the dual to the space of strategy profiles) 
for computing a PSNE. \emph{To our knowledge, we are the first to present this framework within the extremely rich congestion games literature.} It works by exploring the possible aggregated demand vectors of all players and verifying (non-trivially) whether an aggregated demand vector can lead to a PSNE. % by computing and combining feasible player strategies with respect to the aggregated demand vector. 
For variants of $k$-DCGs, such as $k$-DCGs with player types, $k$-DCGs with binary demands, and $k$-CCGs, we exploit their structures in this framework. %derive different time complexities.
Our empirical evidence shows the promise of this new framework for non-monotonic $k$-DCGs.





%Why are these results significant?
Together, these results push the envelope of the state-of-the-art in multidimensional congestion games by devising computational schemes for hard problems in multiagent systems. %, which is at the core of AI research.

%Together, these results 
%\section{Related Work}
%TBD

%\jared{Moved these here from the end of section 2.}
The paper is organized as follows. We start with the complexity results in Section~\ref{sec:complexity}. The four subsequent sections concern algorithms design for general, structured, linear, and exponential cost functions, respectively. We conclude with a discussion of empirical evidence and future directions. 
\fi 

%Complete proofs are in the \textit{Appendix}. Code and data are in supplementary materials. As a final note before the technical content, we make the standard assumption that demands are non-negative integer vectors \citep{panagopoulou_algorithms_2007,dunkel_complexity_2008,christodoulou2023existence}.

\end{paperonly}




\begin{paperonly}
\section{Preliminaries}
\mylabel{sec:prelim}

%In this section, we %provide notations for this paper and 
We formally define multidimensional congestion games and related game-theoretic terms. %, as introduced by \citep{klimm_congestion_2014,klimm_equilibria_2022}, and provide related game-theoretic notations.
%\jared{Remove ``introduced by'' it is covered in related work.}
Roughly speaking, a multidimensional congestion game is a natural generalization of weighted congestion games where the weight or demand of each player is a multidimensional vector. The cost of each resource is a function of the aggregated demands of the players using that resource.

%of high dimensions 
%of high dimensions 
%maybe add a sentence or two about motivation here
%Multi-dimensional congestion games are a natural generalization of the classical congestion game \citep{rosenthal_class_1973} when different agents impact congestion in a network differently.

More formally, a $k$-dimensional congestion game ($k$-DCG) consists of a set $N = \{1, \dots, n\}$ of $n$ players and a set $R = \{1, \dots, m\}$ of $m$ resources. 
Each player $i \in N$ is characterized by two elements: (1) a strategy set $S_i \subseteq 2^{R}\setminus\{\emptyset\}$, i.e., a collection of nonempty subsets of resources that $i$ can select,  %where $2^R$ denotes the power set of $R$. 
and %Each player $i \in N$ has 
(2) a $k$-dimensional demand vector $\mathbf{d}_i = (d_{i_1}, \dots, d_{i_k}) \in \mathbb{R}^k$, consisting of the weight or demand of player $i$ in each dimension $1, \dots, k$. 
%, with a slight abuse of notation, 
Each resource $r \in R$ has a cost function $c_r: \mathbb{R}^k \to \mathbb{R}$ 
%be the $k$-dimensional cost function for $r$ 
that maps $k$-dimensional real-valued vectors to real numbers. We use $p = \max_{i \in N} |S_i|$ to denote the maximum number of strategies for any player. We make the standard assumption that demands are non-negative integer vectors \citep{panagopoulou_algorithms_2007,dunkel_complexity_2008,christodoulou2023existence}.

Given a strategy profile $\mathbf{s} = (s_1, ..., s_n) \in S = S_1 \times ... \times S_n$ of $n$ players, let $\mathbf{x}_r(\mathbf{s}) = \sum_{i \in N : r \in s_i} \mathbf{d}_i$ be the aggregated $k$-dimensional demand vector of the players who select resource $r$ under the strategy profile $\mathbf{s} \in S$. Naturally, given a strategy profile $\mathbf{s}$, 
%$ = (s_1, ..., s_n) \in S$, \jared{(We do not need to say $\mathbf{s} = (s_1, ..., s_n) \in S$ twice, but can instead say $\mathbf{s}\in S$)}
%\in S_1 \times ... \times S_n$, 
the cost function of player $i$ is defined to be $\pi_i(\mathbf{s}) = \pi_i(s_i, \mathbf{s}_{-i}) = \sum_{r \in s_i} c_r(\mathbf{x}_r(\mathbf{s}))$, i.e., the sum of the costs of the resources selected by player $i$ under $s_i$, given others' strategies $\mathbf{s}_{-i}$. 
%\begin{align*} 
%\end{align*}
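
As a small illustration of these definitions (the instance is hypothetical and chosen only for concreteness), consider a $2$-DCG with $N = \{1, 2\}$, $R = \{1, 2\}$, demand vectors $\mathbf{d}_1 = (1, 0)$ and $\mathbf{d}_2 = (1, 1)$, strategy sets $S_1 = S_2 = \{\{1\}, \{2\}\}$, and the same cost function $c_r(\mathbf{y}) = y_1 + 2 y_2$ for both resources $r \in R$. Under the strategy profile $\mathbf{s} = (\{1\}, \{1\})$, both players select resource $1$, so
\[
\mathbf{x}_1(\mathbf{s}) = \mathbf{d}_1 + \mathbf{d}_2 = (2, 1), \qquad \mathbf{x}_2(\mathbf{s}) = (0, 0),
\]
and hence
\[
\pi_1(\mathbf{s}) = \pi_2(\mathbf{s}) = c_1\big((2, 1)\big) = 2 + 2 \cdot 1 = 4.
\]
Under $\mathbf{s}' = (\{1\}, \{2\})$, the players use different resources, so $\mathbf{x}_1(\mathbf{s}') = \mathbf{d}_1 = (1, 0)$ and $\mathbf{x}_2(\mathbf{s}') = \mathbf{d}_2 = (1, 1)$, which gives
\[
\pi_1(\mathbf{s}') = c_1\big((1, 0)\big) = 1, \qquad \pi_2(\mathbf{s}') = c_2\big((1, 1)\big) = 1 + 2 \cdot 1 = 3.
\]
Note that the two players contribute differently to the cost of a shared resource because their demand vectors differ, which is exactly what distinguishes a $k$-DCG from an unweighted congestion game.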
 

%For convenience, we sometimes specify a $k$-dimensional congestion game 
%Therefore, a $k$-DCG is a tuple $(N, R, \{S_i, \mathbf{d}_i\}_{i \in N}, \{c_r\}_{r \in R}, k)$. 
%Therefore, 1-DCG refers to a standard instance of a weighted congestion game. 

We are interested in computing PSNE in $k$-DCGs and their variants listed below. We present these variants with motivating examples from the domain of load balancing in distributed systems \citep{k_dim_real_world_example_1,k_dim_real_world_example_2,k_dim_real_world_example_3}. $k$-DCGs naturally model various dimensions of user demands in distributed systems, such as bit rates, latency, error tolerance, and throughputs. 
\begin{itemize}
    \item $k$-DCGs with binary demand vectors $\mathbf{d}_i \in {\{0, 1\}}^k$   $\forall i$. Example: data flow in distributed systems can be short-lived or long-lived, bursty or deterministic, etc.
    
    \item $k$-class congestion games ($k$-CCGs), where each demand vector has exactly one positive element, the rest being zeros. Example: different use-cases (each with its own traffic pattern), 
    such as streaming, video conferencing, web browsing, etc.
    
    \item $k$-DCGs with player types, where players of the same type are characterized by the same demand vector. Example: categories of traffic on a campus network: VPN, student access, scientific computation, etc.
\end{itemize}

We next define PSNE and approximate PSNE, the two solution concepts of interest.

%insert PSNE definition as definition environment
\begin{definition} (Pure-Strategy Nash Equilibrium (PSNE)) \mylabel{def:psne}
A strategy profile $\mathbf{s}^* = (s^*_1, ..., s^*_n) \in S$ is a pure-strategy Nash equilibrium (PSNE) in a $k$-DCG if and only if for each player $i \in N$ and any $s'_i \in S_i$, we have that 
%\begin{align*}
$\pi_i(\mathbf{s}^*) \le \pi_i(s'_i, \mathbf{s}^*_{-i}).$ 
%\end{align*}
\end{definition}

%As we will show in Section \ref{sec:complexity}, determining the existence of a PSNE is NP-complete. Therefore, we are also interested in the existence of an approximate PSNE under the following generalized definition.
%\jared{Do you mean, ``We are also interested in the existence of an approximate PSNE.''}
%We are also interested in approximate PSNE defined below. % under the following generalized definition.
\begin{definition} (($\alpha, \beta$)-PSNE) \mylabel{def:alpha_beta_psne}
A strategy profile $\mathbf{s} = (s_1, ..., s_n) \in S$ is an ($\alpha, \beta$)-approximate PSNE in a $k$-DCG for some $\alpha \ge 1$ and $\beta \ge 0$ if and only if for each player $i \in N$ and any $s'_i \in S_i$, we have that 
%\begin{align*}
$\pi_i(\mathbf{s}) \le \alpha \pi_i(s'_i, \mathbf{s}_{-i}) + \beta$. 
%\end{align*}
When we mention $\alpha$-PSNE (without $\beta$), we mean $\beta = 0$.
\end{definition}
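
To illustrate both definitions on a toy instance (hypothetical, for concreteness only), take a $2$-DCG with two players, two resources, demand vectors $\mathbf{d}_1 = (1, 0)$ and $\mathbf{d}_2 = (1, 1)$, strategy sets $S_1 = S_2 = \{\{1\}, \{2\}\}$, and cost functions $c_r(\mathbf{y}) = y_1 + 2 y_2$ for $r \in \{1, 2\}$. The four strategy profiles yield the following costs:
\[
\begin{array}{c|cc}
\mathbf{s} & \pi_1(\mathbf{s}) & \pi_2(\mathbf{s}) \\ \hline
(\{1\}, \{1\}) & 4 & 4 \\
(\{1\}, \{2\}) & 1 & 3 \\
(\{2\}, \{1\}) & 1 & 3 \\
(\{2\}, \{2\}) & 4 & 4
\end{array}
\]
The profile $(\{1\}, \{2\})$ is a PSNE: a unilateral deviation by player $1$ leads to $(\{2\}, \{2\})$ and raises her cost from $1$ to $4$, while a unilateral deviation by player $2$ leads to $(\{1\}, \{1\})$ and raises his cost from $3$ to $4$. By symmetry, $(\{2\}, \{1\})$ is also a PSNE. In contrast, $\mathbf{s} = (\{1\}, \{1\})$ is not a PSNE, since player $1$ can lower her cost from $4$ to $1$ by moving to resource $2$. It is, however, an $(\alpha, \beta)$-PSNE whenever
\[
4 \le \alpha \cdot 1 + \beta \quad \text{(player $1$)} \qquad \text{and} \qquad 4 \le \alpha \cdot 3 + \beta \quad \text{(player $2$)},
\]
that is, whenever $\alpha + \beta \ge 4$ (the second inequality then follows from $\alpha \ge 1$). For instance, $\mathbf{s}$ is both a $4$-PSNE and a $(1, 3)$-PSNE.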

% We will use the following definition in Section~\ref{sec:learning} for designing algorithms based on potential functions.
\end{paperonly}

\begin{paperonly}
%For the remainder of the paper, we assume that the representations of (1-dimensional) weighted congestion games (1-W) use integer weights, as Dunkel and Schulz did when showing that determining the existence of PSNE in 1-W to be NP-Complete \citep{dunkel_complexity_2008}.
%We also assume that those weights are represented as unsigned binary numbers and each number uses the same number of bits.

%\hau{we need to say about the representation size; the number of players plus the number of actions for each player; maybe it is not important now}



%\hau{we need to be consistent about how we call the cost function; do we say congestion function? delay function for each resource?}


\subsubsection*{Constraint Satisfaction Problem (CSP)}
A CSP is specified by a set of \emph{variables}, a \emph{domain} for each variable, and a set of \emph{constraints}, each constraint being over a subset of variables known as its \emph{scope}. A CSP asks us to assign to each variable a value from its domain so that all the constraints are satisfied. A wide range of problems, such as Boolean satisfiability, map coloring, scheduling, and even PSNE computation in games, can be modeled as CSPs \citep{dechter2003constraint,gottlob2003pure}. 

We often represent the structural information of a (primal) CSP using a \emph{primal constraint network}, where each node represents a variable, and each edge connects two variables that appear together in a constraint (potentially with other variables). As a result, unless the constraints are binary, we cannot identify the scope of a constraint just by looking at the primal constraint network. 

A CSP also has a \emph{dual constraint network}, where each node (called a \emph{dual variable}) represents a primal constraint, and each edge connects two constraints with shared variables in their scopes and is labeled with these shared variables.
%\jared{(Could we add one sentence right here about the advantage of a dual CSP?)}
The dual constraint network leads to the \emph{dual CSP}, where the domain of each dual variable is computed as follows: Consider its corresponding primal constraint and assign values to the scope of the primal constraint to satisfy it. Such assignments constitute the domain of the dual variable. Furthermore, the dual CSP enforces the edge-wise dual constraint that each primal variable shared between any two dual variables must have the same value in both. Therefore, the dual CSP is a reformulation of the primal CSP and contains only binary constraints.
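
As a toy illustration of this construction (the instance is hypothetical and unrelated to the games studied later), consider a primal CSP with Boolean variables $x, y, z, w \in \{0, 1\}$ and two constraints: $C_1$ with scope $\{x, y, z\}$ requiring $x + y + z = 2$, and $C_2$ with scope $\{z, w\}$ requiring $z \neq w$. The primal constraint network has edges $x$--$y$, $x$--$z$, $y$--$z$, and $z$--$w$, from which one cannot tell whether $x, y, z$ are tied by a single ternary constraint or by three binary ones. The dual CSP has one variable per constraint, with domains
\[
D(C_1) = \{(1, 1, 0),\, (1, 0, 1),\, (0, 1, 1)\}, \qquad D(C_2) = \{(0, 1),\, (1, 0)\},
\]
where the tuples list the values of $(x, y, z)$ and $(z, w)$, respectively. The only dual edge joins $C_1$ and $C_2$, is labeled with the shared variable $z$, and requires the two tuples to agree on $z$. The satisfying dual assignments are therefore
\[
\big((1, 1, 0), (0, 1)\big), \qquad \big((1, 0, 1), (1, 0)\big), \qquad \big((0, 1, 1), (1, 0)\big),
\]
which correspond exactly to the primal solutions $(x, y, z, w) \in \{(1, 1, 0, 1),\, (1, 0, 1, 0),\, (0, 1, 1, 0)\}$.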

%The CSP literature is extremely rich with algorithms that utilize the primal and dual constraint networks \citep{dechter2003constraint}. These algorithms often work by ordering the variables in a judicious way and exploring partial assignments to a subset of variables with the goal of arriving at a complete and satisfying assignment if it exists. 



\end{paperonly}

\begin{appendixonly}
    \setcounter{section}{2}
\end{appendixonly}
 
\section{Computational Complexity} 
\mylabel{sec:complexity}


\begin{paperonly}
We show that deciding the existence of a PSNE in special variants of $k$-DCGs is NP-complete. The NP-hardness of general $k$-DCGs is not surprising because deciding the existence of a PSNE in weighted congestion games (i.e., when $k = 1$) is already strongly NP-complete \citep{dunkel_complexity_2008}. 

What is surprising is that determining the existence of a PSNE in $k$-DCGs remains NP-complete even when each player $i$'s $k$-dimensional demand vector $\mathbf{d}_i$ is a binary vector (or even a unit vector) for some polynomially bounded $k$. %That is, $\mathbf{d}_i \in \{0, 1\}^k$. 
%Our result resolves a question for determining the complexity of computing a PSNE for $k$-DCG. 
%\jared{Which open question is this referring to? The only open question I'm aware of for $k$-DCG is the one in \citep{klimm_equilibria_2022} and they are looking for ``a complete characterization of the cost functions that guarantee the existence of pure Nash equilibria,'' not complexity results.}
In sharp contrast, there is always a PSNE in unweighted (1-dimensional) congestion games \citep{rosenthal_class_1973}. Furthermore, 
%relevant to our study of $k$-DCG, 
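if all players have the same demand vector $\mathbf{d}$, then the aggregated demand $\mathbf{x}_r(\mathbf{s})$ on each resource $r$ equals $\mathbf{d}$ times the number of players selecting $r$, so the game reduces to an unweighted congestion game and is guaranteed to have a PSNE. We have the following result.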

%In the following NP-hardness reduction, we reduce the standard weighted congestion games to $k$-DCGs with binary demand vectors. 
%xxxxxxx delete the next line if space is an issue
%We then argue that a PSNE in one game can be translated into a PSNE of another. 

%We do this by leveraging the idea of isomorphism, which allows us to argue the equivalence of PSNE of two games. %any two isomorphic games.  
%%More specifically, 
%Below is the definition of isomorphic games given by %Monderer and Shapley 
%\citet{potential_games}. 
%
%\begin{definition}[Isomorphic Strategic Games] \mylabel{def:isomorphism}
%Two strategic games $B = (N, (A_i, \pi_i)_{i \in N})$ and $B' = (N, (\Hat{A}_i, \Hat{\pi}_i)_{i \in N})$ are isomorphic if there exist bijections $g_i: A_i \rightarrow \Hat{A}_i$ for all $i \in N$ such that $\pi_i(a_1, \cdots, a_n) = \Hat{\pi}_i(g_1(a_1), \cdots, g_n(a_n))$ for every $\mathbf{a} \in \mathbf{A}, i \in N$.
%\end{definition}
%
%\begin{lemma}
%We need to add a lemma or something that say that when two games are isomorphic, then they have the same sets of PSNE ... 
%\end{lemma}

%\hau{I don't know about for the remainder of this paper; we don't need for hardness as it is for an instance which implies it is true for any instances whether it is integer weights or not}

%Recall the definition of isomorphic games as given by \cite{potential_games}. 




%\jared{These theorems are equivalent. Which do you like better?
%Klimm and Schultz use the ``unweighted'' term so I lean toward that.}

%\hau{In the following result, $k$ is polynomially bounded whereas Theorem 4.4 is not} 
\end{paperonly}

\begin{appendixonly}
    \setcounter{theorem}{2}
\end{appendixonly}

\begin{theorem} \mylabel{theorem:sublinear}
Deciding the existence of a PSNE in a $k$-DCG is NP-complete even when the demand vector $\mathbf{d}_i$ of each player $i \in N$ is a binary vector and $k$ is sublinear in the number of players. 
That is, $\mathbf{d}_i \in \{0,1\}^k$ for all $i$ and $k = \mathcal{O}(\log n)$. 
\end{theorem}

\begin{paperonly}
\begin{proofsketch}
The problem is in NP because verifying that a 
strategy profile $\mathbf{s}^* \in S$ is a 
PSNE takes polynomial time.
%For each of the $n$ players we must only check $|S_i|$ strategies each for $i \in N$ to ensure that $s_i^*$ is a best response.
%This can be done in some polynomial of the game.
For NP-hardness, we reduce from weighted congestion games \citep{dunkel_complexity_2008}.
Given a weighted congestion game, we construct a $k$-DCG with identical sets of players, resources, and strategies.
In the $k$-DCG, each player's binary demand vector is the binary representation of that player's integer weight in the weighted congestion game.
% For example, if a player in the weighted congestion game has the integer weight of 5, then the equivalent player in the $k$-DCG game would have a demand vector of 101.
The length of the demand vectors is set to $k = \lfloor \log_2 \max_{i \in N} \widetilde{d}_i \rfloor + 1$, where $\widetilde{d}_i$ is the integer weight of player $i$ in the weighted congestion game.
%This $k$ allows the constructed $k$-DCG game to accommodate the largest player integer weight.
Finally, we construct cost functions for the $k$-DCG under which every player incurs the same cost as in the weighted congestion game for every strategy profile.
%Because of this, 
Therefore, a strategy profile is a PSNE in one game if and only if it is a PSNE in the other game.
%Finally, this whole construction can be done in polynomial time.
\end{proofsketch}
\end{paperonly}

\begin{appendixonly}
\begin{proof}
We first show that the problem is in NP. 
In particular, given any strategy profile $\mathbf{s} \in S$, we show that it can be verified in polynomial time that the profile is a PSNE. 
Observe that, according to Definition \ref{def:psne}, it suffices to check the unilateral deviations of each player $i \in N$. 
Since each player $i \in N$ has $|S_i|$ strategies and there are $n$ players, the verification takes time polynomial in the size of the representation of the game.\footnote{
Note that there are special types of 1-DCG (e.g., network congestion games) with more compact representations \citep{dunkel_complexity_2008}. 
The verification question for these types of games is in NP, and our hardness proof still holds for them.} 
%(i.e., the number of players, the sizes of the strategy sets, and the demands)
%Because these special types and NP membership arguments can be directly defined for and applied to $k$-DCGs, respectively, our hardness result/reduction can be applied more generally for these types of $k$-DCGs. %considered in \citep{}. 
%}. 

Next, we show that the considered problem is NP-hard by reducing from the problem of determining a PSNE in a weighted congestion game, which is known to be strongly NP-hard even for weighted network congestion games \citep{dunkel_complexity_2008}. 

More specifically, given a 1-DCG $\left(\widetilde{N}, \widetilde{R}, \{\widetilde{S}_i, \widetilde{d}_i\}_{i \in \widetilde{N}}, \{\widetilde{c}_r\}_{r \in \widetilde{R}}\right)$ with integer weights/demands bounded by some polynomial in the number of players (which we may assume by strong NP-hardness), we construct a $k$-DCG $(N, R, \{S_i,$ $\mathbf{d}_i\}_{i \in N}, \{c_r\}_{r \in R}, k)$ as follows: 
\begin{itemize}
\item Let $N = \widetilde{N}$ be the same set of $n$ players; 
\item Let $R = \widetilde{R}$ be the same set of $m$ resources; 
\item For each player $i \in N$, let $S_i = \widetilde{S}_i$ be the same set of strategies; 
\item Let $k = \lfloor \log_2 \max_{i \in N} \widetilde{d}_i \rfloor + 1$ be the maximum length of the binary representations of the weights; 
\item For each player $i \in N$, let $\mathbf{d}_i = (d_{i_k}, ..., d_{i_1})$ be the binary demand vector induced by the binary representation of $\widetilde{d}_i$; % of length $k$; 
\item For each resource $r \in R$ and $\mathbf{x} = (x_k, ..., x_1) \in \{0,1, ..., n\}^k$, we define 
%\begin{align*}
$c_r(\mathbf{x}) = \widetilde{c}_r\left(\sum_{j=1}^k 2^{(j-1)} x_{j} \right)$. 
%\end{align*}
\end{itemize}

%Following from the above construction, 
Then for all $r \in R$, $\mathbf{s} = (s_1, ..., s_n) \in S$, and $\mathbf{x}_r(\mathbf{s}) = \sum_{i \in N; r \in s_i} \mathbf{d}_i$, %we have 
\begin{align*}
c_r&(\mathbf{x}_r(\mathbf{s})) = c_r\left(\sum_{i \in N; r \in s_i} \mathbf{d}_i\right)  \\
&= c_r\left(\sum_{i \in N; r \in s_i} (d_{i_k}, ..., d_{i_1})\right) \\
&= \widetilde{c}_r\left(\sum_{j=1}^k 2^{(j-1)} \sum_{i \in N; r \in s_i} d_{i_j}\right) \\
&= \widetilde{c}_r\left(\sum_{i \in N; r \in s_i} \sum_{j=1}^k 2^{(j-1)} d_{i_j}\right) \\
&= \widetilde{c}_r\left(\sum_{i \in N; r \in s_i} \widetilde{d}_i\right) 
= \widetilde{c}_r\left( \widetilde{x}_r(\mathbf{s}) \right), 
\end{align*}
where $\widetilde{x}_r(\mathbf{s}) =  \sum_{i \in N; r \in s_i} \widetilde{d}_i$. The first equality is by the definition of $\mathbf{x}_r(\mathbf{s})$, the second is by the definition of $\mathbf{d}_i$, the third is by our construction of the cost function, the fourth follows by exchanging the order of summation, the fifth holds because $\mathbf{d}_i$ is the binary representation of $\widetilde{d}_i$ by our construction, and the last is by the definition of $\widetilde{x}_r(\mathbf{s})$. 
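
For concreteness, consider a small illustrative instance (the numbers are hypothetical and chosen only for exposition). Suppose exactly two players use resource $r$ under $\mathbf{s}$, with weights $\widetilde{d}_1 = 5$ and $\widetilde{d}_2 = 3$, and that $5$ is the largest weight in the game. Then $k = \lfloor \log_2 5 \rfloor + 1 = 3$, $\mathbf{d}_1 = (1, 0, 1)$, and $\mathbf{d}_2 = (0, 1, 1)$, so that
\begin{align*}
\mathbf{x}_r(\mathbf{s}) &= \mathbf{d}_1 + \mathbf{d}_2 = (1, 1, 2), \\
c_r(\mathbf{x}_r(\mathbf{s})) &= \widetilde{c}_r\left(2^{2} \cdot 1 + 2^{1} \cdot 1 + 2^{0} \cdot 2\right) = \widetilde{c}_r(8) = \widetilde{c}_r\left(\widetilde{d}_1 + \widetilde{d}_2\right).
\end{align*}
Note that the last entry of $\mathbf{x}_r(\mathbf{s})$ equals $2$: aggregated demands need not be binary even though individual demands are, which is why $c_r$ is defined on $\{0, 1, ..., n\}^k$.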

Because of the above equivalence, we have $\pi_i(\mathbf{s}) = \sum_{r \in s_i} c_r(\mathbf{x}_r(\mathbf{s})) = \sum_{r \in s_i} \widetilde{c}_r(\widetilde{x}_r(\mathbf{s})) = \widetilde{\pi}_i(\mathbf{s})$ for all $i \in N$ and $\mathbf{s} \in S$. %it is not hard to verify that 
Hence, a strategy profile is a PSNE in the 1-DCG instance if and only if it is a PSNE in the $k$-DCG instance. 

% for each x in the domain of widetilde{c}_r, we define binary of x of length k and assign it to c_r
% k should be the sum of the d_i 

% Extra Note: not in the proof
% In Theorem 3.1 of the strongly NP-hard reduction, they have 3m (each with a weight of the three partition instance numbers) + 3 players (with weights \mathcal{O}(m*B) where B is the budget; so their n = 3m + 3 (which is the same as our n); 
% from Garey and Johnson, they did a strongly NP-hard (page 99) proof to 3-partition from 4-partition where the max item weight from the 4-partition is 2^16 * |A|^4 (instance size)
% so here B = \mathcal{O}(|A|^4) is at most |A|* 2^16 * |A|^4 
% from a 4 partition to a 3 partition of size 4n, the 3 partition has 24n^2 - 3n elements and weight for each element is 4*(5B + s(a_i))+ 1  which is about \mathcal{O}(|A|^4) and B = \mathcal{O}(|A|^4)

We finally note that the reduction can be done in polynomial time. 
First, because the problem for 1-DCGs is strongly NP-hard,  %the number of players and 
we may assume that the demands are bounded by a polynomial in $n$. 
%(as the integers from the 3-partition instance used in the 1-DCG strongly NP-hard reduction are bounded by the number of integers \citep{Garey:1990aa}).
Hence, $k = \mathcal{O}(\log n)$ is sublinear in the number of players by construction. 
Second, converting an integer $x$ to its binary representation takes $\mathcal{O}(\log x)$ steps by repeatedly dividing $x$ by $2$ and storing the remainders. 
Thus, the $\mathbf{d}_i$'s can be constructed in polynomial time. 
\end{proof}
\end{appendixonly}
%Note (1): converting an integer $x$ to its binary representation can be done in $\mathcal{O}(\log x)$ by repeating dividing $x$ and storing its reminders. 
%Note (2): $k$ is bounded by $\mathcal{O}(\log n)$ 


%(where $\mathbf{d}_i = (d_{i,1})$) 


%to a $k$-DU game $(N, \{S_i, \mathbf{d}^\prime_i\}_{i \in N}, \{c^\prime_r\}_{r \in R})$.


%\begin{theorem}
%Determining the %existence of PSNE %in $k$-dimensional %congestion games is %NP-complete even %when the game is %unweighted.
%\end{theorem}
  

%\hau{fix this proof to include (1) strongly NP-hard, (2) polynomial bounded, (3) poly computation time}

%\hau{need to show, under the new function, if there are two subsets of agents that sum up to the same value under the transformation, then it must be the case the value of the new function maps to some common value of the sum of the same  un-transformed value of the two subsets; That is, show the function is well-defined}

%\begin{proof}
%Verifying a strategy profile $\mathbf{a}$ is a PSNE can be done in polynomial time, because each player $i \in N$ need only check $|A_i| - 1$ deviating strategies.
%
%Now we reduce a 1-W game $G = (N, \{S_i, \mathbf{d}_i\}_{i \in N}, \{c_r\}_{r \in R})$ (where $\mathbf{d}_i = (d_{i,1})$) to a $k$-DU game $G^\prime = (N, \{S_i, \mathbf{d}^\prime_i\}_{i \in N}, \{c^\prime_r\}_{r \in R})$.
%
%
%Let $k$ be equal to the number of bits used to represent weights in $G$.
%Let $d^\prime_{i,j}$ be equal to the $j$th bit of $d_{i,1}$ for $i \in N, j \in [k]$.
%Let $c^\prime_r(s) = c_r(f(\mathbf{x}^\prime_r(s)))$ for $r \in R$ \mohammad{left hand side seems a bit inconsistent because the cost function is defined as $c_r: \mathbb{R}^k \to \mathbb{R}$. Write $c^\prime_r(x'_r(s))$?}.
%Let $f(\mathbf{x}^\prime_r(s)) = \sum_{j \in [k]} 2^{j - 1} * x_{r,j}$, where $x_{r, j}$ is the $j$th element of $\mathbf{x}^\prime_{r,j}(s)$.
%It is clear to see that $f$ converts a $k$-dimensional vector of integers to a single integer.
%
%Furthermore that $\mathbf{x}_r(s) = f(\mathbf{x}^\prime_r(s))$ for all $r \in R, s \in S$, because addition has the commutative property (\textit{i.e.} It does matter if all the powers of two represented in an unsigned integer are first summed together or if every $j$th power of two is first summed with every power of two of the same $j$.).
%This implies that for every player $i \in N$ and strategy profile $s \in S$, $i$ will receive the same utility whether the game is $G$ or $G^\prime$, therefore $G$ and $G^\prime$ are isomorphic.
%Which means if a strategy profile $s \in S$ is a PSNE for one, it must be for the other, hence if an algorithm can determine the existence of a PSNE for $G^\prime$ then it can for $G$ as well.
%\end{proof} 

%\hau{note that we can also use the same paper to show that our setting is hard for symmetric action space + $c_{ri}$ player specific cost functions + parallel link}

%\jared{Alternatively we can use this theorem.
%The difference is that given a 1-W with a representation size of $\mathcal{O}(n)$, the isomorphic $k$-DU and $k$-CU representation sizes are $\mathcal{O}(\textbf{}n)$ and $\mathcal{O}(n \log n)$ respectively.}

\begin{paperonly}
Next, we investigate whether PSNE computation is easier for restricted demands. %for more restricted settings. 
%the problem of computing a PSNE will be easier if we further restrict the demands. 
Unfortunately, even when the binary demand vector is a unit vector, the problem remains hard. 
%(i.e., $\mathbf{d}_i \in \{ \mathbf{x} \in \{0,1\}^k; \sum_{j=1}^k x_j = 1\}$), the problem of computing a PSNE is hard. Note that this makes the game a $k$-CCG with binary demand vectors. %Please see the Appendix for proof.
\end{paperonly}

\begin{theorem}
Deciding the existence of a PSNE in a $k$-DCG (or a $k$-CCG) is NP-complete even when the demand vector $\mathbf{d}_i$ of each player $i \in N$ is a binary unit vector and $k$ is linear in the number of players. 
That is, $\mathbf{d}_i \in \{ \mathbf{x} \in \{0,1\}^k; \sum_{j=1}^k x_j = 1\}$ for all $i$ and $k = \mathcal{O}(n)$. 
%Determining the existence of PSNE in $k$-dimensional congestion games is NP-complete even when the game is \jared{$k$-class} unweighted.
\end{theorem}

\begin{paperonly}
\begin{proofsketch}
%Following the same argument as Theorem \ref{theorem:sublinear}, 
The problem is clearly in NP. The NP-hardness reduction is from weighted congestion games. % \citep{dunkel_complexity_2008} to our problem. %from the problem of determining a PSNE in a weighted congestion game, which is known to be strongly NP-hard even for weighted network congestion games . 
% Details are in the Appendix.
\end{proofsketch} 
\end{paperonly}

\begin{appendixonly}
\begin{proof} 
%Following the same argument as Theorem \ref{theorem:sublinear}, 
The problem is clearly in NP because we can verify a PSNE in polynomial time.

To show the problem is NP-hard, we reduce from the problem of determining a PSNE in a weighted congestion game, which is known to be strongly NP-hard even for weighted network congestion games \citep{dunkel_complexity_2008}. 

More specifically, given a 1-DCG $(\overline{N}, \overline{R}, \{\overline{S}_i, \overline{d}_i\}_{i \in \overline{N}}, \{\overline{c}_r\}_{r \in \overline{R}})$ with integer weights/demands bounded by some polynomial in the number of players (which we may assume by strong NP-hardness), we construct a $k$-DCG $(N, R, \{S_i,$ $\mathbf{d}_i\}_{i \in N}, \{c_r\}_{r \in R}, k)$ as follows: 

\begin{itemize}
\item Let $N = \overline{N}$ be the same set of $n$ players; 
\item Let $R = \overline{R}$ be the same set of $m$ resources; 
\item For each player $i \in N$, let $S_i = \overline{S}_i$ be the same set of strategies; 
\item Let $k = n$ be the number of dimensions corresponding to the number of players; 
\item For each player $i \in N$, let $\mathbf{d}_i$ 
%= (0, ..., 1, 0, ..., 0)$ 
be the binary unit vector of length $k$ whose $i^{th}$ entry is one and whose other entries are all zero; 
\item For each resource $r \in R$ and $\mathbf{x} = (x_1, ..., x_k) \in \{0,1\}^k$, we define 
\begin{align*}
c_r(\mathbf{x}) = \overline{c}_r\left(\sum_{j=1}^k \overline{d}_j x_{j} \right). 
\end{align*}
\end{itemize}

Following from the above construction, for all $r \in R$, strategy profile $\mathbf{s} = (s_1, ..., s_n) \in S$, and $\mathbf{x}_r(\mathbf{s}) = \sum_{i \in N; r \in s_i} \mathbf{d}_i$, we have  
\begin{align*}
c_r(\mathbf{x}_r(\mathbf{s})) &= c_r\left(\sum_{i \in N; r \in s_i} \mathbf{d}_i\right)  \\
&= c_r\left(\sum_{i \in N; r \in s_i} (d_{i_1}, ..., d_{i_k})\right) \\
&= \overline{c}_r\left(\sum_{j=1}^k \overline{d}_j \sum_{i \in N; r \in s_i} d_{i_j}\right) \\
&= \overline{c}_r\left(\sum_{j=1}^k \overline{d}_j \mathbbm{1}[ r \in s_j ] \right) \\
%&= \overline{c}_r\left(\sum_{i \in N; r \in s_i} \sum_{j=1}^k 2^{(j-1)} d_{i_j}\right) \\
&= \overline{c}_r\left(\sum_{i \in N; r \in s_i} \overline{d}_i\right) \\
&= \overline{c}_r\left( \overline{x}_r(\mathbf{s}) \right), 
\end{align*}
where $\overline{x}_r(\mathbf{s}) =  \sum_{i \in N; r \in s_i} \overline{d}_i$ and $\mathbbm{1}[ r \in s_j ]$ is an indicator function that returns 1 if $r \in s_j$ and 0 otherwise. 
The first equality is by the definition of $\mathbf{x}_r(\mathbf{s})$, the second is by the definition of $\mathbf{d}_i$, the third is by our construction of the cost function, the fourth holds because, in each dimension $j$, player $j$ is the only player with a nonzero demand, the fifth follows by keeping only the players that use $r$, and the last is by the definition of $\overline{x}_r(\mathbf{s})$. 
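
As a small illustration (with hypothetical numbers chosen only for exposition), take $n = k = 3$ players with weights $\overline{d}_1 = 5$, $\overline{d}_2 = 3$, and $\overline{d}_3 = 2$, so that $\mathbf{d}_1 = (1, 0, 0)$, $\mathbf{d}_2 = (0, 1, 0)$, and $\mathbf{d}_3 = (0, 0, 1)$. If exactly players $1$ and $3$ use resource $r$ under $\mathbf{s}$, then
\begin{align*}
\mathbf{x}_r(\mathbf{s}) &= \mathbf{d}_1 + \mathbf{d}_3 = (1, 0, 1), \\
c_r(\mathbf{x}_r(\mathbf{s})) &= \overline{c}_r\left(5 \cdot 1 + 3 \cdot 0 + 2 \cdot 1\right) = \overline{c}_r(7) = \overline{c}_r\left(\overline{d}_1 + \overline{d}_3\right).
\end{align*}
Here $\mathbf{x}_r(\mathbf{s})$ is simply the indicator vector of the set of players using $r$, so the weights are carried entirely by the cost function.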
%$\mathbf{d}_i$ is the binary representation of  $\overline{d}_i$ by our construction. 

Because of the above equivalence, we have $\pi_i(\mathbf{s}) = \sum_{r \in s_i} c_r(\mathbf{x}_r(\mathbf{s})) = \sum_{r \in s_i} \overline{c}_r(\overline{x}_r(\mathbf{s})) = \overline{\pi}_i(\mathbf{s})$ for all $i \in N$ and $\mathbf{s} \in S$. Hence, a strategy profile is a PSNE for one game if and only if it is a PSNE for the other. 
\end{proof}
\end{appendixonly}

\begin{paperonly}
%\begin{proof}
%Given a weighted congestion game, create $k=n$ one for each player, for each player $i$ create a demand vector of all zeros except the $i$ entry, which has a value of one.     
%\end{proof}



%\hau{The isomorphism idea doesn't seem to work too nicely; I think we should switch to like reduction; I think the next step is to see how we can reduce K-Dim to 1-dim to solve the problem or come up with algorithms directly for this; if reduction is not possible, we can start with congestion games with positive results}
%
%
%\hau{Q1: Is it possible to reduce unweighted k-class congestion games to 1-D weighted congestion games at all?}
%\hau{not always I think; if $d_i$ is unique none overlapping, we can}
%
%
%
%
%
%
%\hau{after this theorem; say that in general k-dim congestion games are isomorphic to weighted games; and then present corollaries of existing of PSNE and cite the Klimm paper; then we can say that in general, there are many ways to construct isomorphic games and the question is about what is the best isomorphic games? the reason is that there are many algorithms for congestion games and we can simply convert k-dim congestion games to congestion games; but the runtime of these algorithms depends on game representations; for instance list some algorithms that depend on something}

%\section{}

%Approach \#1: do isomorphism conversion; then apply existing results for congestion games

%first can we show finding the best isomorphism representation is hard under some natural objectives? what about finding approximate isomorphism representation? how hard is this? any algorithms? 

%another question: Given a k-dim congestion games and a congestion game, can we determine whether they are isomorphic or not? 

%The opt problem looks like: Given a k-dim congestion games, find a 1-dim congestion isomorphic games that minimize the sum of the weights of the players (maybe use the decision-version of this)

%maybe show it as hard as showing graph isomorphism or show that it is np-hard using subgraph isomorphism  

\input{charts/tikz_graphs}
\end{paperonly}


\section{General Cost: A CSP Approach} \mylabel{sec:general}
\begin{paperonly}
We can formulate the PSNE computation problem in a $k$-DCG as a CSP, which consists of (1) a variable for each player, (2) the domain of a variable being the corresponding player's strategy set, and (3) a best-response constraint for each player $i$, representing $i$'s best responses $s_i$ to any $\mathbf{s}_{-i}$. % if the player is playing their best response to all other players. 

%As illustrated in Fig.~\ref{fig:csp}(a) and (b), the nature of the $n$-ary best-response constraints means that not only the primal as well as the dual constraint networks are complete networks but also all the players appear on each edge labeling in the dual network.
As illustrated in Fig.~\ref{fig:csp}(a) and (b), the $n$-ary nature of the best-response constraints means that both the primal and the dual constraint networks are complete networks.
Furthermore, \emph{all} players appear on each edge of the dual network.
This paints a grim picture because it is hard to design efficient algorithms without decoupling the players' strategies. %A straightforward approach to 
For example, one solution approach is to 
%enumerate the set of strategy profiles  $S = S_1 \times ... \times S_n$ and 
check each strategy profile for a PSNE by verifying Definition \ref{def:psne}.  
%This approach is exponential in the number of players. 
%For instance, if
Letting $p = \max_{i \in N} |S_i|$, % be the maximum number of strategies of any player, 
this approach takes $\mathcal{O}(np^{n+1})$ time, which is exponential in the number of players. 
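
To see where this bound comes from, count each cost evaluation as a single step (a simplifying assumption of this sketch). There are at most $p^n$ strategy profiles, and verifying one profile requires checking at most $p$ deviations for each of the $n$ players:
\begin{align*}
\underbrace{p^{n}}_{\text{profiles}} \cdot \underbrace{n}_{\text{players}} \cdot \underbrace{p}_{\text{deviations}} = n p^{n+1}.
\end{align*}
For instance, with $n = 20$ and $p = 3$ (illustrative values), there are already $3^{20} = 3{,}486{,}784{,}401$ profiles to check.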

The grave computational implication of not decoupling the players' strategies leads us to a key technical insight.
Instead of using the above CSP, we first construct a different CSP for $k$-DCGs and then consider its dual. In the new CSP, the variables are the players and the \emph{configuration} $Y$ of the game. The domain of each player $i$ is their strategy set $S_i$ and that of $Y$ is the set of all $k$-dimensional aggregated demand vectors for $m$ resources, $\mathbf{y} \equiv (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$. There are $n$ binary constraints, each capturing a player's \emph{best response to a configuration}. We use the structure of $k$-DCGs to define such best responses: For any configuration $\mathbf{y}$, a player $i$'s best-response strategies are $s_i \in S_i$ that minimize the cost $\sum_{r \in s_i} c_r(\mathbf{y}_r)$. There is an additional \emph{feasibility constraint} that enforces that the strategy profile $\mathbf{s}$ assigned to the players leads to the aggregated demand vectors $(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$ assigned to $Y$; i.e., $\mathbf{x}_1(\mathbf{s}) = \mathbf{y}_1, \mathbf{x}_2(\mathbf{s}) = \mathbf{y}_2, ..., \mathbf{x}_m(\mathbf{s}) = \mathbf{y}_m$. An example of the primal constraint network for this CSP is shown in Fig.~\ref{fig:csp}(c) and its dual in Fig.~\ref{fig:csp}(d). Most notably, as elaborated in the next paragraph, the dual CSP allows us to decouple the players' strategies from each other. 

\emph{To our knowledge, this dual CSP, which grounds our algorithmic framework, has not been studied in the congestion games literature before.} %Our algorithmic framework uses this dual CSP. 
To formalize this dual CSP, recall that each dual node corresponds to a primal constraint. Thus, there is a dual node $v_{i,Y}$ for each player $i$'s best response to the configuration variable $Y$, and there is one dual node $v_{N, Y}$ for the feasibility constraint, which ensures that the strategies assigned to the players lead to the aggregated demands assigned to $Y$ (see Fig.~\ref{fig:csp}(d)). For each $i \in N$, there is an edge between $v_{N, Y}$ and $v_{i,Y}$ labeled with the shared variables $i, Y$. For any $i \neq j \in N$, there is an edge between $v_{i, Y}$ and $v_{j, Y}$ labeled with the shared variable $Y$. Unlike the straightforward dual (Fig.~\ref{fig:csp}(b)), this new dual (Fig.~\ref{fig:csp}(d)) decouples the players' strategies because no edge carries all the players together. 

%XXXXXX Can be left out:
We devise algorithms based on this dual CSP. As described in Section~\ref{sec:prelim}, each dual variable has a domain consisting of satisfying assignments for the corresponding primal constraint, and the edges in the dual constraint network lead to dual constraints that ensure that the shared primal variables across any edge are assigned the same value in both endpoints of the edge.

Before presenting our algorithmic framework, we show that the dual CSP 
%(see Fig.~\ref{fig:csp}(d)) 
has a solution if and only if there is a PSNE. To see why, note that the assignments $(s_i, \mathbf{y})$ made to the $v_{i,Y}$ variables capture the players' best responses to $\mathbf{y}$, and the edge label between any two $v_{i,Y}$ and $v_{j,Y}$ variables enforces sharing the same $\mathbf{y}$ in these assignments. Furthermore, the assignment $(\mathbf{s}, \mathbf{y}')$ made to the $v_{N, Y}$ variable makes sure that the strategy profile $\mathbf{s}$ leads to the configuration $\mathbf{y}'$, and the labels on the edges connecting $v_{N, Y}$ to $v_{i, Y}$ enforce that $\mathbf{s} = (s_1, \cdots, s_n)$ and $\mathbf{y} = \mathbf{y}'$. Hence, in any solution, every $s_i$ is a best response to the configuration induced by $\mathbf{s}$ itself, which is exactly the PSNE condition; conversely, any PSNE together with the configuration it induces yields a solution. 

Our algorithmic framework consists of two procedures. Procedure 1 computes the domain of each dual variable $v_{i,Y}$, and Procedure 2 searches for a solution using the computed domains. 
%XXXXX Add if space permits
\iffalse 
Said differently, in Procedure 1, we are given a configuration and asked to find the best responses of the players with respect to that configuration. We are not yet worried about the feasibility of the given configuration. In Procedure 2, we are given a configuration and asked to pick a best response for each player (computed in Procedure 1) so that the resulting strategy profile leads to aggregated demands that exactly match the given configuration.
\fi 
% assuming weight vectors are positive integer vectors
%Recall that in Section \ref{sec:complexity}, we show that determining the existence of a PSNE in multi-dimensional congestion games is NP-complete. 
\iffalse
A straightforward approach to determining a PSNE in any game is to enumerate the set of strategy profiles  (i.e., $S = S_1 \times ... \times S_n$) of $n$ players and check whether each strategy profile is a PSNE (by verifying Definition \ref{def:psne}).  
However, this approach is exponential in the number of players. 
For instance, if $p = \max_{i \in N} |S_i|$ is the maximum number of strategies of any player, then the above enumeration approach would take $\mathcal{O}(np^{n+1})$ time.  
%Therefore, our goal is to investigate whether we can develop an algorithm that is more efficient for determining a PSNE in $k$-DCGs. 
In this section, we propose a computational framework that explores the space that results from strategy profiles, instead of exploring the space of strategy profiles. 
%dual space to the space of strategy profiles. This dual space, which 
We call the space of our interest the \emph{configuration space}, which consists of the aggregated demand vectors of all the players. \emph{To our knowledge, we are the first to devise this computational framework for congestion games.}
\fi 
%
As a preview, our algorithms run in time polynomial in $n$ (the number of players), $p$ (the maximum number of strategies of any player), and a maximum weight term (made precise below) when $k$ (the number of dimensions) and $m$ (the number of resources) are bounded. %and some natural parameters of the multi-dimensional congestion game. 
This is useful %for congestion games in which there is a constant number 
when the numbers of resources and strategies are constant but the number of players can be large. 
In fact, even with a constant number of players, determining PSNE existence in a weighted congestion game is already strongly NP-complete \citep{dunkel_complexity_2008}. 



%Before presenting the algorithm, we first observe key characteristics of $k$-DCGs and define necessary terms. 


%XXXXXXX
%The following can provide a CSP-agnostic proof of correctness
\iffalse
Recall that given a strategy profile $\mathbf{s} = (s_1, ..., s_n) \in S$ and $\mathbf{x}_r(\mathbf{s}) = \sum_{i \in N; r \in s_i} \mathbf{d}_i$, the cost function of player $i \in N$ is defined to be 
%\begin{align*} 
$\pi_i(\mathbf{s}) = \pi_i(s_i, \mathbf{s}_{-i}) = \sum_{r \in s_i} c_r(\mathbf{x}_r(\mathbf{s}))$, 
%\end{align*}
which is the sum of the costs of the resources selected by player $i \in N$ under $s_i$ given others' strategies $\mathbf{s}_{-i}$. 
Observe that the aggregated demand vector $\mathbf{x}_r(\mathbf{s})$ for each $r \in R$ is sufficient for each player to evaluate their best responses. 
%, and with abuse of notation, 
%we have that $\pi_i(\mathbf{s}) = \pi_i(\mathbf{x}_1(\mathbf{s}), \mathbf{x}_2(\mathbf{s}), ..., \mathbf{x}_m(\mathbf{s}))$. 
Therefore, in any PSNE $\mathbf{s}^* \in S$, there must be some configuration (i.e., $k$-dimensional aggregated demand for each resource) $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in \mathbb{R}^k$ such that 
$\mathbf{x}_1(\mathbf{s}^*) = \mathbf{y}_1, \mathbf{x}_2(\mathbf{s}^*) = \mathbf{y}_2, ..., \mathbf{x}_m(\mathbf{s}^*) = \mathbf{y}_m$, and $\mathbf{s}^*$ represents mutual best responses of all the players. 
Thus, our main idea is to (1) explore possible configurations $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$ and (2) verify whether a configuration can be mapped to some mutual best responses.
% (2) verify whether they can be mapped to mutual best responses, implying a PSNE. %a strategy profile that is a PSNE. 
\fi 

As Fig.~\ref{fig:csp}(d) shows, $Y$ is shared across all edges. Therefore, we parameterize our algorithms by a configuration given as input. This leads to the question of how many configurations there can be.
Assuming, as is standard \citep{dunkel_complexity_2008}, that the demand vector $\mathbf{d}_i = ({d}_{i_1}, ..., {d}_{i_k})$ of each player $i$ is an integer vector, %\footnote{This is a standard assumption \citep{panagopoulou_algorithms_2007,dunkel_complexity_2008,christodoulou2023existence}.} 
we define $w_j = \sum_{i \in N} d_{i_j}$ for each $j=1, ..., k$. Letting $w_{\max} = \max_{j \in [k]} w_j$, we have $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in \{0, ..., w_{\max}\}^k$. 
%we have $\mathbf{y}_r \in \{0, ..., w_{\max}\}^k$ for each $r \in R$.
Thus, we only need to consider at most $(w_{\max} + 1)^{km}$ configurations, which is $\mathcal{O}((w_{\max})^{km})$ for bounded $k$ and $m$. We are now ready for the algorithms. %\hl
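
For a sense of scale, consider illustrative values (hypothetical, chosen only for exposition): $k = 2$, $m = 3$, and $n = 20$ players with binary demand vectors, so that $w_{\max} \le 20$. The number of configurations is then at most
\begin{align*}
(w_{\max} + 1)^{km} \le 21^{6} = 85{,}766{,}121,
\end{align*}
whereas, if each player has $p = 7$ strategies (e.g., all nonempty subsets of the three resources), the number of strategy profiles is $p^{n} = 7^{20} \approx 8 \times 10^{16}$. More importantly, in this binary-demand setting with fixed $k$ and $m$, the first quantity grows polynomially in $n$ while the second grows exponentially.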
%, described for any configuration given as input.
%(i.e., $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in \{0, ..., w_{\max}\}^k$). Given each configuration, we present the following framework to verify whether if it is a PSNE.
%XXXX Makes sense? Omit if not



\iffalse
Our configuration-space framework consists of two building blocks, which we name Procedure 1 and Procedure 2. In Procedure 1, we are given a configuration and asked to find the best responses of the players with respect to that configuration. We are not worried about the feasibility of the given configuration. In Procedure 2, we are given a configuration and the best responses of the players with respect to  that configuration and asked if it is possible to pick a best-response strategy for each player so that the resulting strategy profile leads to aggregated demands on the resources which exactly match the given configuration. Said differently, Procedure 2 is concerned with the feasibility of a given configuration with respect to the best responses of the players.
\\
\fi 
\end{paperonly}

\begin{appendixonly}
For ease of reading, we repeat the description of Procedures 1 and 2 of the CSP approach from the main text. %The pseudocode is provided in Algorithm~\ref{alg:dynamic}.
\end{appendixonly}

\subsubsection*{Procedure 1: Compute Domains of Dual Variables $v_{i,Y}$}  %Best Responses for Configurations}\\
Given a configuration $\mathbf{y} \equiv (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, where each $\mathbf{y}_r \in \{0, ..., w_{\max}\}^k$, we compute the set of strategies for each player $i$ that make $i$ ``happy'' under the configuration. 
To do this, abusing the notation $\pi_i$ slightly, we define and compute, for any $i \in N$ and $s_i, s_i' \in S_i$ with $s_i \neq s_i'$,
%
\begin{flalign}
%\pi_i (s_i, \mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m) 
&\pi_i (s_i, \mathbf{y}) = \sum_{r \in s_i} c_r(\mathbf{y}_r) \nonumber \\%\mylabel{eq1} \\ 
&\pi_i (s_i, \mathbf{y}, s_i') = \sum_{r \in s_i' \cap s_i} c_r(\mathbf{y}_r) + 
          \sum_{r \in s_i' \setminus s_i} c_r(\mathbf{y}_r+\mathbf{d}_i)  \nonumber \\%\mylabel{eq2} \\
% first, for those r in s_i, y_r - d_i
&BR_i (\mathbf{y}) = \{s_i \in S_i \; | \;   \forall
%xxxxxx need this? s_i \not= 
s_i' \in S_i, 
\pi_i (s_i, \mathbf{y}) \le \pi_i (s_i, \mathbf{y}, s_i') \}  \nonumber %\mylabel{eq3} %\mathbf{y}_r+\mathbf{d}_i \in \{0, ..., w_{\max}\} \text{ for each } r \in s_i' \setminus s_i\},
\end{flalign}
%Equation~(\ref{eq1}) 
The first equation calculates player $i$'s cost. %gives the cost of player $i$ when playing strategy $s_i$ under the configuration $\mathbf{y}$.
%Equation~(\ref{eq2}) 
The second calculates player $i$'s cost when deviating from $s_i$ (under $\mathbf{y}$) to $s_i'$. 
% (in which we need to 
% subtract $\mathbf{d}_i$ from $\mathbf{y}_r$ from each $r \in s_i$ and 
% increase $\mathbf{y}_r$ by $\mathbf{d}_i$ for each $r \in s_r'$). 
$BR_i(\mathbf{y})$ computed in the last equation 
%Equation~(\ref{eq3}) 
is the set of $i$'s best responses to $\mathbf{y}$. Therefore, the domain of $v_{i,Y}$ is the union of sets $\{(s_i, \mathbf{y}) \; | \; s_i \in BR_i(\mathbf{y})\}$ for all $\mathbf{y}$.
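\par
As a small illustration of Procedure 1, consider a hypothetical toy instance (chosen only for exposition): $k = 1$, two resources $R = \{1, 2\}$ with assumed cost functions $c_r(x) = x$, and a player $i$ with demand $d_i = 1$ and strategy set $S_i = \{\{1\}, \{2\}\}$. Under the configuration $\mathbf{y} = (y_1, y_2) = (1, 3)$, the equations above give
\begin{align*}
\pi_i(\{1\}, \mathbf{y}) &= c_1(1) = 1, & \pi_i(\{1\}, \mathbf{y}, \{2\}) &= c_2(3 + 1) = 4, \\
\pi_i(\{2\}, \mathbf{y}) &= c_2(3) = 3, & \pi_i(\{2\}, \mathbf{y}, \{1\}) &= c_1(1 + 1) = 2.
\end{align*}
Strategy $\{1\}$ passes the best-response test ($1 \le 4$), whereas $\{2\}$ fails it ($3 > 2$), so $BR_i(\mathbf{y}) = \{\{1\}\}$ in this toy instance, and $(\{1\}, \mathbf{y})$ is the only tuple that $\mathbf{y}$ contributes to the domain of $v_{i,Y}$.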

%Note that %here we only compute the domain of $v_{i,Y}$ and not of 
We deliberately do not compute the domain of $v_{N,Y}$ (the dual variable for the primal feasibility constraint) 
%, which consists of tuples of strategy profile $\textbf{s}$ and configuration $\mathbf{y}$ arising from $\textbf{s}$ without the need for $\mathbf{s}$ best responding to $\mathbf{y}$. As a result, the domain of $v_{N,Y}$ 
because it may contain numerous strategy profiles that are not PSNE. 
%, even though they lead to certain configurations. 
We next show in Procedure 2 how we can search for a PSNE without explicitly computing the domain of $v_{N,Y}$.
\subsubsection*{Procedure 2: Search for PSNE} %Checking Feasibility of Configurations}\ \\
Given a configuration $\mathbf{y} \equiv (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, a PSNE under it is a strategy profile 
%Given $BR_i (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$ for each player $i \in N$, we check to see if it is possible to find a strategy profile using them that sums up to $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$. That is, find 
$\mathbf{s} = (s_1, ..., s_n)$
such that (1) $(s_i, \mathbf{y})$ is in the domain of $v_{i,Y}$ for each player $i$, and (2) $\mathbf{x}_1(\mathbf{s}) = \mathbf{y}_1, \mathbf{x}_2(\mathbf{s}) = \mathbf{y}_2, ..., \mathbf{x}_m(\mathbf{s}) = \mathbf{y}_m$.
%\in BR_1 (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m) \times ... \times BR_n (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$ such that $\mathbf{x}_1(\mathbf{s}) = \mathbf{y}_1, \mathbf{x}_2(\mathbf{s}) = \mathbf{y}_2, ..., \mathbf{x}_m(\mathbf{s}) = \mathbf{y}_m$. 
The first condition enforces players' best responses to $\mathbf{y}$, while the second condition enforces the feasibility constraint. 
We get the following general result.

\iffalse
We have presented Procedures 1 and 2 at a somewhat abstract level. This opens up many potential avenues for algorithm design and implementation to compute a PSNE (if there exists one) in a $k$-DCG. In this paper, we focus on exact algorithms.  
%ic implementation. Given algorithms for Procedures 1 and 2, we can compute a PSNE in a $k$-DCG, if there exists one. 
We present the following general result.
\fi 

\begin{theorem} \mylabel{thm:exact_enum}
For any $k$-DCG, 
there is an algorithm that determines the existence of a PSNE in $\mathcal{O}((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km}))$ time. 
The algorithm is polynomial in $n$, $p$, and $w_{\max}$, when $m$ and $k$ are constants.
%\jared{(Shouldn't it be) The algorithm is polynomial in $n$, polynomial in $p$, and pseudo-polynomial in $w_{\max}$, when $m$ and $k$ are constants.}
\end{theorem}

\begin{appendixonly}
\begin{proof}
For each configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in \{0, ..., w_{\max}\}^k$, we perform Procedures 1 and 2 to check whether any PSNE is consistent with the configuration and, if so, to construct one. 

Regarding Procedure 1, for each player $i \in N$, computing $BR_i$ takes at most $\mathcal{O}(kp^2m^2)$ time for each configuration. This is because the first equation %Equation (\ref{eq1}) 
takes $\mathcal{O}(m)$ time for a given configuration and $s_i$, and the second equation %Equation (\ref{eq2}) 
takes $\mathcal{O}(km^2)$ time for a given configuration, $s_i$, and $s_i'$. 
Thus, this procedure's overall running time is $\mathcal{O}(nkp^2m^2)$ for all players. 

Procedure 2 can be done efficiently using dynamic programming, where we (1) first order the players $1, ..., n$ and (2) create a binary table $T_i(\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m) \in \{0,1\}$ for each $\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m \in  \{0, ..., w_{\max}\}^k$ of size $\mathcal{O}((w_{\max})^{km})$ for each player $i$. We first initialize $T_0(\mathbf{0}, ..., \mathbf{0}) = 1$ where we have an all zero configuration. %$\mathbf{0}$ is a vector of all zero configuration. 
We then define $T_{i}(\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m) = 1$ if and only if there is $\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m$ such that 
$T_{i-1}(\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m) = 1$ and, for some $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, 
$\mathbf{y}'_r = \overline{\mathbf{y}}_r + \mathbbm{1}[r \in s_i] \mathbf{d}_i$ for each $r \in R$. 
Table $T_i$ can be constructed by looking at all the entries of $T_{i-1}$ that equal 1 and adding the player's demand vector to the corresponding resources for each $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$. 

Because there are at most $\mathcal{O}((w_{\max})^{km})$ configurations, and each of the $n$ players has at most $p$ strategies with size at most $m$ and at most $k$ dimensions, 
the time for this procedure is at most $\mathcal{O}(nkmp(w_{\max})^{km})$. 
To verify whether a given $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$ can be achieved, one can check if $T_n(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$ is 1, in which case a corresponding PSNE can be constructed via the standard tracing back procedure of dynamic programming. 

The total time (Procedures 1 and 2) to check whether a given configuration can be realized by a PSNE is $\mathcal{O}(nkp^2m^2 + nkmp(w_{\max})^{km})$. 
Thus, to verify all configurations, the total time is $\mathcal{O}((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km}))$, which is polynomial in $n$, $p$, and $w_{\max}$, when $m$ and $k$ are constants. 

If there is such a strategy profile for some configuration, then the game has a PSNE. 
Otherwise, the game does not have any PSNE. 
The reason is that each PSNE must correspond to some configuration, and we enumerate each configuration to search for a PSNE. 
For a given configuration that corresponds to a PSNE, each player $i$'s equilibrium strategy must be in $BR_i$ because $BR_i$ contains every strategy from which player $i$ has no incentive to deviate under the configuration. 
We note that some configurations might not be feasible (e.g., some $\mathbf{y}_r$ is too small, or $\mathbf{y}_r + \mathbf{d}_i$ exceeds $w_{\max}$). The above procedure eliminates them when searching for a PSNE, thereby removing configurations that are not consistent with any strategy profile. 
\end{proof}
\end{appendixonly}

\iffalse
\begin{appendixonly}
\begin{algorithm}
\SetAlgoLined
\SetNlSkip{0.3em}
\KwIn{A multidimensional congestion game}
\KwOut{\texttt{TRUE} if a pure-strategy Nash equilibrium exists, else \texttt{FALSE}.}
    \For{ Configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in \{0, ..., w_{\max}\}^k$} {
        Create a binary table $T_0$ and set $T_0(0, ...,0) = 1$.
        \For{Every player $i \in \{1, 2, \cdots, n\}$} {
            Create a binary table $T_i$. \\
            \For {Each $\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m \in T_{i-1}$ such that $T_{i-1}(\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m) = 1$} {
                \For{$s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$} {
                    Set $T_{i}(\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m) = 1$ where $\mathbf{y}'_r = \overline{\mathbf{y}}_r + \mathbbm{1}[r \in s_i] \mathbf{d}_i$ for each $r \in R$ \\
                }
            }
        }
        \If{$\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in T_n$} {
        return \texttt{TRUE}
        }
    }
    return \texttt{FALSE}
\caption{Determine if there is a pure strategy Nash equilibrium.}
\mylabel{alg:dynamic}
\end{algorithm}
\end{appendixonly}
\fi 

\begin{paperonly}
\begin{proofsketch}
%For each configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m \in \{0, ..., w_{\max}\}^k$, we perform Procedures 1 and 2 to verify and construct if there is any PSNE that is consistent with the configuration. 
%
%Regarding Procedure 1, for each player $i \in N$, computing Equation (\ref{eq1}), Equation (\ref{eq2}), and $BR_i$ takes at most $\mathcal{O}(m)$, $\mathcal{O}(km^2)$, and $\mathcal{O}(kp^2m^2)$, respectively. 
Procedure 1 runs in $\mathcal{O}(nkp^2m^2)$ time for all players. Procedure 2 can be done efficiently using dynamic programming (DP), where we (1) first order the players $1, ..., n$ and (2) create a binary table $T_i(\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m) \in \{0,1\}$ for each $\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m \in  \{0, ..., w_{\max}\}^k$ of size $\mathcal{O}((w_{\max})^{km})$ for each player $i$. We first initialize $T_0(\mathbf{0}, ..., \mathbf{0}) = 1$ for the all-zero configuration. %$\mathbf{0}$ is a vector of all zero configuration. 
We then define $T_{i}(\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m) = 1$ if and only if there is $\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m$ such that 
$T_{i-1}(\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m) = 1$ and for some $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, 
$\mathbf{y}'_r = \overline{\mathbf{y}}_r + \mathbbm{1}[r \in s_i] \mathbf{d}_i$ for each $r \in R$. 
Table $T_i$ can be constructed by looking at all the 1 entries of $T_{i-1}$ and adding the player demand vector to the corresponding resources for each $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$. The DP runs in $\mathcal{O}(nkmp(w_{\max})^{km})$.
\end{proofsketch} 


Algorithm~\ref{alg:dynamic} presents the decision version of the DP algorithm given in the proof of Theorem~\ref{thm:exact_enum}. We refine it below for variants of $k$-DCGs.

\begin{algorithm}
\SetAlgoLined
\SetNlSkip{0.3em}
\KwIn{A multidimensional congestion game}
\KwOut{\texttt{TRUE} if a PSNE exists, \texttt{FALSE} otherwise.}
    \For{configuration $\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_m \in \{0, \cdots, w_{\max}\}^k$} {
        \For{each player $i \in \{1, 2, \cdots, n\}$} {
            Compute $BR_i (\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_m)$
        }
        Create a binary table $T_0$ with $T_0(\mathbf{0}, \cdots,\mathbf{0}) = 1$\\
        \For{each player $i \in \{1, 2, \cdots, n\}$} {
            Create a binary table $T_i$ as follows: \\
            \For{each $\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, \cdots, \overline{\mathbf{y}}_m$ such that $T_{i-1}(\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, \cdots, \overline{\mathbf{y}}_m) = 1$} {
                \For{$s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_m)$} {
                    Set $T_{i}(\mathbf{y}'_1, \mathbf{y}'_2, \cdots, \mathbf{y}'_m) = 1$ where $\mathbf{y}'_r = \overline{\mathbf{y}}_r + \mathbbm{1}[r \in s_i] \mathbf{d}_i$ for each $r \in R$ \\
                }
            }
        }
        \If{$T_n(\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_m) = 1$} {
        return \texttt{TRUE}
        }
    }
    return \texttt{FALSE}
\caption{Determine if there is a PSNE}
\mylabel{alg:dynamic}
\end{algorithm}
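To illustrate Algorithm~\ref{alg:dynamic}, consider a hypothetical toy instance (chosen only for exposition): $k = 1$, resources $R = \{1, 2\}$ with assumed costs $c_r(x) = x$, and two players with $d_1 = d_2 = 1$ and $S_1 = S_2 = \{\{1\}, \{2\}\}$. For the configuration $\mathbf{y} = (2, 0)$, strategy $\{1\}$ is not a best response for either player, since staying costs $c_1(2) = 2$ while deviating costs $c_2(0 + 1) = 1$, whereas $\{2\}$ is (staying costs $c_2(0) = 0 \le c_1(2 + 1) = 3$). Thus $BR_i(\mathbf{y}) = \{\{2\}\}$ for $i = 1, 2$, and the DP yields
\[
T_0(0,0) = 1 \;\Rightarrow\; T_1(0,1) = 1 \;\Rightarrow\; T_2(0,2) = 1,
\]
with all other entries equal to 0. Since $T_2(2,0) = 0$, this configuration is rejected. For $\mathbf{y} = (1, 1)$, both strategies are best responses for both players (staying costs $1$, deviating costs $c_r(1 + 1) = 2$), so $T_1$ has 1 entries at $(1,0)$ and $(0,1)$, and $T_2$ has 1 entries at $(2,0)$, $(1,1)$, and $(0,2)$. Because $T_2(1,1) = 1$, the algorithm returns \texttt{TRUE} in this toy instance; tracing back gives, for example, the PSNE $\mathbf{s} = (\{1\}, \{2\})$.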

\end{paperonly}


%XXXXXXXXXXXX
%which may have far-reaching implications beyond this paper


%
%Because there are at most $\mathcal{O}((w_{\max})^{km})$ configurations, and each of the $n$ players has at most $p$ strategies with size at most $m$ and at most $k$ dimensions, 
%the time for this procedure is at most $\mathcal{O}(nkmp(w_{\max})^{km})$. 
%To verify whether a given $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$ can be achieved, one can check if $T_n(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$ is 1, in which case a corresponding PSNE can be constructed via the standard tracing back procedure of dynamic programming. 

%The total time (Procedures 1 and 2) to check a given configuration can be formed as a PSNE is 
%$\mathcal{O}(nkp^2m^2 + nkmp(w_{\max})^{km})$. 
%Thus, to verify all configurations, 
%The total time is $\mathcal{O}((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km}))$.  %(details in the Appendix).%, which is polynomial in $n$, $p$, and $w_{\max}$, when $m$ and $k$ are constants. Details are in the Appendix.
\iffalse
If there is such a strategy profile for some configuration, then the game has a PSNE. 
Otherwise, the game does not have any PSNE. 
The reason is that each PSNE must correspond to some configuration, and we explore each configuration to search for a PSNE. 
For a given configuration that corresponds to a PSNE, each player $i$'s equilibrium strategy must be in $BR_i$ because $BR_i$ contains all strategies in which player $i$ does not have any incentive to deviate to other strategies from the configuration. 
We note that there are some configurations that might not be feasible (e.g., some $\mathbf{y}_r$ that are too small or $\mathbf{y}_r + \mathbf{d}_i$ outside of the $w_{\max}$). The above procedure would eliminate them when searching for a PSNE, thereby removing configurations that are not consistent with any strategy profiles. 
\fi

\begin{paperonly}
To see the significance of the above result, note that we can use well-known algorithms to solve the dual CSP. For instance, backtracking algorithms with graph-based learning can solve the dual CSP in $\mathcal{O}\left((n+1)^2 \cdot \left(2 \cdot p^n \cdot w_{\max}^{km}\right)^{n+1}\right)$ time, which is exponential in $n$ \citep[Ch.~6]{dechter2003constraint}. In contrast, the running time of our algorithm is polynomial in $n$ when $m$ and $k$ are constants, an exponential saving.
\end{paperonly}

\subsection*{Variant: Binary Demand Vectors}
%\subsection{Variant: Binary Demand Vectors}
%We next turn to a special case of $k$-DCG where the demands are $k$-dimensional binary vectors. 
In Section~\ref{sec:complexity}, we showed that deciding the existence of a PSNE is NP-complete even for $k$-DCGs with binary demand vectors. We can still apply Theorem~\ref{thm:exact_enum} to derive a pseudopolynomial-time algorithm when $k$ and $m$ are bounded. However, an improved analysis gives us the following result. Note that in the case of binary demand vectors, the $j$-th element of an aggregated demand vector is the number of players using that resource whose demand vector has its $j$-th bit ``on.'' Therefore, for clarity, we use $\check{n} = \max_{j \in [k]} \sum_{i \in N} d_{i_j}$ in place of $w_{\max}$ to denote the maximum number of players having a given demand vector bit on.
%$\max_{1 \le j \le k} |\{ i \ |\ d_{i_j} = 1 \}|$.
%$\max_{1 \le j \le k} $ \{\# of players $i$ with $d_{i_j} = 1$\}. 
The following result is particularly interesting when $\check{n} \ll n$.

\begin{theorem}
For $k$-DCGs with binary demand, 
there is an $\mathcal{O}(\check{n}^{km}(nkp^2m^2 + \min \{ nkmp\check{n}^{km}, n^{km+1} p \}))$-time algorithm to compute a PSNE or decide none exists. The algorithm is polynomial in $n$ and $p$ when $m$ and $k$ are constants. 
\mylabel{thm:exact_enum_binary}
\end{theorem}

\begin{paperonly}
\begin{proofsketch}
    %Due to the binary demand vectors, an element of the $k$-dimensional aggregated demand vector can be at most $n$. 
    Putting $w_{\max} = \check{n}$ in Theorem~\ref{thm:exact_enum}, the running time is $\mathcal{O}(\check{n}^{km}(nkp^2m^2 + nkmp \check{n}^{km}))$. However, using a different analysis that exploits the bit-vector structure, we can shave off a factor of $km$ from the second term at the expense of having $n^{km}$ instead of $\check{n}^{km}$. This is useful when $\check{n} \approx n$. The main idea is that when we consider player $i$ in Procedure 2, the number of configurations for $T_i$ is at most $(i+1)^{km}$, leading to 
    %. This leads to a running time of 
    $\mathcal{O}\left( \sum_{i=1}^{n} \left[ (i+1)^{km} + k p m i^{km} \right] \right)$ or $\mathcal{O}(n^{km+1} p )$ time for Procedure 2. %(details in the Appendix).
\end{proofsketch}
\end{paperonly}

\begin{appendixonly}
\begin{proof}
    Due to the binary demand vectors, an element of the $k$-dimensional aggregated demand vector can be at most $n$. Putting $w_{\max} = \check{n}$, as a corollary of Theorem~\ref{thm:exact_enum}, the running time of the algorithm is $\mathcal{O}(\check{n}^{km}(nkp^2m^2 + nkmp\check{n}^{km}))$. However, using the structure of the game, we can shave off a factor of $km$ from the second term, as shown below.

    Here, we focus on the running time of Procedure 2. Recall that we are given a configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$ and $BR_i (\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$ for each player $i$. We want to pick a strategy from $BR_i$ for each player $i$ so that the aggregated demand of the picked strategy profile is exactly $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$. 
    As usual, we start with an all zero configuration and define $T_0(\mathbf{0}, ..., \mathbf{0}) = 1$. We then go over the players $1, ..., n$, one at a time. Observe that when we consider player $1$, the number of configurations (or table entries) for $T_1$ is at most $2^{km}$ because the elements of the $k$-dimensional vector for each of the $m$ resources are either $0$ or $1$. In this fashion, when we consider player $i$, the number of configurations for $T_i$ is at most $(i+1)^{km}$. We initialize $T_i$ to $0$ for these configurations in $\mathcal{O}((i+1)^{km})$ time. We set $T_{i}(\mathbf{y}'_1, \mathbf{y}'_2, ..., \mathbf{y}'_m) = 1$ if and only if there is $\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m$ such that 
    $T_{i-1}(\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m) = 1$ and, for some $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, 
    $\mathbf{y}'_r = \overline{\mathbf{y}}_r + \mathbbm{1}[r \in s_i] \mathbf{d}_i$ for each $r$. Note that there are at most $i^{km}$ possibilities of $\overline{\mathbf{y}}_1, \overline{\mathbf{y}}_2, ..., \overline{\mathbf{y}}_m$ from player $i-1$. 
    
    Therefore, the running time of Procedure 2 is $\mathcal{O}\Big( \sum_{i=1}^{n} \big[ (i+1)^{km} + k p m i^{km} \big] \Big)$, which is dominated by $\mathcal{O}\big( \sum_{i=1}^{n} k p m i^{km} \big) = \mathcal{O}\big( k p m \sum_{i=1}^{n}  i^{km} \big)$. Here, $\sum_{i=1}^{n}  i^{km}$ is $\mathcal{O}\big( \frac{n^{km+1}}{km + 1}\big)$. As a result, the running time of Procedure 2 is $\mathcal{O}(n^{km+1} p )$.
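    As a sketch of the last step, write $a = km$ (a shorthand introduced only here) and treat $a$ as a constant. Since $x^{a}$ is nondecreasing on $[1, n+1]$, each term satisfies $i^{a} \le \int_{i}^{i+1} x^{a}\,dx$, so
    \[
    \sum_{i=1}^{n} i^{a} \;\le\; \int_{1}^{n+1} x^{a}\,dx \;\le\; \frac{(n+1)^{a+1}}{a+1}.
    \]
    Multiplying by $kpm = pa$ gives
    \[
    kpm \sum_{i=1}^{n} i^{a} \;\le\; p\,\frac{a}{a+1}\,(n+1)^{a+1} \;\le\; p\,(n+1)^{a+1} \;\le\; 2^{a+1}\, p\, n^{a+1},
    \]
    where the last inequality uses $n + 1 \le 2n$ for $n \ge 1$. For constant $a$, this is $\mathcal{O}(n^{km+1} p)$, and the factor $a/(a+1) < 1$ is what absorbs the $km$ term.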

    The running time of Procedure 1 is unchanged from Theorem~\ref{thm:exact_enum}. Since both Procedures 1 and 2 are run for each of the $\check{n}^{km}$ configurations, the total running time is $\mathcal{O}(\check{n}^{km}(nkp^2m^2 + \min \{ nkmp\check{n}^{km}, n^{km+1} p \}))$.
\end{proof}
\end{appendixonly}
    

\subsection*{Variant: $k$-Class Congestion Game ($k$-CCG)}
%\subsection{Variant: $k$-Class Congestion Game ($k$-CCG)}
%$k$-CCG is a special case of $k$-DCG where each player's demand vector has exactly one positive element, the rest being $0$. 
Let the \emph{class of player} $i$ be the index where the positive element appears in $\mathbf{d}_i$. 
Although Theorem~\ref{thm:exact_enum} can be directly applied to this case, we can exploit the structure of the game to improve the running time. The key intuition is that the players can be partitioned according to their classes. The players in a class $j \in [k]$ can only affect the $j$-th index of the aggregated demand on any resource. That is, they affect the $j$-th index of each of $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$. As a result, Procedure 2 can be broken into $k$ different computational tasks, each corresponding to a class. This idea leads us to the following result. \emph{Notably, compared to Theorem~\ref{thm:exact_enum}, this partition-based algorithm reduces the exponent of $w_{\max}$ in the second term from $km$ to $m$.}

\begin{theorem}
For $k$-CCGs, 
there is an $\mathcal{O}((w_{\max})^{km}(np^2m^2 + n k p m (w_{\max})^{m}))$-time algorithm to compute a PSNE or decide none exists. The algorithm is polynomial in $n$, $p$, and $w_{\max}$ when $m$ and $k$ are constants. 
\mylabel{thm:exact_enum_kclass}
\end{theorem}

\begin{paperonly}
\begin{proofsketch}
%    When we apply Procedure 1 to $k$-CCG, computing Equation (\ref{eq1}), Equation (\ref{eq2}), and $BR_i$ for each player $i$ takes $\mathcal{O}(m)$, $\mathcal{O}(m^2)$, and $\mathcal{O}(p^2m^2)$, respectively. The saving of a factor of $k$ compared to Theorem~\ref{thm:exact_enum} is due to the addition in Equation (\ref{eq2}) being basically one-dimensional as opposed to $k$-dimensional.
%Thus, 
%Procedure 1 runs in $\mathcal{O}(np^2m^2)$ time. 
%
%We now focus on Procedure 2. 
As a preprocessing step, we partition the players into $C_1, ..., C_k$ based on their classes. %Let the maximum partition size be $q = \max_{j \in [k]} |C_j|$. 
%The players in $C_j$ can only affect the $j$-th element of the aggregated demand on any resource. %Recall that in Procedure 2, we are given a configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$. For each class $j \in [k]$, we construct a vector $\mathbf{y}^j = [y_{1j}, y_{2j}, ..., y_{mj}]$ to be used by the players in $C_j$. 
We now do the following operations in each partition $C_j$ independently.
We start the DP by ordering the players in $C_j$ as $1, 2, ..., |C_j|$ (wlog). We then create a binary table $T_i(z_{1}, z_{2}, ..., z_{m}) \in \{0,1\}$ for each $z_{1}, z_{2}, ..., z_{m} \in  \{0, ..., w_{\max}\}$ of size $\mathcal{O}((w_{\max})^{m})$ for each player $i$ in $C_j$. We initialize $T_0({0}, ..., {0}) = 1$. %$\mathbf{0}$ is a vector of all zero configuration. 
We then define $T_i(z_{1}, z_{2}, ..., z_{m}) = 1$ if and only if there is $z'_{1}, z'_{2}, ..., z'_{m}$ such that 
$T_{i-1}(z'_{1}, z'_{2}, ..., z'_{m}) = 1$ and for some $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, 
$z_r = z'_r + \mathbbm{1}[r \in s_i] {d}_{ij}$ for each $r \in R$. 
%
%After the table construction in all partitions, 
We have a PSNE if and only if for each partition $C_j$, $T_{|C_j|}(y_{1j}, y_{2j}, ..., y_{mj}) = 1$. 
\end{proofsketch}
\end{paperonly}

\begin{appendixonly}
\begin{proof}
When we apply Procedure 1 to a $k$-CCG, computing $\pi_i (s_i, \mathbf{y})$, $\pi_i (s_i, \mathbf{y}, s_i')$, and $BR_i$ (using the three equations in Procedure 1) for each player $i$ takes $\mathcal{O}(m)$, $\mathcal{O}(m^2)$, and $\mathcal{O}(p^2m^2)$ time, respectively. The saving of a factor of $k$ compared to Theorem~\ref{thm:exact_enum} is due to the addition in the second equation for $\pi_i (s_i, \mathbf{y}, s_i')$ being essentially one-dimensional as opposed to $k$-dimensional.
Thus, Procedure 1 runs in $\mathcal{O}(np^2m^2)$ time. 

We now focus on Procedure 2. As a preprocessing step, we partition the players into $C_1, ..., C_k$ based on their classes. %Let the maximum partition size be $q = \max_{j \in [k]} |C_j|$. 
The players in $C_j$ can only affect the $j$-th element of the aggregated demand on any resource. Recall that in Procedure 2, we are given a configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$. For each class $j \in [k]$, we construct a vector $\mathbf{y}^j = [y_{1j}, y_{2j}, ..., y_{mj}]$ to be used by the players in $C_j$. We now do the following operations in each partition $C_j$ independently.

We start the DP by ordering the players in $C_j$ as $1, 2, ..., |C_j|$ (wlog). We then create a binary table $T_i(z_{1}, z_{2}, ..., z_{m}) \in \{0,1\}$ for each $z_{1}, z_{2}, ..., z_{m} \in  \{0, ..., w_{\max}\}$ of size $\mathcal{O}((w_{\max})^{m})$ for each player $i$ in $C_j$. We initialize $T_0({0}, ..., {0}) = 1$. %$\mathbf{0}$ is a vector of all zero configuration. 
We then define $T_i(z_{1}, z_{2}, ..., z_{m}) = 1$ if and only if there is $z'_{1}, z'_{2}, ..., z'_{m}$ such that 
$T_{i-1}(z'_{1}, z'_{2}, ..., z'_{m}) = 1$ and for some $s_i \in BR_i(\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m)$, 
$z_r = z'_r + \mathbbm{1}[r \in s_i] {d}_{ij}$ for each $r \in R$. 

Once we finish the table construction in all partitions, we have a PSNE if and only if for each partition $C_j$, $T_{|C_j|}(\mathbf{y}^j) = 1$. The argument is similar to the proof of Theorem~\ref{thm:exact_enum}, only that we have to collate $m$-dimensional vectors $\mathbf{y}^j$ for each class $j \in [k]$ to form the given configuration $\mathbf{y}_1, \mathbf{y}_2, ..., \mathbf{y}_m$.
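As a hypothetical illustration of this collation (values chosen only for exposition), take $k = 2$ and $m = 2$ with the configuration $\mathbf{y}_1 = (1, 2)$ and $\mathbf{y}_2 = (0, 1)$. Reading off the $j$-th coordinate of every resource gives
\[
\mathbf{y}^1 = [y_{11}, y_{21}] = [1, 0], \qquad \mathbf{y}^2 = [y_{12}, y_{22}] = [2, 1],
\]
so the configuration is realized by best responses exactly when $T_{|C_1|}(1, 0) = 1$ in the table for class 1 and $T_{|C_2|}(2, 1) = 1$ in the table for class 2.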

Here, the running time of Procedure 2 is $\mathcal{O} (n k p m (w_{\max})^{m} )$. This is because in each of the $k$ partitions, we do $\mathcal{O} (n p m (w_{\max})^{m} )$ work due to at most $n$ players in that partition, at most $p$ best-response strategies for each player, $m$ resources, and $(w_{\max})^{m}$ table entries. 

Since both Procedures 1 and 2 are performed for each of the $(w_{\max})^{km}$ configurations, the total running time of the algorithm is $\mathcal{O}((w_{\max})^{km}(np^2m^2 + n k p m (w_{\max})^{m}))$. Note that, compared to Theorem~\ref{thm:exact_enum}, the exponent of $w_{\max}$ in the second term drops from $km$ to $m$. 
\end{proof}
\end{appendixonly}

\begin{paperonly}
We next consider the special case of $k$-CCGs with binary demand vectors (i.e., exactly one bit is ``on'' in each player's demand vector). This will be useful when we consider player types next. We get the following corollary from Theorems~\ref{thm:exact_enum_binary} and \ref{thm:exact_enum_kclass}.
% For simplicity, we omit the alternative analysis given in the proof of Theorem~\ref{thm:exact_enum_binary}.
Once again, the result is interesting when $\check{n} \ll n$.
\end{paperonly}

\begin{corollary}
For $k$-CCGs with binary demand vectors, 
there is an $\mathcal{O}((\check{n})^{km}(np^2m^2 + n k p m (\check{n})^{m}))$-time algorithm to compute a PSNE or decide none exists. The algorithm is polynomial in $n$ and $p$ when $m$ and $k$ are constants. 
\mylabel{cor:exact_enum_unit}
\end{corollary}

\begin{appendixonly}
\begin{proof}
With binary demand vectors, each element of an aggregated demand vector is at most $\check{n}$, so we can put $w_{\max} = \check{n}$ in Theorem~\ref{thm:exact_enum_kclass}. Procedure 1 is unchanged and runs in $\mathcal{O}(np^2m^2)$ time. Procedure 2 runs in $\mathcal{O} (n k p m (\check{n})^{m} )$ time, because in each of the $k$ partitions we do $\mathcal{O} (n p m (\check{n})^{m} )$ work due to at most $n$ players in that partition, at most $p$ best-response strategies for each player, $m$ resources, and $(\check{n})^{m}$ table entries. 

Since both Procedures 1 and 2 are performed for each of the $(\check{n})^{km}$ configurations, the total running time of the algorithm is $\mathcal{O}((\check{n})^{km}(np^2m^2 + n k p m (\check{n})^{m}))$.
\end{proof}
\end{appendixonly}

\begin{paperonly}
\subsection*{Variant: $k$-DCG with Player Types}
%\subsection{Variant: $k$-DCG with Player Types}
To motivate this variant, consider %a congestion game in 
a road-traffic setting. There are different types of vehicles, and vehicles of the same type share similarities in their demand vectors. We define  players to be of the same type if their demand vectors are the same. Although this setting is very natural, to our knowledge, it has not been fully explored in the literature. 
Here, other than player types, we do not make any assumptions about the demands  or cost functions. %Given a $k$-DCG instance, we can partition the players into types in $\mathcal{O}(\tau n k)$ time, where $\tau$ is the number of types. 
While this variant is NP-hard (reduction from $k$-DCG by making a type for each player), 
the following result 
%shows that our configuration-space framework 
is very appealing when the maximum number of players of any type, $\check{n}$, is much smaller than $n$.
\end{paperonly}

\begin{paperonly}
\begin{figure*}
\begin{minipage}[c]{0.48\linewidth}
\includegraphics[width=0.9\linewidth]{charts/asymptoticm4k2.pdf}
\caption{Running-time comparison among table-based DP (TDP), set-based DP (SDP), and table-based DP asymptotic (TDPA). 
%This is extremely encouraging, 
Encouragingly, even at small scales, TDPA hugely overestimates the actual running time. Here, $m = 4$ and $k = 2$.}
\mylabel{fig:asymp}
\end{minipage}
\hfill
\begin{minipage}[c]{0.48\linewidth}
\includegraphics[width=0.9\linewidth]{charts/regularm2k3.pdf}
\caption{Running-time comparison among brute force (BF), table-based DP (TDP), and set-based DP (SDP). Even at small scales, brute force does not finish within the allocated time when $n > 20$. SDP is the fastest. Here, $m = 2$ and $k = 3$.}
\mylabel{fig:bf}
\end{minipage}%
\end{figure*}
\end{paperonly}

\begin{paperonly}
\begin{theorem}
Given a $k$-DCG with $\tau$ types of players and at most $\check{n}$ players of any type, there is an $\mathcal{O}((\check{n})^{\tau m}(np^2m^2 + n \tau p m (\check{n})^{m}) + \tau n k)$-time algorithm to compute a PSNE or decide that none exists. The algorithm is polynomial in $n$ and $p$ for bounded $m$ and $\tau$. 
\mylabel{thm:exact_enum_type}
\end{theorem}
\begin{proof}
    Let $(N, R, \{S_i,$ $\mathbf{d}_i\}_{i \in N}, \{c_r\}_{r \in R}, k)$ be a $k$-DCG instance with $\tau$ types of players. 
    We reduce this instance to a PSNE-equivalent $\tau$-DCG instance $(N, R, \{S_i,$ $\mathbf{\widetilde{d}}_i\}_{i \in N}, \{\widetilde{c}_r\}_{r \in R}, \tau)$ as follows. First, we partition the $k$-DCG players into $\tau$ types 
    %, calculate the number of $k$-DCG players $\#n_t$ of each type $t$, 
    and store the $k$-dimensional demand vector (from the $k$-DCG) of any player of type $t$ into $\overline{\mathbf{d}}_t$ (i.e., $\overline{\mathbf{d}}_t = \mathbf{d}_i$ if player $i$ is of type $t$). This takes $\mathcal{O}(\tau n k)$ time. For each player $i$ of type $t$ in the $\tau$-DCG, we define a $\tau$-dimensional unit demand vector $\mathbf{\widetilde{d}}_i$ where only the $t$-th element is 1, the rest being 0. Given a $\tau$-dimensional aggregated demand vector $\mathbf{\widetilde{x}}_r(\mathbf{s}) = ( (\mathbf{\widetilde{x}}_r(\mathbf{s}))_1, ..., (\mathbf{\widetilde{x}}_r(\mathbf{s}))_\tau )$, where the $t$-th element is the total number of players of type $t$ using $r$, we define the cost function $\widetilde{c}_r(\mathbf{\widetilde{x}}_r(\mathbf{s})) =  c_r \left(\sum_{t = 1}^\tau (\mathbf{\widetilde{x}}_r(\mathbf{s}))_t\overline{\mathbf{d}}_t\right)$. %Under this definition, 
    Thus, 
    $\widetilde{c}_r(\mathbf{\widetilde{x}}_r(\mathbf{s})) = c_r(\mathbf{x}_r(\mathbf{s}))$, where $\mathbf{x}_r(\mathbf{s})$ is the %$k$-dimensional
    aggregated demand in the $k$-DCG instance under $\mathbf{s}$. Therefore, with the PSNE-equivalent $\tau$-DCG being a $\tau$-CCG with binary demands and $\check{n} = \max_{j \in [\tau]} \sum_{i \in N} \widetilde{d}_{i_j}$,
    %\footnote{The definition of $\check{n}$ is mathematically the same as that for binary demand vectors. However, here it means maximum number of players of any type.} 
    Corollary~\ref{cor:exact_enum_unit} gives us the result.
\end{proof}
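As a hypothetical illustration of the reduction (values chosen only for exposition), let $k = 2$ and $\tau = 2$ with type demand vectors $\overline{\mathbf{d}}_1 = (2, 1)$ and $\overline{\mathbf{d}}_2 = (0, 3)$. If, under $\mathbf{s}$, resource $r$ is used by three players of type 1 and one player of type 2, then $\mathbf{\widetilde{x}}_r(\mathbf{s}) = (3, 1)$ and
\[
\widetilde{c}_r\big((3, 1)\big) = c_r\big(3 \cdot (2, 1) + 1 \cdot (0, 3)\big) = c_r\big((6, 6)\big),
\]
which is the cost at the aggregated demand $\mathbf{x}_r(\mathbf{s}) = (6, 6)$ of the original $k$-DCG. The table for resource $r$ is therefore indexed by type counts in $\{0, ..., \check{n}\}^{\tau}$ rather than by demand values in $\{0, ..., w_{\max}\}^{k}$.
\par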
%
Comparing Theorems \ref{thm:exact_enum_type}  and \ref{thm:exact_enum}, when we have the type information, 
%The demand vectors and cost functions (potentially non-monotonic) are general in both cases. 
Theorem~\ref{thm:exact_enum_type} offers a major saving in running time by replacing $(w_{\max})^{km}$ with $\check{n}^{\tau m}$ in the multiplicative factor as well as  $(w_{\max})^{km}$ with $\check{n}^{m}$ (note the exponential saving of $k$) in the interior expression. These savings are especially pronounced when $\check{n}$  
%(i.e., the maximum number of players of any type) 
is small.

Theorem~\ref{thm:exact_enum_type} can be extended to general $k$-DCGs \emph{without} any player types, in which case $\check{n} = n$. This insight helps us avoid potentially large $w_{\max} \gg n$ in the running time of Theorem~\ref{thm:exact_enum} by using Theorem~\ref{thm:exact_enum_type} instead. Further running time reduction for the case of $\check{n} = n$ is possible through the alternative analysis given in the proof of Theorem~\ref{thm:exact_enum_binary}.
%Evidently, $w_{\max}$ can be arbitrarily large, while $\check{n}$ is the maximum number of players of any type.
\end{paperonly}

% \begin{figure}[htp]
%     \centering
%     \includegraphics[width=0.9\columnwidth]{charts/asymptoticm4k2.pdf}
%     \caption{Running-time comparison among table-based DP (TDP), set-based DP (SDP), and asymptotic table-based DP (TDPA). This is extremely encouraging, showing that even at small scales, the asymptotic running time hugely overestimates the practical running time. Here, $m = 4$ and $k = 2$.}
%     \mylabel{fig:asymp}
% \end{figure}

% \begin{figure}[htp]
%     \centering
%     \includegraphics[width=0.9\columnwidth]{charts/regularm2k3.pdf}
%     \caption{Running-time comparison among brute force (BF), table-based DP (TDP), and set-based DP (SDP). Even at small scales, brute force does not finish within the allocated time when $n > 20$. SDP is the fastest. Here, $m = 2$ and $k = 3$.}
%     \mylabel{fig:bf}
% \end{figure}

\subsection*{Experiments}
\begin{paperonly}
We have performed experiments to investigate the practical aspects of the CSP framework for non-monotonic $k$-DCGs with binary demand vectors.
%\footnote{Code, unit tests, and data are in the supplementary material.}
Even with small-scale experiments, we show that the theoretical running time greatly overestimates the practical, worst-case running time. These experiments further show that our CSP framework supports a variety of implementation possibilities. 
%provides many promising directions for implementation.

We have implemented two instantiations of the framework: (1) Table-based DP (TDP), where we use bit vectors to implement the tables, and (2) Set-based DP (SDP), where we use hash-set data structures to represent the tables. In addition, we have implemented the brute-force (BF) algorithm mentioned for the CSP shown in Fig.~\ref{fig:csp}(a). BF is the only prior algorithm known to us for general $k$-DCGs. 

All three algorithms exhaustively search for all PSNE and %are programmed to 
discard a strategy profile as soon as it is clear it cannot lead to a PSNE.
% We also estimate the asymptotic running time of table-based DP by implementing the DP such that it adheres to the asymptotic running time.
We have benchmarked the theoretical running time in the worst case 
by running Procedure 2 on a small table and extrapolating that running time to the table size appearing in Theorem~\ref{thm:exact_enum}. We call this table-based DP asymptotic (TDPA).
%The algorithms were executed on 
We have used non-monotonic $k$-DCGs with $m$ parallel links.
%and varied $n$ for 15 repetitions %
Each parameter combination was repeated 15 times. See the Appendix for details.

Fig.~\ref{fig:asymp} shows that the asymptotic running time greatly overestimates the actual running time. For instance, at $n = 4$, TDPA is about eight orders of magnitude slower than SDP. 
%The only improvements we made in SDP and TDP over TDPA are using efficient data structures and discarding strategy profiles early. 
Furthermore, Fig.~\ref{fig:bf} shows that SDP and TDP outperform BF easily, even for very small $n$. %Note that all three algorithms are programmed to discard strategy profiles early.
For example, SDP is two orders of magnitude faster than BF for $n = 18$. These results underscore the practical appeal of our CSP framework against the backdrop of hardness results. 
Most importantly, Procedure 2 opens up a range of possibilities for new CSP-based search algorithms rooted in, for example, backjumping and learning \citep{kumar1992algorithms,dechter2003constraint,van2006backtracking,rossi2008constraint}, backtracking with tree decomposition \citep{jegou2003hybrid}, AND/OR search \citep{marinescu2009and}, etc. We leave a comprehensive experimental study as future work.

\end{paperonly}
\begin{appendixonly}
The algorithm given in Theorem~\ref{thm:exact_enum} is theoretically efficient under some assumptions. Here, we show that it is also efficient in practice.
To our knowledge, brute force is the only other algorithm guaranteed to work on the games of interest: multidimensional congestion games with non-monotonic cost functions.
First, we compare two implementations of our algorithm against brute force.
Our algorithm overtakes brute force at a relatively small value of $n$.
Second, we compare the implementations against the simulated worst-case complexity of the algorithm: $\mathcal{O}((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km}))$.
This shows that, in practice, our algorithm greatly outperforms its asymptotic behavior.

All algorithms were implemented in Python. 
Source code and data can be found in the supplementary material.
Results were obtained on a Linux machine with an Intel\textregistered\: Xeon\textregistered\: E3-1225 @ 3.1 GHz and 24GB of RAM.

\subsubsection*{Game Generation}
The evaluation was done on a $k$-dimensional parallel link model with $m$ links or resources.  %(Figure \ref{fig:parallel_link}).
Every player chooses one link $r \in R$ from the set of all $m$ links.
Each link $r \in R$ had a non-monotonic cost function of $c_r(\mathbf{x}_r(\textbf{s})) = \alpha_r f_r(\mathbf{x}_r(\textbf{s})) + \beta_r$,
where $\alpha_r$ and $\beta_r$ are integers drawn uniformly at random from the range $[0, 100]$.
The non-monotonic component is $f_r(\mathbf{x}_r(\textbf{s})) = f_r^1(\mathbf{x}_r^1(\textbf{s})) + f_r^2(\mathbf{x}_r^2(\textbf{s})) + \cdots + f_r^k(\mathbf{x}_r^k(\textbf{s}))$, where $\mathbf{x}_r^j(\textbf{s})$ is the aggregate demand in the $j$th dimension and $f_r^j$ is the cost of the aggregate demand in the $j$th dimension.
The value of $f_r^j$ at any given input is an integer drawn uniformly at random from the range $[0, 100]$.
Every element $d_{ij}$ of the demand vector was an integer drawn uniformly at random from the range $[0, q]$.
If every element of a demand vector was 0, then the entire demand vector was discarded and randomly generated again.
For each combination of parameters ($m, k, q$), 15 games were randomly generated and then $n$ players were randomly generated, all using the master seed 2024.
%For example, the 3rd game where $m = 4, k = 2, q = 5, n = 8$ is identical to the 3rd game where $m = 4, k = 2, q = 5, n = 9$ except for the 9th player.

\subsubsection*{Methods}
The dynamic program was implemented in two ways.
The first method is as described in Section~\ref{sec:general}.
The second method exploits the sparsity of 1's in the binary table by replacing the binary table with a hash set.
Both implementations include the optimization that, if a single player is found to have no best response for a configuration (Procedure 1, Section~\ref{sec:general}), the algorithm stops computations on that configuration.
Likewise, the brute-force implementation stops computations on a strategy profile as soon as a single player is found who is willing to deviate from it.
For each $n$, the time to enumerate all configurations or strategy profiles (respectively) was measured and averaged across the 15 games.
The binary table dynamic program was also constrained by memory, so that if a level of the binary table consumed more than 1 GB of memory for a single game then execution for that parameter combination would be halted.


In order to chart the asymptotic behavior of the binary table dynamic program, we had to ensure that it ran at its big-O speed and not faster.
First, all mentioned optimizations were removed.
Second, because the asymptotic behavior of the algorithm in Theorem~\ref{thm:exact_enum} is based on the size of the binary table, all bits of the binary table were set to 1.
To approximate the speed of the asymptotic binary table algorithm at a large $n$, the average time to check if a configuration contains a PSNE was measured separately for Procedure 1 ($z_1$) and Procedure 2 ($z_2$).
This was done because of memory and time constraints related to binary table size, which only affected Procedure 2.
The binary table size was forced to 1000 for each $n$.
The asymptotic runtime was then approximated by combining these averages as $(nq)^{km} (z_1 + \frac{z_2 (nq + 1)^{km}}{1000})$.
\end{appendixonly}


\section{Learning Dynamics Approach}
\mylabel{sec:learning}
\begin{paperonly}
The second class of algorithms we present is grounded in learning dynamics, which often presents a natural way of studying how players arrive at an equilibrium point \citep{fudenberg1998theory}. As such, learning dynamics is prominently featured in a wide range of areas from evolutionary game theory \citep{weibull1997evolutionary}, to wireless networks \citep{lasaulce2011game}, to our topic of congestion games \citep{shah2010dynamics}. 
In general, learning algorithms may not converge, which leads us to two threads. 

First, we consider linear and exponential cost functions with convergence guarantees \citep{klimm_equilibria_2022}. We derive explicit running times for $k$-DCGs and their variants for these cost functions. Second, we consider the general (potentially non-monotonic) cost functions with no convergence guarantees. We present approximation algorithms for this general case.

\end{paperonly}

\begin{appendixonly}
\begin{appendixdefinition}[$\mathbf{w}$-potential game \citep{potential_games}]
Given a vector of positive numbers $\mathbf{w} = (w_i)_{i \in N} > 0$, a game is called a $\mathbf{w}$-potential game if it admits function $P: S \rightarrow \mathbb{R}$ such that for every $i \in N$ and for every $\mathbf{s}_{-i}$, the following holds for any $s_i$ and $s_i'$.
$$
\pi_i(s_i, s_{-i}) - \pi_i(s'_i, s_{-i}) = w_i \cdot (P(s_i, s_{-i}) - P(s'_i, s_{-i}) ).
$$
Here, $P$ is called a $\mathbf{w}$-potential function. 
\end{appendixdefinition}
\end{appendixonly}


\subsection*{Linear Cost Functions}
\begin{paperonly}
We study an iterative \emph{best-response algorithm}, where players start with an arbitrary strategy profile and iteratively play best responses until convergence to a PSNE, for $k$-DCGs and their variants using 
potential functions. %This algorithm starts with an arbitrary strategy profile. As long as some player can improve their cost, their best response is updated. 
\end{paperonly}
Let the linear cost function of any resource $r$ under a strategy profile $\mathbf{s}$ be 
$c_r(\mathbf{x}_r(\mathbf{s})) \equiv a_r \sum_{j \in [k]} {z}_{j} \mathbf{x}_{r,j}(\mathbf{s}) + b_r = a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r$, 
where $a_r, b_r \ge 0$ $\forall r$ and the $k$-dimensional vector $\mathbf{z} \ge 0$. 

\begin{paperonly}
    We have the following results on $k$-DCGs and their variants.
    Notably, \cite{klimm_equilibria_2022} provide an alternative proof of the existence of a potential function for this class of congestion games via the isomorphism technique. However, their proof is focused on existence and leaves open computational questions, especially for variants of $k$-DCGs, which we address here. 
    The complete proofs are in the Appendix.
\end{paperonly}

\begin{appendixonly}
Basically, resource $r$'s cost function is a weighted sum of the $k$ elements of the aggregated demand vector $\mathbf{x}_r(\mathbf{s})$ with resource-specific multiplicative and additive terms $a_r$ and $b_r$, respectively.

%xxxssx add for journal
%We use the vector dot product notation with added brackets (e.g., $[\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ]$) for the clarity of presentation.

We study an iterative \emph{best-response algorithm}, where players iteratively play best responses until convergence to a PSNE, for several variants of $k$-DCGs based on bounding a potential function. This algorithm starts with an arbitrary strategy profile. As long as some player can improve their cost, their best response is updated. 

The next theorem presents a potential function for linear cost. Notably, \cite{klimm_equilibria_2022} provide an alternative proof of this theorem via the isomorphism technique, but their proof is focused on existence and leaves open computational questions, which we address here. 
%That is, does not readily provide the potential function needed for designing and analyzing algorithms, especially for variants of $k$-DCGs.
%, for which it is necessary to connect to the one-dimensional case \citep{harks2011characterizing}. 
%Furthermore, we generalize the potential function for one-dimensional weighted congestion games \citep{fotakis_selfish_2005,panagopoulou_algorithms_2007}.  


\begin{appendixtheorem}% \citep{klimm_equilibria_2022}
    Any multidimensional congestion game with linear resource costs is a $\mathbf{w}$-potential game.
\mylabel{thm:linear_mdcg_potential}
\end{appendixtheorem}

% Proof going to appendix.
% \begin{proofsketch}
% The proof is very long and can be found in the Appendix. 
% We show that $\Phi(\mathbf{s})$ is a $\mathbf{w}$-potential function for $w_i = \frac{1}{2 [\mathbf{z} \cdot \mathbf{d}_i]}$ for all $i$ where 
% $\Phi(\mathbf{s}) = \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s})) [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + \sum_{i \in N} \sum_{r \in s_i} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i].$
% \end{proofsketch}

\begin{proof}
    The proof follows the same line of argument as the single-dimensional case \citep{fotakis_selfish_2005}. Here, the main task is to devise a potential function when the demands are vectors instead of scalars. %We perform a dot product to mitigate this.

\stepcounter{equation}
\stepcounter{equation}
We show that the $\Phi(\mathbf{s})$ defined below is a $\mathbf{w}$-potential function for the choice of $w_i = \frac{1}{2 [\mathbf{z} \cdot \mathbf{d}_i]}$ for each $i$.
\begin{align}
&\Phi_1(\mathbf{s}) = \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s})) [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})].\nonumber \\ 
&\Phi_2(\mathbf{s}) = \sum_{i \in N} \sum_{r \in s_i} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i].\nonumber \\ 
&\Phi(\mathbf{s}) = \Phi_1(\mathbf{s}) + \Phi_2(\mathbf{s}). \label{eq:potential_func}
\end{align}

Consider any set of resources $s_i' \neq s_i$. Define $\mathbf{s}' = (s_{-i}, s_i')$. For any resource $r$ that is picked either by both $s_i$ and $s_i'$ or by neither of them: 
\begin{flalign}
& c_r(\mathbf{x}_r(\mathbf{s})) = c_r(\mathbf{x}_r(\mathbf{s'})). \label{eqn:cr_equal_linear}\\
& c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) = c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}).
\end{flalign}
%
For any $r \in s_i \setminus s_i'$: 
\begin{align*}
& c_r(\mathbf{x}_r(\mathbf{s})) - c_r(\mathbf{x}_r(\mathbf{s'})) & \\
&= a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r - a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})] - b_r &\\
            & = a_r [\mathbf{z} \cdot (\mathbf{x}_r(\mathbf{s}) - \mathbf{x}_r(\mathbf{s'}))] &\\
            & = a_r [\mathbf{z} \cdot \mathbf{d}_i] \text{, and}&
\end{align*}
%
%
\begin{flalign*}
& c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) -  c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}) & \nonumber \\
    & = (a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r) \mathbf{x}_r(\mathbf{s}) -  (a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})] + b_r) \mathbf{x}_r(\mathbf{s'}) & \nonumber \\
    & = (a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r) \mathbf{x}_r(\mathbf{s})  &\\
        & \quad - (a_r [\mathbf{z} \cdot ( \mathbf{x}_r(\mathbf{s}) - \mathbf{d}_i)] + b_r) ( \mathbf{x}_r(\mathbf{s}) - \mathbf{d}_i) & \nonumber \\
    & = a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] \mathbf{x}_r(\mathbf{s}) + b_r \mathbf{x}_r(\mathbf{s}) &\\
    & \quad -  a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{x}_r(\mathbf{s}) + a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{x}_r(\mathbf{s}) - b_r \mathbf{x}_r(\mathbf{s}) &\\
    & \quad + a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{d}_i - a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{d}_i + b_r \mathbf{d}_i &\\
    %
    & = a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{x}_r(\mathbf{s})  + a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{d}_i - a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{d}_i + b_r \mathbf{d}_i.
\end{flalign*}
%
Similarly, for any resource $r \in s_i' \setminus s_i$, 
\begin{flalign*}
    &c_r(\mathbf{x}_r(\mathbf{s})) - c_r(\mathbf{x}_r(\mathbf{s'})) =  
    %-c_r(\mathbf{d}_i) &\\
    %        & = -\left(\sum_{j \in [k]} a_{r,j} \mathbf{d}_{i,j} + b_r\right) 
    -a_r [\mathbf{z} \cdot \mathbf{d}_i] \text{, and} &
\end{flalign*}
%
\begin{flalign*}
& c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) -  c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}) & \nonumber \\
    & = (a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r) \mathbf{x}_r(\mathbf{s}) -  (a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})] + b_r) \mathbf{x}_r(\mathbf{s'}) & \nonumber \\
    & = (a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r) \mathbf{x}_r(\mathbf{s}) &\\
    & \quad - (a_r [\mathbf{z} \cdot ( \mathbf{x}_r(\mathbf{s}) + \mathbf{d}_i)] + b_r) ( \mathbf{x}_r(\mathbf{s}) + \mathbf{d}_i) & \nonumber \\
    & = a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] \mathbf{x}_r(\mathbf{s}) + b_r \mathbf{x}_r(\mathbf{s}) &\\
    & \quad -  a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{x}_r(\mathbf{s}) - a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{x}_r(\mathbf{s}) - b_r \mathbf{x}_r(\mathbf{s}) &\\
    & \quad - a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{d}_i - a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{d}_i - b_r \mathbf{d}_i &\\
    %
    & = -a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{x}_r(\mathbf{s})  - a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{d}_i - a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{d}_i - b_r \mathbf{d}_i.
\end{flalign*}
%
%Using Eqn~\ref{eq:potential_subfunc}, 
%Using Eqn~\ref{eq:cr_diff} and \ref{eq:cr_diff_rev}, 
The difference in the $\Phi_1$ function under $\mathbf{s}$ and $\mathbf{s}'$ is
\begin{flalign*}
    &\Phi_1(\mathbf{s}) - \Phi_1(\mathbf{s}') &\\
    & = \sum_{r \in R} \Big( c_r(\mathbf{x}_r(\mathbf{s})) [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] -  c_r(\mathbf{x}_r(\mathbf{s'})) [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})] \Big) &\\
            & = \mathbf{z} \cdot \sum_{r \in R} \Big( c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) -  c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}) \Big) &\\
           & = \mathbf{z} \cdot \sum_{r \in s_i \setminus s_i'}  \Big( c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) -  c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}) \Big) + &\\
           & \quad \quad \mathbf{z} \cdot \sum_{r \in s_i' \setminus s_i} \Big( c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) - c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}) \Big) &\\
           %
           & = \mathbf{z} \cdot \sum_{r \in s_i \setminus s_i'} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{x}_r(\mathbf{s})  + a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{d}_i - &\\
           %& \quad \quad \quad \quad 
           & \quad \quad \quad \quad a_r [\mathbf{z} \cdot \mathbf{d}_i]  \mathbf{d}_i + b_r  \mathbf{d}_i \Big) - &\\
           & \quad \quad \mathbf{z} \cdot \sum_{r \in s_i' \setminus s_i} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{x}_r(\mathbf{s})  + a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] \mathbf{d}_i + &\\
           %& \quad \quad \quad \quad 
           & \quad \quad \quad \quad a_r [\mathbf{z} \cdot \mathbf{d}_i] \mathbf{d}_i + b_r  \mathbf{d}_i \Big) &\\
           %
           & = \sum_{r \in s_i \setminus s_i'}\Big( a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})]  + a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] [\mathbf{z} \cdot \mathbf{d}_i] - &\\
           & \quad \quad \quad \quad a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{d}_i] + b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big) - &\\
           & \quad \sum_{r \in s_i' \setminus s_i} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})]  + a_r [\mathbf{z} \cdot  \mathbf{x}_r(\mathbf{s}) ] [\mathbf{z} \cdot \mathbf{d}_i] + &\\
           & \quad \quad \quad \quad a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{d}_i] + b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big) &\\
           %
           & = \sum_{r \in s_i \setminus s_i'}\Big( 2 a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] - a_r [\mathbf{z} \cdot \mathbf{d}_i]^2 + b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big) - &\\
           & \quad \sum_{r \in s_i' \setminus s_i} \Big( 2 a_r  [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})]  + a_r [\mathbf{z} \cdot \mathbf{d}_i]^2 + b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big). &       
\end{flalign*}
%
The difference in the $\Phi_2$ function under $\mathbf{s}$ and $\mathbf{s}'$ is
\begin{align*}
    &\Phi_2(\mathbf{s}) - \Phi_2(\mathbf{s}') &\\
    & = \sum_{l \in N} \sum_{r \in s_l} c_r(\mathbf{d}_l) [\mathbf{z} \cdot \mathbf{d}_l] - \sum_{l \in N} \sum_{r \in s_l'} c_r(\mathbf{d}_l) [\mathbf{z} \cdot \mathbf{d}_l] &\\
            & = \sum_{r \in s_i} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i] - \sum_{r \in s_i'} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i] &\\
            & \quad \quad \quad \text{ [because only $i$'s strategy changed between $\mathbf{s}$ and $\mathbf{s}'$]} &\\
            & = \sum_{r \in s_i \setminus s_i'} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i] - \sum_{r \in s_i' \setminus s_i} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i] &\\
            & = \sum_{r \in s_i \setminus s_i'}  
            \Big(a_r [\mathbf{z} \cdot \mathbf{d}_i] + b_r \Big) [\mathbf{z} \cdot \mathbf{d}_i] &\\
            & \quad -  \sum_{r \in s_i' \setminus s_i} \Big( 
            a_r [\mathbf{z} \cdot \mathbf{d}_i] + b_r \Big) [\mathbf{z} \cdot \mathbf{d}_i] &\\
            & = \sum_{r \in s_i \setminus s_i'} \Big( a_r 
            [\mathbf{z} \cdot \mathbf{d}_i]^2 + b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big) &\\
            & \quad - \sum_{r \in s_i' \setminus s_i} \Big( 
            a_r [\mathbf{z} \cdot \mathbf{d}_i]^2 + b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big).&
\end{align*}
%
Combining the differences in $\Phi_1$ and $\Phi_2$, we obtain the following difference in the proposed potential function.
\begin{flalign*}
    &\Phi(\mathbf{s}) - \Phi(\mathbf{s}') &\\
    &=  \Phi_1(\mathbf{s}) - \Phi_1(\mathbf{s}') +  \Phi_2(\mathbf{s}) - \Phi_2(\mathbf{s}') &\\
            & = \sum_{r \in s_i \setminus s_i'}\Big( 2 a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + 2 b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big) - &\\
           & \quad \sum_{r \in s_i' \setminus s_i} \Big( 2 a_r [\mathbf{z} \cdot \mathbf{d}_i] [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})]  + 2 a_r [\mathbf{z} \cdot \mathbf{d}_i]^2 + 2 b_r [\mathbf{z} \cdot \mathbf{d}_i] \Big) &\\
           %
           & = \sum_{r \in s_i \setminus s_i'} 2[\mathbf{z} \cdot \mathbf{d}_i] \Big(a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r \Big) - &\\
           & \quad \quad \sum_{r \in s_i' \setminus s_i} 2 [\mathbf{z} \cdot \mathbf{d}_i] \Big(a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + a_r [ \mathbf{z} \cdot \mathbf{d}_i]  +  b_r \Big) &\\   %
           & = \sum_{r \in s_i \setminus s_i'} 2[\mathbf{z} \cdot \mathbf{d}_i] \Big(a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] + b_r \Big) - &\\
           & \quad \quad \sum_{r \in s_i' \setminus s_i} 2 [\mathbf{z} \cdot \mathbf{d}_i] \Big(a_r \Big[\mathbf{z} \cdot (\mathbf{x}_r(\mathbf{s}) + \mathbf{d}_i) \Big]  +  b_r \Big) &\\   %
           & = \sum_{r \in s_i \setminus s_i'} 2[\mathbf{z} \cdot \mathbf{d}_i] c_r(\mathbf{x}_r(\mathbf{s})) -
           \sum_{r \in s_i' \setminus s_i} 2 [\mathbf{z} \cdot \mathbf{d}_i] c_r(\mathbf{x}_r(\mathbf{s'})) &\\      
           & = 2[\mathbf{z} \cdot \mathbf{d}_i] \Big( \sum_{r \in s_i \setminus s_i'}  c_r(\mathbf{x}_r(\mathbf{s})) -
           \sum_{r \in s_i' \setminus s_i}  c_r(\mathbf{x}_r(\mathbf{s'})) \Big). & 
\end{flalign*}

%
The difference between player $i$'s costs under $\mathbf{s}$ and $\mathbf{s}'$ is
\begin{align*}
\label{eq:utility_diff}
&\pi_i(\mathbf{s}) - \pi_i(\mathbf{s'}) &\\
    & = \sum_{r \in s_i} c_r(\mathbf{x}_r(\mathbf{s})) - \sum_{r \in s_i'} c_r(\mathbf{x}_r(\mathbf{s'}))& \nonumber\\
& = \sum_{r \in s_i \setminus s_i'} c_r(\mathbf{x}_r(\mathbf{s})) - \sum_{r \in s_i' \setminus s_i}  c_r(\mathbf{x}_r(\mathbf{s'})) \text{\ [by Eqn~\ref{eqn:cr_equal_linear}]}  &
%& = 
%& = \sum_{r \in s_i \setminus s_i'} \left( \sum_{j \in [k]} a_{r,j} \mathbf{d}_{i,j} + b_r \right) - \sum_{r \in s_i' \setminus s_i} \left( \sum_{j \in [k]} a_{r,j} \mathbf{d}_{i,j} + b_r \right).\nonumber
\end{align*}
Therefore, for any player $i$ and any strategy profiles $\mathbf{s}$ and $\mathbf{s}'$ (as defined above),
\begin{equation}
    \Phi(\mathbf{s}) - \Phi(\mathbf{s}') = 2[\mathbf{z} \cdot \mathbf{d}_i] (\pi_i(\mathbf{s}) - \pi_i(\mathbf{s'})).
\label{eqn:linear_mdcg_potential}
\end{equation}

%Since a $\mathbf{w}$-potential game requires that each $w_i > 0$, 
If $[\mathbf{z} \cdot \mathbf{d}_i] > 0$ for all $i$, then this concludes the proof that multidimensional congestion games with a linear cost function are $\mathbf{w}$-potential games with $w_i = \frac{1}{2 [\mathbf{z} \cdot \mathbf{d}_i]}$ for each $i$. However, if $[\mathbf{z} \cdot \mathbf{d}_i] = 0$ for some $i$, note that player $i$ does not affect the payoff of any other player. This is because $c_r(\mathbf{x}_r(\mathbf{s})) = a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r = a_r \big[\mathbf{z} \cdot \big( \mathbf{x}_r(s_{-i}) + \mathbf{d}_i \big) \big] + b_r = a_r [\mathbf{z} \cdot \mathbf{x}_r(s_{-i}) ] + a_r [\mathbf{z} \cdot \mathbf{d}_i ] + b_r = a_r [\mathbf{z} \cdot \mathbf{x}_r(s_{-i}) ] + b_r$. As a result, we can exclude such players $i$ from the game without impacting the other players' choices, and the resulting game is a $\mathbf{w}$-potential game.\footnote{For the purpose of equilibrium computation, the best responses of the excluded player $i$ can be added back later on without impacting the choices of the other players.}
\end{proof}
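
As a sanity check of Eqn~\ref{eqn:linear_mdcg_potential} on a toy instance (our own illustrative numbers, not one of the experimental games), let $k = 2$, $\mathbf{z} = (1, 2)$, and $R = \{r_1, r_2\}$ with $a_{r_1} = 1$, $a_{r_2} = 2$, and $b_{r_1} = b_{r_2} = 0$. There are two players with $\mathbf{d}_1 = (1, 0)$ and $\mathbf{d}_2 = (0, 1)$, so $[\mathbf{z} \cdot \mathbf{d}_1] = 1$ and $[\mathbf{z} \cdot \mathbf{d}_2] = 2$, and each player picks a single resource. Let $\mathbf{s} = (\{r_1\}, \{r_1\})$, and let $\mathbf{s}' = (\{r_1\}, \{r_2\})$ be obtained from $\mathbf{s}$ by a deviation of player 2. Then
\begin{align*}
\Phi(\mathbf{s}) &= \underbrace{3 \cdot 3}_{\Phi_1(\mathbf{s})} + \underbrace{1 \cdot 1 + 2 \cdot 2}_{\Phi_2(\mathbf{s})} = 14, \\
\Phi(\mathbf{s}') &= \underbrace{1 \cdot 1 + 4 \cdot 2}_{\Phi_1(\mathbf{s}')} + \underbrace{1 \cdot 1 + 4 \cdot 2}_{\Phi_2(\mathbf{s}')} = 18, \\
\pi_2(\mathbf{s}) - \pi_2(\mathbf{s}') &= 3 - 4 = -1,
\end{align*}
and indeed $\Phi(\mathbf{s}) - \Phi(\mathbf{s}') = -4 = 2 [\mathbf{z} \cdot \mathbf{d}_2] \big(\pi_2(\mathbf{s}) - \pi_2(\mathbf{s}')\big)$.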

% Algorithm~\ref{alg:BR} presents a best-response dynamics procedure for multidimensional congestion games and analyze its running time. \cite{panagopoulou_algorithms_2007} call such a procedure \textit{Nashify} in the context of one-dimensional, symmetric network congestion games.  

Below, we formalize the best-response algorithm outlined in the main text.

\setcounter{algocf}{1}

\begin{algorithm}
\setcounter{AlgoLine}{0}
\SetAlgoLined
\SetNlSkip{0.3em}
\KwIn{A multidimensional congestion game}
\KwOut{Pure-strategy Nash equilibrium}
    Choose an arbitrary strategy profile $\mathbf{s}$\\
    \While {some player $i$ can improve their cost} {
        Update $s_i$ with $i$'s best response to $\mathbf{s}_{-i}$
    }
    return $\mathbf{s}$
\caption{Best Response Dynamics}
\label{alg:BR}
\end{algorithm}
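
To illustrate Algorithm~\ref{alg:BR}, we trace it by hand on a toy instance of our own (hypothetical numbers, assuming that each player picks exactly one of two resources). Let $k = 2$, $\mathbf{z} = (1, 2)$, $R = \{r_1, r_2\}$ with $a_{r_1} = 1$, $a_{r_2} = 2$, $b_{r_1} = b_{r_2} = 0$, and let the two players have $\mathbf{d}_1 = (1, 0)$ and $\mathbf{d}_2 = (0, 1)$. Starting from $\mathbf{s}^0 = (\{r_1\}, \{r_1\})$, both players pay $c_{r_1}((1,1)) = 1 \cdot 3 = 3$. A unilateral move to $r_2$ would cost
\begin{align*}
\text{player 1:} \quad & c_{r_2}(\mathbf{d}_1) = 2 [\mathbf{z} \cdot (1, 0)] = 2 < 3, \\
\text{player 2:} \quad & c_{r_2}(\mathbf{d}_2) = 2 [\mathbf{z} \cdot (0, 1)] = 4 > 3,
\end{align*}
so only player 1 can improve, and the algorithm updates to $\mathbf{s}^1 = (\{r_2\}, \{r_1\})$ with $\pi_1(\mathbf{s}^1) = 2$ and $\pi_2(\mathbf{s}^1) = c_{r_1}(\mathbf{d}_2) = 2$. Under $\mathbf{s}^1$, player 1 would pay $c_{r_1}((1,1)) = 3$ by returning to $r_1$, and player 2 would pay $c_{r_2}((1,1)) = 2 \cdot 3 = 6$ by moving to $r_2$. No player can improve, so the loop terminates after one update and returns $\mathbf{s}^1$, which is a PSNE of this instance.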

%\paragraph{Discussion on Theorem~\ref{thm:linear_mdcg_potential}.} In the literature, we find two different approaches to formulating potential games for various extensions of congestion games. The first and more direct approach is to provide a potential function (e.g., \citep{potential_games,fotakis_selfish_2005,panagopoulou_algorithms_2007}). The second approach is to prove that a certain class of congestion games is isomorphic to a class of congestion games already known to be potential games (e.g., \cite{klimm_equilibria_2022} used this approach for equilibria existence proofs). We follow the first approach in Theorem~\ref{thm:linear_mdcg_potential}. In particular, \cite{panagopoulou_algorithms_2007} uses a $w$-potential game to give an algorithm for weighted, single-dimensional, and symmetric network congestion games where the cost of a resource is defined as the sum of weights on that resource. We extend their approach to multidimensional congestion games with a more general cost function $c_r(\mathbf{x}_r(\mathbf{s})) = a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r$. 

%xxxxxxx include in journal
\iffalse
Regarding the linear cost function, it is important to note that a slightly more general cost function $c_r(\mathbf{x}_r(\mathbf{s})) = [\mathbf{z}_r \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r$ (where the $k$-dimensional resource-specific vector $\mathbf{z}_r$ does not necessarily decompose into $a_r \mathbf{z}$ for all $r$) rules out a guarantee for the existence of equilibria (and hence rules out potential games) \citep{klimm_equilibria_2022}. In fact, it can be shown that $\Phi(\mathbf{s}) - \Phi(\mathbf{s}') = \sum_{r \in s_i \setminus s_i'} 2[\mathbf{z_r} \cdot \mathbf{d}_i] c_r(\mathbf{x}_r(\mathbf{s})) -
           \sum_{r \in s_i' \setminus s_i} 2 [\mathbf{z_r} \cdot \mathbf{d}_i] c_r(\mathbf{x}_r(\mathbf{s}'))$. 
Clearly, the $[\mathbf{z}_r \cdot \mathbf{d}_i]$ term cannot be factored out of the summation, which gives the intuition that this generalized cost function precludes the potential functions considered here.
It should also be noted that although \cite{klimm_equilibria_2022} give us an insight into cost functions, their work focuses on characterizing the existence of equilibria, whereas ours focuses on algorithms. 
\fi 


To analyze the best-response algorithm, 
%Algorithm~\ref{alg:BR}, 
we establish an upper bound on the potential function in Appendix Lemma~\ref{appendix_lem:ub_potential_linear}. 
%(proof in the Appendix). 
%\jared{Reviewer 1, ``Why is the integrality assumption necessary.''}
We make the typical integrality assumption on all $a_r$, $b_r$ and the elements of the vectors $\mathbf{z}$ and $\mathbf{d}_i$ for any player $i$ \citep{fotakis2002structure,fotakis_selfish_2005}. We use the following notation. The sum of all players' demand vectors is denoted by $\mathbf{d}_N \equiv \sum_{i \in N} \mathbf{d}_i$. 
%, and the sum of all components of $\mathbf{d}_N$ is denoted by $D \equiv \sum_{j \in [k]} (\mathbf{d}_N)_j = \sum_{i \in N} \sum_{j \in [k]} d_{i,j}$. 
In addition, let $A \equiv \sum_{r \in R} a_r$ and $B \equiv \sum_{r \in R} b_r$. 

\begin{appendixlemma}
For any strategy profile $\mathbf{s}$, the potential function $\Phi(\mathbf{s})$ is upper bounded by $2 A [\mathbf{z} \cdot \mathbf{d}_N]^2 + (n+1) B [\mathbf{z} \cdot \mathbf{d}_N]$.
\mylabel{lem:ub_potential_linear}
\end{appendixlemma}

\begin{proof}
We first get the following bounds on the $\Phi_1$ and $\Phi_2$ functions.

\begin{align*}
\Phi_1(\mathbf{s}) & = \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s})) [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})] &\\
%
        & \le \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s})) [\mathbf{z} \cdot \mathbf{d}_N] &\\
%
        & = [\mathbf{z} \cdot \mathbf{d}_N] \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s}))  &\\
%
        & \le [\mathbf{z} \cdot \mathbf{d}_N] \sum_{r \in R} c_r(\mathbf{d}_N).&
\end{align*}

\begin{align*}
\Phi_2(\mathbf{s}) & = \sum_{i \in N} \sum_{r \in s_i} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i] &\\
%
        & \le \sum_{r \in R} \sum_{i \in N} c_r(\mathbf{d}_i) [\mathbf{z} \cdot \mathbf{d}_i] &\\
%
        & = \sum_{i \in N} [\mathbf{z} \cdot \mathbf{d}_i] \sum_{r \in R} c_r(\mathbf{d}_i) &\\
%
        & \le [\mathbf{z} \cdot \mathbf{d}_N] \sum_{i \in N} \sum_{r \in R} c_r(\mathbf{d}_i) &\\
%
        & = [\mathbf{z} \cdot \mathbf{d}_N] \sum_{i \in N} \sum_{r \in R} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_i] + b_r \Big) &\\
%
        & = [\mathbf{z} \cdot \mathbf{d}_N] \sum_{r \in R} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_N] + n b_r \Big) &\\
%
        & = [\mathbf{z} \cdot \mathbf{d}_N] \Bigg( \sum_{r \in R} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_N] + b_r \Big) + (n-1) \sum_{r \in R}  b_r \Bigg) &\\        
%
        & = [\mathbf{z} \cdot \mathbf{d}_N] \Bigg( \sum_{r \in R} c_r(\mathbf{d}_N) \Bigg) + (n-1) B [\mathbf{z} \cdot \mathbf{d}_N].&        
\end{align*}

Combining the bounds on $\Phi_1$ and $\Phi_2$, we get the following bound on the potential function.

\begin{align*}
\Phi(\mathbf{s}) & \le  2 [\mathbf{z} \cdot \mathbf{d}_N] \Bigg( \sum_{r \in R} c_r(\mathbf{d}_N) \Bigg) + (n-1) B [\mathbf{z} \cdot \mathbf{d}_N] &\\
%
    & =  2 [\mathbf{z} \cdot \mathbf{d}_N] \Bigg( \sum_{r \in R} \Big( a_r [\mathbf{z} \cdot \mathbf{d}_N] + b_r \Big) \Bigg) &\\
    &   \quad \quad + (n-1) B [\mathbf{z} \cdot \mathbf{d}_N] &\\
%
    & =  2 [\mathbf{z} \cdot \mathbf{d}_N] \Big( A [\mathbf{z} \cdot \mathbf{d}_N] + B \Big) + (n-1) B [\mathbf{z} \cdot \mathbf{d}_N] &\\
%
    & =  2 A [\mathbf{z} \cdot \mathbf{d}_N]^2 + (n+1) B [\mathbf{z} \cdot \mathbf{d}_N].&
\end{align*}
\end{proof}
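
As a quick illustration of Appendix Lemma~\ref{appendix_lem:ub_potential_linear}, consider a hypothetical instance (chosen only for illustration and not taken from our experiments) with $n = 2$ players, $k = 2$, $\mathbf{z} = (1, 2)$, $\mathbf{d}_1 = (1, 0)$, $\mathbf{d}_2 = (0, 1)$, and two resources with $(a_1, b_1) = (1, 0)$ and $(a_2, b_2) = (2, 1)$. Then $[\mathbf{z} \cdot \mathbf{d}_N] = 3$, $A = 3$, and $B = 1$, so the bound evaluates to
\begin{align*}
2 A [\mathbf{z} \cdot \mathbf{d}_N]^2 + (n+1) B [\mathbf{z} \cdot \mathbf{d}_N] = 2 \cdot 3 \cdot 3^2 + 3 \cdot 1 \cdot 3 = 63.
\end{align*}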

Next, 
%xxxxxxxx include in the journal
%we use this upper bound on the potential function to analyze Algorithm~\ref{alg:BR} for several cases. We only 
%focus on 
we upper bound the number of iterations. Each iteration runs in $\mathcal{O}\big(nkpm^2\big)$ time, giving us the following theorems.

\begin{appendixtheorem}
    %Algorithm~\ref{alg:BR} 
    The best-response algorithm runs in pseudo-polynomial time.    
\end{appendixtheorem}

% \begin{paperonly}
% \begin{proofsketch}
%     %Using Equation~\ref{eqn:linear_mdcg_potential}, whenever a player $i$ reduces their cost by 1 in Algorithm~\ref{alg:BR}, the potential function decreases by $2 [\mathbf{z} \cdot \mathbf{d}_i] \ge 2$ due to the integrality of $z$ and $\mathbf{d}_i$. Also, as detailed in the proof of Theorem~\ref{thm:linear_mdcg_potential}, if $[\mathbf{z} \cdot \mathbf{d}_i] = 0$ for some players $i$, those players do not impact the cost of the other players and thereby can be excluded from the game. 
% %As a result, 
% The maximum number of iterations can be shown to be $A [\mathbf{z} \cdot \mathbf{d}_N]^2 + \frac{n+1}{2} B [\mathbf{z} \cdot \mathbf{d}_N]$.
% \end{proofsketch}
% \end{paperonly}

\begin{proof}
Using Equation~\ref{eqn:linear_mdcg_potential}, whenever a player $i$ reduces their cost by 1 in Algorithm~\ref{alg:BR}, the potential function decreases by $2 [\mathbf{z} \cdot \mathbf{d}_i] \ge 2$ due to the integrality of $\mathbf{z}$ and $\mathbf{d}_i$. Also, as detailed in the proof of Appendix Theorem~\ref{appendix_thm:linear_mdcg_potential}, if $[\mathbf{z} \cdot \mathbf{d}_i] = 0$ for some players $i$, those players do not impact the cost of the other players and thereby can be excluded from the game. 
As a result, the number of iterations of the algorithm is at most $A [\mathbf{z} \cdot \mathbf{d}_N]^2 + \frac{n+1}{2} B [\mathbf{z} \cdot \mathbf{d}_N]$.
\end{proof}

The following theorem gives us the multidimensional counterpart of the single-dimensional result by \cite{panagopoulou_algorithms_2007}.
\end{appendixonly}
%xxxxxx include in the journal
%The following theorem gives us the multidimensional counterpart of the one-dimensional result by \cite{panagopoulou_algorithms_2007}.

\begin{appendixonly}
    \setcounter{theorem}{9}    
\end{appendixonly}

\begin{theorem}
    %Algorithm~\ref{alg:BR} 
    For linear-cost $k$-DCGs, the best-response algorithm runs in polynomial time if \  $\max_r a_r, \max_r b_r$, and % = \mathcal{O}(n^{c_1})$ and 
    $\frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}$ are polynomial in $n$.%\footnote{Polynomial-time algorithms for linear cost  are unlikely to exist due to the PLS-completeness of (unweighted) network congestion games with linear cost \citep{ackermann2008impact}.}
    %=  \mathcal{O}(n^{c_2})$ for some constants $c_1$ and $c_2$.
\mylabel{thm:running_time_alg_BR}
\end{theorem}

% \begin{paperonly}
% \begin{proofsketch}
% Whenever a player $i$ reduces their cost by 1, the decrease in the potential function is $2 [\mathbf{z} \cdot \mathbf{d}_i] \ge 2 \min_i [\mathbf{z} \cdot \mathbf{d}_i]$. %\footnote{Note that unlike single-dimensional demands, $\min_i \mathbf{d}_i$ is not well defined.}
%  Therefore, the number of iterations is at most $n^2 m (\max_r a_r + \max_r b_r) \frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}$.
% \end{proofsketch}
% \end{paperonly}

\begin{appendixonly}
\begin{proof}
Whenever a player $i$ reduces their cost by 1, the decrease in the potential function is $2 [\mathbf{z} \cdot \mathbf{d}_i] \ge 2 \min_i [\mathbf{z} \cdot \mathbf{d}_i]$.\footnote{Note that unlike one-dimensional demands, $\min_i \mathbf{d}_i$ is not well defined.}
\begin{align*}
    &\text{Number of iterations } &\\
    &\le \frac{2 A [\mathbf{z} \cdot \mathbf{d}_N]^2}{2 \min_i [\mathbf{z} \cdot \mathbf{d}_i]} + \frac{(n+1) B}{2} \frac{[\mathbf{z} \cdot \mathbf{d}_N]}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}&\\
    %
    & \le \frac{A \Big( n^2 \max_i [\mathbf{z} \cdot \mathbf{d}_i]^2 \Big) }{\min_i [\mathbf{z} \cdot \mathbf{d}_i]} + \frac{(n+1) B}{2} \frac{n \max_i [\mathbf{z} \cdot \mathbf{d}_i]}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}&\\
    %
    & \le n^2 (A+B) \frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}&\\
    & \le n^2 m (\max_r a_r + \max_r b_r) \frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}.&
\end{align*}

Therefore, when $\max_r a_r$, $\max_r b_r$, and $\frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]}$ are polynomial in $n$, then  
%= \mathcal{O}(n^{c_1})$ and $\frac{\max_i [\mathbf{z} \cdot \mathbf{d}_i]^2}{\min_i [\mathbf{z} \cdot \mathbf{d}_i]} =  \mathcal{O}(n^{c_2})$ for constants $c_1$ and $c_2$, 
the number of iterations is %$\mathcal{O}(n^{c_1+c_2+2}m)$, 
$\mathcal{O}\big(m \times \text{poly}(n)\big)$, where $n$ is the number of players and $m$ is the number of resources, as defined earlier.
\end{proof}
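
To see how loose each step of the chain in the proof of Theorem~\ref{thm:running_time_alg_BR} can be, consider a hypothetical instance (used only for illustration) with $n = 2$, $m = 2$, $A = 3$, $B = 1$, $\max_r a_r = 2$, $\max_r b_r = 1$, $[\mathbf{z} \cdot \mathbf{d}_1] = 1$, and $[\mathbf{z} \cdot \mathbf{d}_2] = 2$, so that $[\mathbf{z} \cdot \mathbf{d}_N] = 3$. The successive bounds then evaluate to
\begin{align*}
\frac{2 \cdot 3 \cdot 3^2}{2 \cdot 1} + \frac{3 \cdot 1}{2} \cdot \frac{3}{1} = 31.5
\;\le\; \frac{3 \cdot 2^2 \cdot 2^2}{1} + \frac{3 \cdot 1}{2} \cdot \frac{2 \cdot 2}{1} = 54
\;\le\; 2^2 \cdot 4 \cdot \frac{2^2}{1} = 64
\;\le\; 2^2 \cdot 2 \cdot 3 \cdot \frac{2^2}{1} = 96.
\end{align*}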
\end{appendixonly}



%xxxxxx include in journal
%\subsection{Special Case: Binary Demand Vectors}
%We have the following two theorems for 
\begin{appendixonly}
We next consider the cases of \emph{binary demand vectors} and \emph{$k$-CCGs}.
\end{appendixonly}
%For the special case of binary demand vectors, we have the following result.

\begin{theorem}
    For linear-cost $k$-DCGs with binary demand vectors, the best-response algorithm %Algorithm~\ref{alg:BR} 
    runs in polynomial time if the following cost function parameters are polynomial in $n$: $\max_r a_r$, $\max_r b_r$, and $\max_j z_j$.
\end{theorem}

% \begin{paperonly}
% \begin{proofsketch}
% First, note that $\max_i [\mathbf{z} \cdot \mathbf{d}_i] \le \sum_{j \in [k]} z_j \le k \max_{j \in [k]} z_j$. Whenever a player $i$ reduces their cost by 1, the decrease in the potential function is $2 [\mathbf{z} \cdot \mathbf{d}_i] \ge 2 \min_i [\mathbf{z} \cdot \mathbf{d}_i] \ge 2$. Therefore, the number of iterations is at most $n^2 m (\max_r a_r + \max_r b_r) \big( k \max_j z_j \big)^2$.
% \end{proofsketch}
% \end{paperonly}

\begin{appendixonly}
\begin{proof}
    First, note that $\max_i [\mathbf{z} \cdot \mathbf{d}_i] \le \sum_{j \in [k]} z_j \le k \max_{j \in [k]} z_j$. Whenever a player $i$ reduces their cost by 1, the decrease in the potential function is $2 [\mathbf{z} \cdot \mathbf{d}_i] \ge 2 \min_i [\mathbf{z} \cdot \mathbf{d}_i] \ge 2$.
    
    \begin{align*}
        &\text{Number of iterations } &\\
        &\le \frac{2 A [\mathbf{z} \cdot \mathbf{d}_N]^2}{2} + \frac{(n+1) B [\mathbf{z} \cdot \mathbf{d}_N]}{2} &\\
        %
        & \le A n^2 \max_i [\mathbf{z} \cdot \mathbf{d}_i]^2 + \frac{(n+1) B}{2} n \max_i [\mathbf{z} \cdot \mathbf{d}_i]&\\
        %
        & \le n^2 (A+B) \max_i [\mathbf{z} \cdot \mathbf{d}_i]^2&\\
        & \le n^2 m (\max_r a_r + \max_r b_r) \big( k \max_j z_j \big)^2.&
    \end{align*}
    
    Therefore, the number of iterations is $\mathcal{O}(m k^2 \times \text{poly}(n))$.
    \end{proof}
\end{appendixonly}

%xxxxxx include in journal
%\subsection{Special Case: $k$-Class Congestion Games ($k$-CCG)}
%In a $k$-CCG, each demand vector has exactly one non-zero element. As a result, a $k$-CCG is a special case of a $k$-DCG, and all $k$-DCG properties and algorithms apply to $k$-CCGs. We can, however, provide an improved analysis of Algorithm~\ref{alg:BR} for $k$-CCGs. Here, we assume that $\mathbf{z} > 0$; otherwise, if $z_j = 0$ for any $j \in [k]$, we can exclude from the game any player using the $j$-th ``class'' in their demand vector without altering the preferences of the other players.


\begin{theorem}
    For linear-cost $k$-CCGs, the best-response algorithm 
    %Algorithm~\ref{alg:BR} 
    runs in polynomial time if $\max_r a_r$, $\max_r b_r$,  $\frac{\max_j z_j^2}{\min_j z_j}$, and $\frac{\max_i d_{i,l(i)}^2}{\min_i d_{i, l(i)}}$  are polynomial in $n$, where $l(i) \in [k]$ denotes the index of the non-zero element in $\mathbf{d}_i$.
\end{theorem}
\begin{appendixonly}
\begin{proof}
We get $\max_i [\mathbf{z} \cdot \mathbf{d}_i] = \max_i \big[ z_{l(i)} d_{i, l(i)} \big] \le$\\$\big( \max_i z_{l(i)} \big) \big( \max_i d_{i, l(i)} \big) \le \big(\max_j z_j \big) \big( \max_i d_{i, l(i)} \big)$. \\
Similarly, $\min_i [\mathbf{z} \cdot \mathbf{d}_i] \ge \big(\min_j z_j \big) \big( \min_i d_{i, l(i)} \big)$.\\ %We obtain the following from 
Substituting these bounds into the iteration bound from the proof of Theorem~\ref{thm:running_time_alg_BR}, we obtain the following.
%
\begin{align*}
    &\text{Number of iterations } &\\
    & \le n^2 m (\max_r a_r + \max_r b_r) \frac{\max_j z_j^2}{\min_j z_j} \frac{\max_i d_{i,l(i)}^2} {\min_i d_{i, l(i)} }.&
\end{align*}

Therefore, under the stated assumptions, the number of iterations is $\mathcal{O}(m \times \text{poly}(n))$.
\end{proof}
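
As a small sanity check of the two bounds on $[\mathbf{z} \cdot \mathbf{d}_i]$ used in this proof, consider a hypothetical $k$-CCG (for illustration only) with $k = 2$, $\mathbf{z} = (1, 3)$, $\mathbf{d}_1 = (2, 0)$, and $\mathbf{d}_2 = (0, 1)$, so that $l(1) = 1$ and $l(2) = 2$. Then $[\mathbf{z} \cdot \mathbf{d}_1] = 2$ and $[\mathbf{z} \cdot \mathbf{d}_2] = 3$, and
\begin{align*}
\max_i [\mathbf{z} \cdot \mathbf{d}_i] = 3 &\le \big(\max_j z_j \big) \big( \max_i d_{i, l(i)} \big) = 3 \cdot 2 = 6, \\
\min_i [\mathbf{z} \cdot \mathbf{d}_i] = 2 &\ge \big(\min_j z_j \big) \big( \min_i d_{i, l(i)} \big) = 1 \cdot 1 = 1.
\end{align*}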
\end{appendixonly}

\begin{paperonly}
Note that polynomial-time algorithms for linear cost are unlikely to exist due to the PLS-completeness of unweighted network congestion games with linear cost \citep{ackermann2008impact}.

\subsubsection*{Experiments}
We have performed experiments to evaluate the effect of the dimension $k$ and the number of resources on the running time of the algorithm given by Theorem~\ref{thm:running_time_alg_BR}. We vary $k = 2, 3, 4$ and the number of links $m$ in a parallel network from 2 to 10. We also vary the number of players $n$ from 5 to 100. In the iterative best-response algorithm, we apply a tweak suggested by \cite{panagopoulou_algorithms_2007} that prioritizes players with relatively high impacts on the cost function due to their demand vectors.

Our experiments show that the PSNE computation time of the best-response algorithm analyzed in Theorem~\ref{thm:running_time_alg_BR} scales up gracefully as we increase the number of players and links. This is perhaps not surprising given the pseudo-polynomial running time of the algorithm. Furthermore, our experiments are consistent with those on single-dimensional weighted congestion games \citep{panagopoulou_algorithms_2007}. Details, including figures, are in the Appendix.
\end{paperonly}

\begin{appendixonly}
\textbf{Experimental Results.}
We have performed experiments to evaluate the performance of Algorithm~\ref{alg:BR}. 
The cost function is $c_r(\mathbf{x}_r(\mathbf{s}))= a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r$, where $a_r$ and $b_r$ are random integers between 0 and 5, and $\mathbf{z}$ is a vector of random integers between 0 and 5 (all inclusive). Every player has a random demand vector where each element is between 0 and 5 (both inclusive), except that the demand vector cannot be all zeros. In the implementation of the algorithm, we order the players $i$ from the highest to the lowest value of $\mathbf{z} \cdot \mathbf{d}_i$.

We vary the number of dimensions $k = 2, 3, 4$ and the number of links from 2 to 10. We also vary the number of players from 5 to 100. Figure~\ref{fig:br_experiments} illustrates our experimental results for a subset of representative experiments. It shows that Algorithm~\ref{alg:BR} scales up gracefully as we increase the number of players and links. This is perhaps not surprising given the pseudo-polynomial running time of the algorithm. Our experiments are consistent with those on single-dimensional weighted congestion games \citep{panagopoulou_algorithms_2007}.
\end{appendixonly}

\iffalse
Here are the parameters I propose:
Number of players: 5 to 100
Number of dimensions (i.e. k): 2, 3, 4
Number of links: 2 to 10
Algorithms to run: Best response (players ordered from highest weight to lowest weight), Best response (players ordered from lowest weight to highest weight)
Cost for each resource r \in R is this linear function: cost_r = \alpha_r * z_vector * aggregate_demand_vector + \beta_r
\alpha_r and \beta_r are random integers between 0 and 5
Every resource r has its own \alpha_r and \beta_r
z_vector is a vector of random integers between 0 and 5
There is only one z_vector for the game.
Every player has a random demand vector where each element is between 0 and 5 (Except it cannot be all zeros.).
Number of trials per combination of parameters: 50
\fi 

\begin{appendixonly}
\begin{figure*}
     \centering
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_4_k_2.png}}
         %\caption{$y=x$}
         %\label{fig:y equals x}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_4_k_4.png}}
         %\caption{$y=3\sin x$}
         %\label{fig:three sin x}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_6_k_2.png}}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_6_k_4.png}}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_8_k_2.png}}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_8_k_4.png}}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_10_k_2.png}}
     \end{subfigure}
     \hfill
     \begin{subfigure}[b]{0.45\textwidth}
         \centering
         \includegraphics[width=\textwidth]{{charts/BR_link_10_k_4.png}}
     \end{subfigure}
     \hfill
    \caption{Performance of the learning dynamics algorithm for linear cost functions. The figures show that Algorithm~\ref{alg:BR} scales up nicely with an increasing number of links and dimension $k$. Instead of selecting the players in a linear order, we prioritize players with higher weights according to $\mathbf{z} \cdot \mathbf{d}_i$ for player $i$. This leads to a greater impact on the potential function.}
    \label{fig:br_experiments}
\end{figure*}
\end{appendixonly}

\iffalse
\begin{appendixonly}
    \begin{figure*}
    \centering
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_4_k_2.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_4_k_4.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_6_k_2.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_6_k_4.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{{charts/BR_link_8_k_2.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_8_k_4.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_10_k_2.png}}
    \subfigure{\includegraphics[width=0.45\textwidth]{charts/BR_link_10_k_4.png}}
    \caption{Caption Here}
    \label{fig:br_experiments}
\end{figure*}
\end{appendixonly}
\fi


\subsection*{Exponential Cost Functions}
%\mylabel{sec:exp}
\iffalse
\begin{paperonly}
%Building off prior work \citep{panagopoulou_algorithms_2007,harks2011characterizing,harks_existence_2012,klimm_equilibria_2022}, we consider 
Below is our result for
exponential cost $c_r(\mathbf{x}_r(\mathbf{s})) \equiv  a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) + b_r$. Details are in the Appendix.
\end{paperonly}
\fi 
\begin{paperonly}
%xxxxxxx include in journal
%As we have discussed in the previous section, a slightly more generalized linear cost function of the shape $c_r(\mathbf{x}_r(\mathbf{s})) = [\mathbf{z}_r \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r$ (compared to $c_r(\mathbf{x}_r(\mathbf{s})) = a_r [\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ] + b_r$) precludes a potential function. Despite this, interestingly, a family of exponential cost functions yields a potential function. 
For one-dimensional weighted congestion games, it has been shown that the uniform exponential cost function $c_r({x}_r(\mathbf{s})) = \exp({x}_r(\mathbf{s}))$ leads to a potential function \citep{panagopoulou_algorithms_2007}. For (one-dimensional) weighted congestion games, this result has been extended to non-uniform exponential functions of the shape $c_r({x}_r(\mathbf{s})) = a_r \exp({x}_r(\mathbf{s})) + b_r$ \citep{harks2011characterizing,harks_existence_2012}. For $k$-DCGs, it has been shown that games with cost functions of the shape $c_r(\mathbf{x}_r(\mathbf{s})) = a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) + b_r$ are isomorphic to one-dimensional congestion games \citep{klimm_equilibria_2022}. 
%xxxxx include in journal.
%Klimm and Sch{\"u}tz \citep{klimm_equilibria_2022} do not focus on the potential function for multidimensional weighted congestion games, but it is easy to derive it by connecting the multidimensional case to the potential function for one-dimensional weighted congestion games given by Harks \textit{et al.} \citep{harks2011characterizing}.%\footnote{We make a small correction to \citep{harks2011characterizing} (pg. 64): The potential function $\Tilde{P}(x)$ for one-dimensional weighted congestion games should be defined as $\sum_{f \in F} c_f(x) + \sum_{i \in N} \sum_{f \in x_i} \frac{e^{\phi d_i} - 1}{e^{\phi d_i}} b_f.$}
For this cost function, we use \cite{harks2011characterizing}'s results on 1-DCGs to derive a potential function for $k$-DCGs, which ultimately leads to the following result. Details, including the intermediate steps, are in the Appendix.
\end{paperonly}

\begin{appendixonly}
We know that $k$-DCGs with cost functions of the shape $c_r(\mathbf{x}_r(\mathbf{s})) = a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) + b_r$ are isomorphic to one-dimensional congestion games \citep{klimm_equilibria_2022}. 
%xxxxx include in journal.
%Klimm and Sch{\"u}tz \citep{klimm_equilibria_2022} do not focus on the potential function for multidimensional weighted congestion games, but it is easy to derive it by connecting the multidimensional case to the potential function for one-dimensional weighted congestion games given by Harks \textit{et al.} \citep{harks2011characterizing}.%\footnote{We make a small correction to \citep{harks2011characterizing} (pg. 64): The potential function $\Tilde{P}(x)$ for one-dimensional weighted congestion games should be defined as $\sum_{f \in F} c_f(x) + \sum_{i \in N} \sum_{f \in x_i} \frac{e^{\phi d_i} - 1}{e^{\phi d_i}} b_f.$}
We use \cite{harks2011characterizing}'s results on 1-DCGs to derive the following potential function for $k$-DCGs.

%As in the previous section, we first show the potential function and then bound it.
%for the purpose of analyzing the best response-based algorithm.
%The following theorem holds for 
%The exponential cost function is $c_r(\mathbf{x}_r(\mathbf{s})) = a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ) + b_r$. % for all $r$. %We use the same notation as in the previous subsection.
\begin{appendixtheorem}% \citep{klimm_equilibria_2022}
    Any multidimensional congestion game with exponential resource costs is a $\mathbf{w}$-potential game.
\mylabel{thm:exp_mdcg_potential}
\end{appendixtheorem}

% \begin{paperonly}
% \begin{proofsketch}
% We show that $\Phi(\mathbf{s})$ is a $\mathbf{w}$-potential function for the choice of $w_i = \frac{1}{1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)}$ for each $i$ where:\\
% %\begin{align}
% $\Phi(\mathbf{s}) = \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s})) + \sum_{i \in N} \sum_{r \in s_i} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)).$
% %\nonumber 
% %\end{align}
% \end{proofsketch}
% \end{paperonly}

\begin{proof}
We show that the $\Phi(\mathbf{s})$ defined below is a $\mathbf{w}$-potential function for the choice of $w_i = \frac{1}{1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)}$ for each $i$.
\begin{align}
&\Phi_1(\mathbf{s}) = \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s})). \nonumber \\ 
&\Phi_2(\mathbf{s}) = \sum_{i \in N} \sum_{r \in s_i} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)).\nonumber \\ 
&\Phi(\mathbf{s}) = \Phi_1(\mathbf{s}) + \Phi_2(\mathbf{s}). \label{eq:potential_func_exp}
\end{align}

Consider any set of resources $s_i' \neq s_i$. Define $\mathbf{s}' = (\mathbf{s}_{-i}, s_i')$. For any resource $r$ that belongs either to both $s_i$ and $s_i'$ or to neither of them: 
\begin{flalign}
& c_r(\mathbf{x}_r(\mathbf{s})) = c_r(\mathbf{x}_r(\mathbf{s'})).% \text{, and}\\
%& c_r(\mathbf{x}_r(\mathbf{s})) \mathbf{x}_r(\mathbf{s}) = c_r(\mathbf{x}_r(\mathbf{s'})) \mathbf{x}_r(\mathbf{s'}).
\label{eqn:cr_equal_exp}
\end{flalign}
%
For any $r \in s_i \setminus s_i'$: 
\begin{align*}
& c_r(\mathbf{x}_r(\mathbf{s})) - c_r(\mathbf{x}_r(\mathbf{s'})) & \\
&= a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) + b_r - a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})) - b_r &\\
&= a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) - a_r \exp(\mathbf{z} \cdot (\mathbf{x}_r(\mathbf{s})-\mathbf{d}_i)) &\\
&= a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) - a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) \exp(-\mathbf{z} \cdot \mathbf{d}_i) &\\
&= a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)). &
\end{align*}
%
%
Similarly, for any resource $r \in s_i' \setminus s_i$, 
\begin{flalign*}
    &c_r(\mathbf{x}_r(\mathbf{s})) - c_r(\mathbf{x}_r(\mathbf{s'})) \\
    &= a_r \exp(\mathbf{z} \cdot (\mathbf{x}_r(\mathbf{s'}) - \mathbf{d}_i) ) + b_r - a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'}) ) - b_r &\\
    &= a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})) \exp(- \mathbf{z} \cdot \mathbf{d}_i) - a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'}) ) &\\
    &= - a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})) (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)). &
\end{flalign*}
%
%
The difference in the $\Phi_1$ function under $\mathbf{s}$ and $\mathbf{s}'$ is
\begin{flalign*}
    &\Phi_1(\mathbf{s}) - \Phi_1(\mathbf{s}') &\\
    & = \sum_{r \in R} \Big( c_r(\mathbf{x}_r(\mathbf{s})) -  c_r(\mathbf{x}_r(\mathbf{s'})) \Big) &\\
            & = \sum_{r \in s_i \setminus s_i'}  \Big( c_r(\mathbf{x}_r(\mathbf{s}))  -  c_r(\mathbf{x}_r(\mathbf{s'}))  \Big) + &\\
           &  \quad \quad \sum_{r \in s_i' \setminus s_i} \Big( c_r(\mathbf{x}_r(\mathbf{s})) - c_r(\mathbf{x}_r(\mathbf{s'})) \Big) &\\
           %
            & = \sum_{r \in s_i \setminus s_i'}  \Big( a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i))  \Big) + &\\
           &  \quad \quad \sum_{r \in s_i' \setminus s_i} \Big( - a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})) (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) \Big) &\\
           %
            & = (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) \Bigg( \sum_{r \in s_i \setminus s_i'}  a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}))  - &\\
           &  \quad \quad \sum_{r \in s_i' \setminus s_i} a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})) \Bigg). &
           %
\end{flalign*}
%
The difference in the $\Phi_2$ function under $\mathbf{s}$ and $\mathbf{s}'$ is
\begin{align*}
    &\Phi_2(\mathbf{s}) - \Phi_2(\mathbf{s}') &\\
    & = \sum_{l \in N} \sum_{r \in s_l} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_l)) - &\\
    & \quad \quad \sum_{l \in N} \sum_{r \in s_l'} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_l)) &\\
    & = \sum_{r \in s_i} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) - \sum_{r \in s_i'} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) &\\
    & \quad \quad \quad \text{ [because only $i$'s strategy changed between $\mathbf{s}$ and $\mathbf{s}'$]} &\\
    & = \sum_{r \in s_i \setminus s_i'} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) - &\\
    &\quad \quad \quad \sum_{r \in s_i' \setminus s_i} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) &\\
    & = (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) \Bigg( \sum_{r \in s_i \setminus s_i'} b_r  - \sum_{r \in s_i' \setminus s_i} b_r \Bigg).
\end{align*}
%
Combining the differences in $\Phi_1$ and $\Phi_2$, we obtain the following difference in the proposed potential function.
\begin{align*}
    &\Phi(\mathbf{s}) - \Phi(\mathbf{s}') &\\
    &=  \Phi_1(\mathbf{s}) - \Phi_1(\mathbf{s}') +  \Phi_2(\mathbf{s}) - \Phi_2(\mathbf{s}') &\\
    &= \big(1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)\big) \Bigg( \sum_{r \in s_i \setminus s_i'}  \Big( a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s})) + b_r \Big) - &\\
        &  \quad \quad \sum_{r \in s_i' \setminus s_i} \Big( a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s'})) + b_r \Big) \Bigg)& \\
    &= \big(1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)\big) \Big( \sum_{r \in s_i \setminus s_i'}  c_r(\mathbf{x}_r(\mathbf{s})) - &\\
    &  \quad \quad \sum_{r \in s_i' \setminus s_i} c_r(\mathbf{x}_r(\mathbf{s'})) \Big)&\\ 
    &= \big(1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)\big) \Big( \sum_{r \in s_i }  c_r(\mathbf{x}_r(\mathbf{s})) - \sum_{r \in s_i' } c_r(\mathbf{x}_r(\mathbf{s'})) \Big) &\\
    & \qquad \qquad \qquad \qquad \qquad \qquad  \qquad \qquad \qquad \text{\ [by Eqn~\ref{eqn:cr_equal_exp}]}&\\
    & = \big(1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)\big) \big( \pi_i(\mathbf{s}) - \pi_i(\mathbf{s'}) \big).
\end{align*}
Therefore,
\begin{flalign}
    \Phi(\mathbf{s}) - \Phi(\mathbf{s}') = \big(1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)\big) \big( \pi_i(\mathbf{s}) - \pi_i(\mathbf{s'}) \big).
\label{eqn:exp_mdcg_potential}
\end{flalign}
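Since $1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i) = 1/w_i$ for the choice of $w_i$ above, and $w_i$ is positive and finite whenever $\mathbf{z} \cdot \mathbf{d}_i > 0$, Equation~\ref{eqn:exp_mdcg_potential} can be restated as follows (assuming the usual convention for weighted potential games, in which a player's cost difference equals $w_i$ times the potential difference):
\begin{align*}
\pi_i(\mathbf{s}) - \pi_i(\mathbf{s}') = w_i \big( \Phi(\mathbf{s}) - \Phi(\mathbf{s}') \big).
\end{align*}
In particular, a unilateral deviation lowers player $i$'s cost exactly when it lowers $\Phi$.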
\end{proof}
\end{appendixonly}

%We next give an upper bound on this potential function.  %defined in Theorem~\ref{thm:exp_mdcg_potential}. %As defined in the previous subsection, 
%Here, $\mathbf{d}_N = \sum_{i \in N} \mathbf{d}_i$.

\begin{appendixonly}
We next give an upper bound on the potential function defined in Appendix Theorem~\ref{appendix_thm:exp_mdcg_potential}. As defined in the previous subsection, $\mathbf{d}_N = \sum_{i \in N} \mathbf{d}_i$.

\begin{appendixlemma}
    The potential function for multidimensional congestion games with an exponential cost function is upper bounded by $m \exp(\mathbf{z} \cdot \mathbf{d}_N) \max_r a_r + (n+1) m \max_r b_r$.
\mylabel{lem:potential_bound_exp}
\end{appendixlemma}
\begin{proof}
    We get

\begin{align*}
\Phi(\mathbf{s}) &= \sum_{r \in R} c_r(\mathbf{x}_r(\mathbf{s}))+ \sum_{i \in N} \sum_{r \in s_i} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i)) &\\
        &= \sum_{r \in R}  \big(a_r \exp(\mathbf{z} \cdot \mathbf{x}_r(\mathbf{s}) ) + b_r \big) + &\\
        & \qquad \sum_{i \in N} \sum_{r \in s_i} b_r (1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i))&\\
        & \le \sum_{r \in R}  \big(a_r \exp(\mathbf{z} \cdot \mathbf{d}_N ) + b_r \big) + &\\
        & \qquad \sum_{i \in N} \sum_{r \in R} b_r (1 - 0)&\\
        & \le \exp(\mathbf{z} \cdot \mathbf{d}_N ) \sum_{r \in R} a_r + \sum_{r \in R} b_r + &\\
        & \qquad \sum_{i \in N} \big( m  \max_r b_r \big)&\\
        & \le \exp(\mathbf{z} \cdot \mathbf{d}_N ) m  \max_r a_r  + m  \max_r b_r  + n m  \max_r b_r&\\
        & = m \exp(\mathbf{z} \cdot \mathbf{d}_N )  \max_r a_r  + (n+1)m  \max_r b_r.&
\end{align*}
\end{proof}
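
To give a sense of the magnitude of the bound in Appendix Lemma~\ref{appendix_lem:potential_bound_exp}, consider hypothetical parameter values (for illustration only) $n = 2$, $m = 2$, $\mathbf{z} \cdot \mathbf{d}_N = 3$, $\max_r a_r = 2$, and $\max_r b_r = 1$. The bound then evaluates to
\begin{align*}
m \exp(\mathbf{z} \cdot \mathbf{d}_N) \max_r a_r + (n+1) m \max_r b_r = 2 \cdot e^{3} \cdot 2 + 3 \cdot 2 \cdot 1 = 4 e^{3} + 6 \approx 86.3.
\end{align*}
The first term grows exponentially in $\mathbf{z} \cdot \mathbf{d}_N$, whereas the second grows only linearly in $n$ and $m$.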
\end{appendixonly}

\begin{appendixonly}
We next analyze the running time of 
%Algorithm~\ref{alg:BR} 
the best-response algorithm. 
%for exponential cost functions. 
Recall that each iteration runs in $\mathcal{O}\big(nkpm^2\big)$ time.
\end{appendixonly}

\begin{theorem}
    The best-response algorithm 
    %Algorithm~\ref{alg:BR} 
    runs in polynomial time for exponential-cost $k$-DCGs if $\max_r a_r$ and $\max_r b_r$ are polynomial in $n$ and $[\mathbf{z} \cdot \mathbf{d}_N]$ is $\mathcal{O}(\log n)$.
\end{theorem}

\begin{appendixonly}
\begin{proof}
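    Using Appendix Theorem~\ref{appendix_thm:exp_mdcg_potential}, whenever a player $i$ reduces their cost by 1, the potential function decreases by $1 - \exp(-\mathbf{z} \cdot \mathbf{d}_i) \ge 1 - \frac{1}{e} = \frac{e - 1}{e}$, where the inequality uses $\mathbf{z} \cdot \mathbf{d}_i \ge 1$ (by integrality, after excluding players with $\mathbf{z} \cdot \mathbf{d}_i = 0$ as in the linear case). %We get the following by using 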
    Using Appendix Lemma~\ref{appendix_lem:potential_bound_exp}, the number of iterations is at most $\frac{e}{e-1} \Big( m \exp(\mathbf{z} \cdot \mathbf{d}_N )  \max_r a_r  + (n+1)m  \max_r b_r \Big)$.
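    To make the role of the logarithmic condition explicit, suppose $\mathbf{z} \cdot \mathbf{d}_N \le c \ln n$ for some constant $c > 0$ (the constant $c$ is our notation for the hidden constant in $\mathcal{O}(\log n)$). Then $\exp(\mathbf{z} \cdot \mathbf{d}_N) \le n^{c}$, and the iteration bound becomes
    \begin{align*}
        \frac{e}{e-1} \Big( m \exp(\mathbf{z} \cdot \mathbf{d}_N ) \max_r a_r + (n+1) m \max_r b_r \Big)
        \le \frac{e}{e-1} \Big( m \, n^{c} \max_r a_r + (n+1) m \max_r b_r \Big),
    \end{align*}
    which is polynomial in $n$ and $m$ whenever $\max_r a_r$ and $\max_r b_r$ are polynomial in $n$.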
%
%    \begin{flalign*}
%        &\text{Number of iterations} &\\
%        &\le \frac{e}{e-1} \Big( m \exp(\mathbf{z} \cdot \mathbf{d}_N )  \max_r a_r  + (n+1)m  \max_r b_r \Big).&
%    \end{flalign*}
    %Therefore, the statement holds.
\end{proof}

%-time algorithm based on best response dynamics.
%\hau{algorithm 3 compute an approximate PNSE in any congestion games; maybe look at using approximation bounds; or look at a recent paper Computing Approximate Equilibria in Weighted Congestion Games via Best-Responses}
\end{appendixonly}

\begin{paperonly}
    Since the cost function is exponential and an exponential term appears directly in the potential function, it is not surprising that in the above result, we need $[\mathbf{z} \cdot \mathbf{d}_N]$ to be $\mathcal{O}(\log n)$ for polynomial running time.
\end{paperonly}

\iffalse
\begin{figure}[htp]
    \centering
    \includegraphics[width=8cm]{charts/regularm2k3.pdf}
    \includegraphics[width=8cm]{charts/asymptoticm2k3.pdf}
    \includegraphics[width=8cm]{charts/regularm4k2.pdf}
    \includegraphics[width=8cm]{charts/asymptoticm4k2.pdf}
    \caption{The charts display runtime in seconds as $n$ increases for a parallel link multi-dimensional congestion game.
    The top two charts show a model where $m = 2$ and $k = 3$.
    The bottom two charts show a model where $m = 4$ and $k = 2$.
    The algorithms are brute force (BF), set-based dynamic program (SDP), table-based dynamic program (TDP), and asymptotic approximation for table-based dynamic program (TDPA).}
    \mylabel{fig:charts}
\end{figure}
\fi 

\subsection*{Approximate PSNE for General Cost Functions}
\begin{paperonly}
Very recently, several algorithms for computing approximate PSNE (in the multiplicative sense) have appeared. For $\alpha \ge 1$, a strategy profile $\mathbf{s}^*$ is an $\alpha$-PSNE if, for every player $i$, $\pi_i(\mathbf{s}^*) \le \alpha \pi_i(s_i', \mathbf{s}^*_{-i})$ for all $s_i' \in S_i$. For polynomial cost functions of maximum degree $\delta$, an algorithm for computing a $(\delta+1)$-approximate PSNE has been given \citep{caragiannis2021approximate}. This result has been extended to an $n$-PSNE algorithm for monotonic costs \citep{christodoulou2023existence}. 
The idea is to relate the decrease in a player's cost under a unilateral deviation to the decrease in the social cost, so that the resulting dynamics reach a local minimum of the social cost.
%we have reached a solution. 
\end{paperonly}
%

\begin{paperonly}
We present an algorithm for computing an $(\alpha, \beta)$-PSNE under general cost functions, where $\alpha$ and $\beta$ are the multiplicative and additive approximation factors, respectively.
%XXXXXXXXXXXXX
%CHECK MATH FORMATTING
%Recall that $\mathbf{x}_r(\mathbf{s}) = \sum_{i \in N; r \in s_i} \mathbf{d}_i$ for any $\mathbf{s} \in S$. 
Let $\Delta_r \equiv \max\{ \max_{i \in N, \mathbf{s} \in S; r \in s_i } \big( c_r( \mathbf{x}_r(\mathbf{s}) - \mathbf{d}_i) - c_r( \mathbf{x}_r(\mathbf{s}) ) \big), 0\}$ be the largest amount by which the cost of resource $r$ can decrease when a single player's demand is added to it (and $0$ if no such decrease is possible). 
When the cost function $c_r$ is nondecreasing, $\Delta_r = 0$; otherwise, $\Delta_r$ can be positive. 
Let $\Delta_{\max} = \max_{r \in R} \Delta_r$.
%We obtain the following result generalizing the result in 
%\citep{christodoulou2023existence}.
The following result generalizes the result in 
\cite{christodoulou2023existence} by removing the monotonicity assumption on the cost function while retaining the non-negative cost assumption. 
\end{paperonly}

\begin{appendixonly}
    In this section, we remove the condition of monotonicity imposed in \citep{christodoulou2023existence} and give an $(\alpha, \beta)$-PSNE algorithm  for arbitrary cost functions. For this, we first define a term that bounds the degree of non-monotonicity of the congestion functions. 

%XXXXXXXXXXXXX
%CHECK MATH FORMATTING
Recall that $\mathbf{x}_r(\mathbf{s}) = \sum_{i \in N; r \in s_i} \mathbf{d}_i$ for any $\mathbf{s} \in S$. For each resource $r \in R$, let
\[
\Delta_r = \max\Big\{ \max_{i \in N,\, \mathbf{s} \in S;\, r \in s_i } \big( c_r( \mathbf{x}_r(\mathbf{s}) - \mathbf{d}_i) - c_r( \mathbf{x}_r(\mathbf{s}) ) \big),\; 0 \Big\}
\]
be the largest amount by which the cost of resource $r$ can decrease when a single player's demand is added to it (and $0$ if no such decrease is possible). 
When the cost function $c_r$ is nondecreasing, $\Delta_r = 0$; otherwise, $\Delta_r$ can be positive. 
Let $\Delta_{\max} = \max_{r \in R} \Delta_r$.
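
To illustrate the definition, consider a hypothetical instance (chosen here only for illustration) with $k = 1$ and three players with unit demands, each of whom can choose whether or not to use a resource $r$ with $c_r(0) = 0$, $c_r(1) = 5$, $c_r(2) = 3$, and $c_r(3) = 4$. The possible loads on $r$ seen by a player using it are $1$, $2$, and $3$, so
\begin{align*}
\Delta_r &= \max\{ c_r(0) - c_r(1),\, c_r(1) - c_r(2),\, c_r(2) - c_r(3),\, 0 \} \\
&= \max\{ -5,\, 2,\, -1,\, 0 \} = 2.
\end{align*}
The only decrease occurs when a second player joins $r$, which lowers the cost of $r$ from $5$ to $3$.
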
We obtain the following result that generalizes the result in 
\citep{christodoulou2023existence}.
\end{appendixonly}

\begin{theorem}
\mylabel{thm:approx}
Every $k$-DCG has an $(\alpha, \beta)$-PSNE for $\alpha = n$ and $\beta = (n-1)m\Delta_{\max}$. Furthermore, %an $(\alpha, \beta)$-PSNE 
it can be computed using an iterative algorithm that is guaranteed to converge.
%Furthermore, $(\alpha, \beta)$ iterative best response (Algorithm \ref{}) converges to an $(\alpha, \beta)$-approximate PSNE in finite time. 
\end{theorem}
% \begin{paperonly}
% \begin{proofsketch}
% With $\Pi(\mathbf{s}) = \sum_{i \in N} \pi_i(\mathbf{s})$, we derive the following. % using the idea in \citep{christodoulou2023existence}.
% \begin{flalign*}
% &\Pi(s'_i, \mathbf{s}_{-i}) - \Pi(s_i, \mathbf{s}_{-i}) \le \\
% &n\pi_i (s_i', \mathbf{s}_{-i}) - \pi_i(s_i, \mathbf{s}_{-i}) +  (n-1)m\Delta_{\max}. 
% \end{flalign*} 
% The social cost $\Pi$ has a local minimum, and at any local minimum of $\Pi$, we get an $(\alpha, \beta)$-PSNE for $\alpha = n$ and $\beta = (n-1)m\Delta_{\max}$. We can compute it using an iterative procedure where at each round, if $\pi_i(s_i, \mathbf{s}_{-i}) > n\pi_i (s_i', \mathbf{s}_{-i}) + (n-1)m\Delta_{\max}$ holds for any player $i$ currently playing $s_i$, the player deviates to $s_i'$. 
% \end{proofsketch}
% %As the set of strategy profiles is finite, we eventually reach an $(\alpha,\beta)$-PSNE.
% %either the players cannot improve or reach a total cost function minimizing strategy profile.  
% %\end{proof}
% \end{paperonly}
\begin{appendixonly}
\begin{proof}
Following the idea from \cite{christodoulou2023existence}, we first bound the change in the other players' costs when a player $i$ changes its strategy. 
For any strategy profile $\mathbf{s} = (s_1, \ldots, s_n) \in S$, any strategy $s_i' \in S_i$ with $s_i' \neq s_i$, and any player $l \in N$ with $l \neq i$, we have that  
\begin{align*}
&\pi_l (s_i', \mathbf{s}_{-i}) - \pi_l (s_i, \mathbf{s}_{-i}) \\
&= \sum_{r \in s_l} c_r(\mathbf{x}_r(s'_i, \mathbf{s}_{-i})) - \sum_{r \in s_l} c_r(\mathbf{x}_r(s_i, \mathbf{s}_{-i})) \\
&= \sum_{r \in s_l \cap (s'_i \setminus s_i)} \big[ c_r(\mathbf{x}_r(s'_i, \mathbf{s}_{-i})) - c_r(\mathbf{x}_r(s_i, \mathbf{s}_{-i})) \big] \\ 
&\qquad + \sum_{r \in s_l \cap (s_i \setminus s'_i)} \big[ c_r(\mathbf{x}_r(s'_i, \mathbf{s}_{-i})) - c_r(\mathbf{x}_r(s_i, \mathbf{s}_{-i})) \big] \\
&\le \sum_{r \in s_l \cap (s'_i \setminus s_i)} \big[ c_r(\mathbf{x}_r(s'_i, \mathbf{s}_{-i})) - c_r(\mathbf{x}_r(s_i, \mathbf{s}_{-i})) \big] \\
&\qquad + \Delta_{\max}|s_l \cap (s_i \setminus s'_i)| \\
&\le \sum_{r \in s'_i } c_r(\mathbf{x}_r(s'_i, \mathbf{s}_{-i})) + m\Delta_{\max} \\
&= \pi_i (s_i', \mathbf{s}_{-i}) + m\Delta_{\max}.
\end{align*}

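Above, the first equality is by the definition of the player cost functions. The second equality removes the resources whose load is unaffected by the deviation and splits the remaining ones into those whose load increases ($r \in s'_i \setminus s_i$) and those whose load decreases ($r \in s_i \setminus s'_i$). The first inequality holds because each term of the second summation is at most $\Delta_{\max}$. The second inequality drops the subtracted terms and extends the sum to all of $s'_i$, both of which are valid because costs are non-negative, and uses $|s_l \cap (s_i \setminus s'_i)| \le m$. The final equality is again the definition of $\pi_i$. 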

Define $\Pi(\mathbf{s}) = \sum_{i \in N} \pi_i(\mathbf{s})$ to be the social cost of the players under $\mathbf{s}$. 
Summing the inequality above over all players $l \neq i$, we obtain 
\begin{flalign*}
&\sum_{l \in N,\, l \neq i} \Big( \pi_l (s_i', \mathbf{s}_{-i}) - \pi_l (s_i, \mathbf{s}_{-i}) \Big) &\\
& \qquad \qquad \qquad \qquad \le (n-1) [\pi_i (s_i', \mathbf{s}_{-i}) + m\Delta_{\max}]. \\
\intertext{Since $\Pi - \pi_i$ is the total cost of the players other than $i$, this is equivalent to}
&(\Pi(s'_i, \mathbf{s}_{-i}) - \pi_i(s'_i, \mathbf{s}_{-i})) - (\Pi(s_i, \mathbf{s}_{-i}) - \pi_i(s_i, \mathbf{s}_{-i})) \\ 
& \qquad \qquad \qquad \qquad \le (n-1) [\pi_i (s_i', \mathbf{s}_{-i}) + m\Delta_{\max}], \\ 
\intertext{and rearranging gives}
&\Pi(s'_i, \mathbf{s}_{-i}) - \Pi(s_i, \mathbf{s}_{-i}) \le n\pi_i (s_i', \mathbf{s}_{-i}) - \pi_i(s_i, \mathbf{s}_{-i}) + &\\
& \qquad \qquad \qquad \qquad  \qquad \qquad \qquad \qquad (n-1)m\Delta_{\max}. 
\end{flalign*} 
If $\pi_i(s_i, \mathbf{s}_{-i}) > n\pi_i (s_i', \mathbf{s}_{-i}) + (n-1)m\Delta_{\max}$, then the right-hand side above is negative, so the social cost strictly decreases when player $i$ deviates to $s'_i$. 
Because $S$ is finite, the social cost attains a minimum, and it follows that any $\mathbf{s}^{opt} \in S$ that minimizes $\Pi$ is an $(\alpha, \beta)$-approximate PSNE for $\alpha = n$ and $\beta = (n-1)m\Delta_{\max}$. 

We can compute an $(\alpha,\beta)$-PSNE using an iterative procedure where, at each round, if $\pi_i(s_i, \mathbf{s}_{-i}) > n\pi_i (s_i', \mathbf{s}_{-i}) + (n-1)m\Delta_{\max}$ holds for some player $i$ currently playing $s_i$ and some strategy $s_i' \in S_i$, the player deviates to $s_i'$. Each such deviation strictly decreases the social cost, so no strategy profile is visited twice. As the set of strategy profiles is finite, we eventually reach an $(\alpha,\beta)$-PSNE.
%either the players cannot improve or reach a total cost function minimizing strategy profile.  
\end{proof}
\end{appendixonly}
\begin{paperonly}
%We can compute an $(\alpha,\beta)$-PSNE using an iterative procedure where 
In the iterative algorithm of Theorem~\ref{thm:approx}, 
at each round, if $\pi_i(s_i, \mathbf{s}_{-i}) > n\pi_i (s_i', \mathbf{s}_{-i}) + (n-1)m\Delta_{\max}$ for some player $i$ currently playing $s_i$ and some strategy $s_i' \in S_i$, player $i$ deviates to $s_i'$. Each such deviation strictly decreases the social cost, so no strategy profile is visited twice; as the set of strategy profiles is finite, we eventually reach an $(\alpha,\beta)$-PSNE. The result is especially useful when $\Delta_{\max}$ is small (e.g., when nondecreasing cost functions are perturbed by small noise).
\end{paperonly}
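
As a small numerical illustration of this deviation rule (the numbers are hypothetical and chosen only for illustration), take $n = 3$, $m = 2$, and $\Delta_{\max} = 2$, so that $\alpha = 3$ and
\[
\beta = (n-1)\, m\, \Delta_{\max} = 2 \cdot 2 \cdot 2 = 8 .
\]
A player $i$ with current cost $\pi_i(s_i, \mathbf{s}_{-i}) = 30$ deviates to a strategy $s_i'$ with $\pi_i(s_i', \mathbf{s}_{-i}) = 7$, because $30 > 3 \cdot 7 + 8 = 29$. The same player does not deviate to a strategy with cost $8$, because $30 \le 3 \cdot 8 + 8 = 32$, even though that deviation would substantially reduce the player's own cost.
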
\iffalse
\section{Experiments} 
\mylabel{sec:experiments}

The algorithm given in theorem \ref{thm:exact_enum} is theoretically efficient under some assumptions. Here, we show that it is practically efficient.
To our knowledge, brute force is the only other algorithm guaranteed to work on games of interest: multi-dimensional congestion games with non-monotonic cost functions.
First, we compare two implementations of our algorithm against brute force.
Our algorithm overtakes brute-force at a relatively small value of $n$.
Second, we compare the implementations against the simulated worst-case complexity of the algorithm: $\mathcal{O}((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km}))$.
This shows that, in practice, our algorithm greatly outperforms its asymptotic behavior.

All algorithms were implemented in Python. 
Source code and data can be found in the supplementary material.
Results were obtained on a Linux machine with an Intel\textregistered\: Xeon\textregistered\: E3-1225 @ 3.1 GHz and 24GB of RAM.

\subsection*{Game Generation}


The evaluation was done on a $k$-dimensional parallel link model with $m$ links/resources.  %(Figure \ref{fig:parallel_link}).
Every player chooses one link $r \in R$ from the set of all $m$ links.
Each link $r \in R$ had a non-monotonic cost function of $c_r(\mathbf{x}_r(\textbf{s})) = \alpha_r f_r(\mathbf{x}_r(\textbf{s})) + \beta_r$.
Where $\alpha_r$ and $\beta_r$ are integers drawn uniformly randomly from the range $[0, 100]$.
The non-monotonic component is $f_r(\mathbf{x}_r(\textbf{s})) = f_r^1(\mathbf{x}_r^1(\textbf{s})) + f_r^2(\mathbf{x}_r^2(\textbf{s})) + \cdots + f_r^k(\mathbf{x}_r^k(\textbf{s}))$, where $\mathbf{x}_r^j(\textbf{s})$ is the aggregate demand in the $j$th dimension and $f_r^j$ is the cost of the aggregate demand in the $j$th dimension.
The cost of $f_r^j$ for any given input is an integer drawn uniformly randomly from the range $[0, 100]$.
Every element of the player demand vector $d_{ij}$ was an integer drawn uniformly randomly from the range $[0, q]$.
If every element of the demand vector was 0 then the entire demand vector was discarded and randomly generated again.
For each combination of parameters ($m, k, q$), 15 games were randomly generated and then $n$ players were randomly generated, all using the master seed 2024.
%For example, the 3rd game where $m = 4, k = 2, q = 5, n = 8$ is identical to the 3rd game where $m = 4, k = 2, q = 5, n = 9$ except for the 9th player.

\subsection*{Methods}
The dynamic program was implemented in two ways.
The first method is as described in section \ref{sec:general}.
The second method exploits the sparsity of 1's in the binary table, by replacing the binary table with a hashset.
Both implementations contain the optimization where if a single player is found to have no best response for a configuration (Procedure 1, section \ref{sec:general}) then the algorithm will stop computations on that configuration.
Likewise the brute force implementation has the optimization where as soon as a single player is found who is willing to deviate from a strategy profile then computations for that strategy profile will stop.
For each $n$ the time to enumerate all configurations or strategy profiles (respectively) was measured and averaged across each of the 15 games.
If the average time was less than 10 minutes and $n < 14$ then the 15 games were re-run with $n + 1$ players.
The binary table based dynamic program had the additional constraint that if a level of the binary table consumed more than 1 GB of memory for a single game then execution for that parameter combination would be halted.


In order to chart the asymptotic behavior of the brute force algorithm and the binary table algorithm we had to ensure that these algorithms ran at their big-O speed not faster.
First, all mentioned optimizations were removed.
Furthermore, because the asymptotic behavior of the algorithm in theorem \ref{thm:exact_enum} is based on the size of the binary table, all bits of the binary table were set to 1 for measuring asymptotic behavior.

At each $n$, the average time to check if a strategy profile is an NE was measured.

In order to approximate the speed of the asymptotic brute force algorithm at a large $n$ the average time to check if a strategy profile is a NE was measured.
For any given $n$ (and a combination of other parameters) the average time was multiplied by $m^n$.
To approximate the speed of the asymptotic binary table algorithm at a large $n$ the average time to check if a configuration contains a NE was measured separately for both procedure 1 $z_1$ and procedure 2 $z_2$.
This was done because of memory and time constraints related to binary table size, which only affected procedure 2.
The binary table size was forced to 1000 for each $n$.
The average time was multiplied by $(nq)^{km} (z_1 + \frac{z_2 (nq + 1)^{km}}{1000})$ to approximate the asymptotic runtime.

\fi 

\section{Structured Costs and Demands}
\mylabel{sec:structured}
%So far, we have not considered any particular structural information regarding the cost functions (e.g., how the costs of resources compare with each other) or demand vectors (e.g., whether the players can be ordered by their demand vectors). In this section, we explore computational questions for several structured variants of $k$-DCGs, roughly in the order of more restrictive settings to less.

Our study of structured costs and demands is motivated by a variety of realistic examples of traffic congestion games, where resources represent roads. 
As an example of structured/ordered demands, vehicles can be ordered by their demand vectors representing width, length, weight, etc. (e.g., semis, pickup trucks, SUVs, sedans, and so on). 
A common example of a nondecreasing cost function is that more vehicles on a road mean higher costs for everyone. Singleton strategies are seen in grid-patterned road networks with parallel roads to go from source to destination \citep{milchtaich_equilibrium_2006}.
% {10.1007/11944874_9}
%Each vehicle selects one such road. 
We also consider structured cost functions; for example, different types of roads (highways, county routes, local roads, etc.) have different speed limits.

\subsection*{Ordered Demand, Nondecreasing Cost, and Singleton Strategies}
Suppose that the players can be ordered according to their demand vectors, where $\mathbf{u} \ge \mathbf{v}$ means $u_j \ge v_j$ for $j = 1, \ldots, k$; that is, w.l.o.g., $\mathbf{d}_1 \ge \mathbf{d}_2 \ge \cdots \ge \mathbf{d}_n$.
%\footnote{%For two $k$-dimensional vectors $\mathbf{u}$ and $\mathbf{v}$, we say 
%$\mathbf{u} \ge \mathbf{v}$ if and only if $u_j \ge v_j$ for $j = 1, ..., k$.} 
Let each player $i$'s strategy set consist of \emph{singleton} strategies: $S_i = \{ \{r\}\ |\ r \in R \}$.
In addition, assume that the cost functions are nondecreasing. 
%
%xxxxxx include paralle links?
%It is not hard to reduce this setting to a 2-node network congestion game with parallel links \citep{milchtaich_congestion_1996,milchtaich_equilibrium_2006}.
%and that the resources are ordered by their cost functions. That is, w.l.o.g., $c_1(\mathbf{x}) \ge c_2(\mathbf{x}) \ge ... \ge c_m(\mathbf{x})$ for any aggregate demand vector $\mathbf{x}$. 
%In this setting, 
We can compute a PSNE using the greedy best response algorithm, which orders the players from high to low demand and lets them play their best response in that order \citep{milchtaich_equilibrium_2006}. Details are in the Appendix.

\begin{theorem}
    For a $k$-DCG with ordered demand vectors, nondecreasing cost functions, and singleton-resource strategies, a PSNE can be computed in $\mathcal{O}(n \log n + nmk)$ time.
    \mylabel{thm:greedy_br_1}
\end{theorem}
\begin{appendixonly}
\begin{proof}
    We sort the players and iterate through them in the order of high to low demand vectors: $1, 2, \ldots, n$ (w.l.o.g.). Sorting takes $\mathcal{O}(n \log n)$ time. At each iteration, the current player $j$ chooses a best-response strategy with respect to the choices of the previous players. No previous player $i$ has an incentive to deviate afterwards, because $\mathbf{d}_i \ge \mathbf{d}_j$ and the cost functions are nondecreasing: if a previous player $i$ could benefit from deviating to $r$, the current player $j$ would have chosen $r$. By keeping track of the aggregate demand vector of each resource, each of the $n$ players evaluates its $m$ candidate resources on $k$-dimensional aggregate demand vectors, which accounts for the $\mathcal{O}(nmk)$ term.
\end{proof}
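
The following small instance, constructed here purely for illustration, traces the greedy procedure. Let $k = 2$, $m = 2$, and $n = 3$ with $\mathbf{d}_1 = (2,1) \ge \mathbf{d}_2 = (1,1) \ge \mathbf{d}_3 = (1,0)$, and let both resources have the nondecreasing cost $c_r(\mathbf{x}) = x_1 + x_2$. We assume, for this example only, that ties are broken in favor of the lower-indexed resource. Player 1 faces cost $3$ on either resource and picks resource 1. Player 2 compares $c_1((3,2)) = 5$ with $c_2((1,1)) = 2$ and picks resource 2. Player 3 compares $c_1((3,1)) = 4$ with $c_2((2,1)) = 3$ and picks resource 2. The final aggregate demand is $(2,1)$ on each resource, so every player pays $3$. A unilateral deviation would cost
\[
c_2((4,2)) = 6 \text{ for player 1}, \quad c_1((3,2)) = 5 \text{ for player 2}, \quad c_1((3,1)) = 4 \text{ for player 3},
\]
each of which exceeds $3$, so the resulting profile is a PSNE.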
\end{appendixonly}


\subsection*{Ordered Demand, Nondecreasing Cost, and Shared Strategies}
%In this setting, 
We relax the assumption of singleton-resource strategies. We show that as long as the players have the same set of strategies, we can compute a PSNE efficiently using the greedy best response algorithm. % outlined in Theorem~\ref{thm:greedy_br_1}.

\begin{theorem}
For a $k$-DCG with ordered demand vectors, nondecreasing cost functions, and a shared set of strategies of size $p$, a PSNE can be computed in $\mathcal{O}(n \log n + npmk)$ time.
\end{theorem}

\begin{appendixonly}
\begin{proofsketch}
    The proof of Theorem~\ref{thm:greedy_br_1} extends from singleton resources to sets of resources because the cost functions are additive over the resources.
\end{proofsketch}
\end{appendixonly}



\subsection*{Structured Cost Functions and Singleton Strategies}
In this scenario, we do not assume any ordering among the demands of the players. Instead, we assume that the cost functions are nondecreasing and that the resources are ordered by their cost functions. That is, w.l.o.g., $c_1(\mathbf{x}) \ge c_2(\mathbf{x}) \ge ... \ge c_m(\mathbf{x})$ for any  $\mathbf{x}$. We also assume that there are constants $\alpha_j \ge 1$ such that $c_{j-1}(\mathbf{x}) = \alpha_j c_{j}(\mathbf{x})$ for any resource $j > 1$ and $\mathbf{x}$. These assumptions mean that some resources are more costly than others and that the costs of the resources are ``nicely separated.'' Finally, we assume singleton-resource strategies. We get the following result.

\begin{theorem}
    For a $k$-DCG with nondecreasing and structured cost functions, where there are constants $\alpha_j \ge 1$ such that $c_{j-1}(\mathbf{x}) = \alpha_j c_{j}(\mathbf{x})$ for any resource $j > 1$ and aggregate demand vector $\mathbf{x}$, and singleton-resource strategies, a PSNE can be computed in $\mathcal{O}(n \log n + nmk)$ time.
\end{theorem}

\begin{appendixonly}
\begin{proofsketch}
We can compute a PSNE in such $k$-DCGs using the greedy best response algorithm. 
We first order the players according to the cost $c_1$ of their demand vectors. W.l.o.g., let $c_1(\mathbf{d}_1) \ge c_1(\mathbf{d}_2) \ge \cdots \ge c_1(\mathbf{d}_n)$. Note that we are \emph{not} assuming $\mathbf{d}_1 \ge \mathbf{d}_2 \ge \cdots \ge \mathbf{d}_n$. In fact, the demand vectors may not be comparable at all. We next prove by induction that the same ordering of players $(1, 2, \ldots, n)$ applies to the cost function of every resource. Suppose this is true for resource $j-1$. We show it to be true for resource $j$. Consider any two consecutive players $i-1$ and $i$. By the induction hypothesis, $c_{j-1}(\mathbf{d}_{i-1}) \ge c_{j-1}(\mathbf{d}_{i})$, and by the assumption on the cost functions, $c_{j-1}(\mathbf{d}_{i-1}) = \alpha_j c_{j}(\mathbf{d}_{i-1})$ and $c_{j-1}(\mathbf{d}_{i}) = \alpha_j c_{j}(\mathbf{d}_{i})$. Dividing by $\alpha_j \ge 1$, we get $c_{j}(\mathbf{d}_{i-1}) \ge c_{j}(\mathbf{d}_{i})$. 

Therefore, even though we cannot order the demand vectors intrinsically, we are able to order them w.r.t. the cost functions, which is all that matters for greedy best response.
\end{proofsketch}
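
As a hypothetical illustration of this induced ordering, let $k = 2$, $m = 2$, $\mathbf{d}_1 = (0,2)$, and $\mathbf{d}_2 = (3,0)$, which are not comparable componentwise. Take $c_2(\mathbf{x}) = x_1 + 2x_2$ and $c_1(\mathbf{x}) = \alpha_2 c_2(\mathbf{x})$ with $\alpha_2 = 2$ (all values are assumptions made for this example). Then
\begin{align*}
c_1(\mathbf{d}_1) &= 8 \ge c_1(\mathbf{d}_2) = 6, \\
c_2(\mathbf{d}_1) &= 4 \ge c_2(\mathbf{d}_2) = 3,
\end{align*}
so both resources order player 1 before player 2.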
\end{appendixonly}




% 
% 
%As a starter,
% 
% Special cost function:
% 
% - Cost function cares about one dimension
% 
% - Can assign players to resources to escape cost
% 
% - Can compute NE easily.
% 
% 
%More general version
% 
% - demand vectors of players: ordered
% - cost function: monotonic
% - Cost of the resources are ordered (some resources are more costly than others  -- NEED IT?)
% - $S_i$: singleton resources (parallel links in network CG)
% 
% Sequential BR algorithm: Allocate players (in the order of high to low demand vectors) according to sequential best response. (ref. Schelling game)
% 
% Proof: Show nobody has incentive to deviate.





%xxxxxx include this?
%Note that the assumption of $c_{j-1}(\mathbf{x}) = \alpha_j c_{j}(\mathbf{x})$ can be further relaxed by letting the multiplicative factor $\alpha_j$ grow as we go from $\mathbf{d}_1$ to $\mathbf{d}_n$. 



% Relaxing the singleton resources:

% -----------------------------------

% Assume: all players have the same set of strategies.

% Sequential BR works. Let player i's demand >= player j's demand. If player j selects the same set of resources as player i, there's no incentive for player i to deviate to $s_i'$ because in that case player j would've chosen $s_i'$.
 
% Look into k-Multiclass CG:
% Assume only 2 nonzero elements in each demand vector (only 1 would have been k-CCG). mC2 different classes of players. Each class- players have the same nonzero element indices. 


\begin{paperonly}
\section{Conclusion}
We have conducted a thorough computational study of $k$-DCGs and their variants using two different computational methods: CSP and learning dynamics. These two computational approaches are driven by whether or not a PSNE is guaranteed to exist in a class of $k$-DCGs.
We prove the hardness of some very special cases and give polynomial-time algorithms for various problems under certain assumptions. Our CSP-based framework is applicable to general (potentially non-monotonic) cost functions for $k$-DCGs and their variants. We also give pseudo-polynomial time algorithms based on learning dynamics for linear and exponential cost functions. We extend the learning dynamics approach to the study of approximation algorithms for general cost functions and exact algorithms for various types of structured demands and costs.

In particular, our CSP framework, which has not been studied before within the extremely rich congestion games literature, holds promise for future research within and outside of congestion games. We are particularly interested in designing and implementing CSP-inspired search algorithms for network congestion games, such as backjumping (Gaschnig, graph-based, conflict-directed, etc.) and learning algorithms \citep{dechter2003constraint}, backtracking with tree decomposition \citep{jegou2003hybrid}, AND/OR search algorithms \citep{marinescu2009and}, etc. We are also interested in exploring some of the widely used solvers because, to our knowledge, large-scale experimental work on congestion games is yet to be done. Beyond the realm of congestion games, our key insight of decoupling players' strategies may have applications in many other game-theoretic problems. 
\end{paperonly}


\begin{paperonly}
    \begin{acknowledgements}
        We thank the reviewers for their kind words and many helpful suggestions. MTI is grateful to the National Science Foundation for support from Award IIS-1910203. HC is supported by the National Institute of General Medical Sciences of the National Institutes of Health (P20GM130461), the Rural Drug Addiction Research Center at the University of Nebraska-Lincoln, and the National Science Foundation under grant IIS-2302999. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
    \end{acknowledgements}
\end{paperonly}

%for searching the configuration space. Advances in AI search \citep{russell2010artificial} give further credence to this promise.  

\iffalse
\begin{paperonly}
\section{Discussion and Outlook}

To conclude, we have conducted a thorough computational study of $k$-DCGs and their variants. %We have proved the hardness of some very special cases and given polynomial-time algorithms for various problems. 
In particular, our CSP framework, which was not studied before within the extremely rich literature on congestion games, holds promise for future research. We are particularly interested in designing CSP-inspired search algorithms \citep{dechter2003constraint} for searching the configuration space. Advances in AI search \citep{russell2010artificial} give further credence to this promise.  
\end{paperonly}
\fi 

\iffalse
%The asymptotic runtime of the dynamic program presented in section \ref{sec:general} is polynomial, $\mathcal{O}((w_{\max})^{km}(nkp^2m^2 + nkmp(w_{\max})^{km}))$ albeit a large polynomial.
%However, this does not mean that the algorithm is not useful in practice.
%To demonstrate the practically of the algorithm we programmed it in Python.
%We also programmed a variant of the algorithm that replaces the binary table from procedure 2 with a hashset.
%Finally we programmed approximation code which normalizes the asymptotic complexity to the machine\footnote{Linux machine with an Intel\textregistered\: Xeon\textregistered\: E3-1225 @ 3.1 GHz and 24GB of RAM} the experiments were run on.


%Our goal was to see how quickly the algorithms could search the entire game for a NE, so we programmed the algorithms to continue execution even after a NE is found.
%This allowed us to examine runtime in the worst case (\textit{i.e.} How long it would take to prove the absence of a NE.).
%The charts in figure \ref{fig:charts} show the time taken to search the entire game.

The charts in figure \ref{fig:charts} convey several things:
1) The algorithm is fast enough in practice to search the entire space of some congestion games; not just find a single NE.
2) In the parallel link model the algorithm vastly outperforms its asymptotic runtime.
Likely because of the relative sparsity of 1's in the binary table.
3) The algorithm serves as a framework for future optimizations.
A single change in data structure results in substantial runtime improvement (\textit{e.g.} the top/bottom chart at $n = ?$ shows an order of magnitude runtime difference between the table-based and hashset-based algorithms).
We leave further optimizations to future work.

Finally the authors know of no other algorithm, besides brute force, that can compute NE for weighted congestion games with multi-dimensional demand vectors or non-monotonic cost functions.
Our algorithm does both simultaneously.
\fi 


\iffalse
\section{Dimensionality Reduction for General Cost Functions}
As discussed in Sections~\ref{sec:linear} and \ref{sec:exp}, there exist isomorphism results between $k$-DCGs and 1-DCGs when the cost function is either linear or exponential \citep{klimm_equilibria_2022}. Here, we present a technique to reduce any $k$-DCG to an equivalent (i.e., having the same set of PSNE) $l$-DCG for $1 \le l < k$. 

Given a $k$-DCG with $n$ players, as defined in Section~\ref{sec:prelim}, we first divide the $n$ players into $l$ buckets arbitrarily but as evenly as possible. This ensures that each bucket contains at most $\lceil \frac{n}{l} \rceil$ players. We store the players in a bucket in an ordered fashion (e.g., ordered by the player number). We then assign any player $i$ in any bucket $j$ an $l$-dimensional demand vector where all the elements, except the $j$-th element, are $0$. The $j$-th element is assigned $2^{\text{I}_j(i)}$, where $\text{I}_j(i)$ is player $i$'s index in bucket $j$ (indexing starts at 0).

%Define the weight of player $i$ to be $2^{i-1}$, for $1 \le i \le n$. 

Given a strategy profile $\mathbf{s}$, we construct an $l$-dimensional aggregated demand vector $\mathbf{x}_r(\mathbf{s})$ by summing up the $l$-dimensional demand vectors of the players $i$ such that $r \in s_i$.

We now define the cost function for $l$-DCG, which maps an $l$-dimensional aggregated demand vector $\mathbf{x}_r(\mathbf{s})$ for a resource to a real number. The key observation here is that instead of explicitly storing the value of the cost function for each possible $l$-dimensional aggregated demand vector, we can use the cost function of the $k$-DCG to compute this. For this, given any $l$-dimensional aggregated demand vector $\mathbf{x}_r(\mathbf{s})$, we consider each element $(\mathbf{x}_r(\mathbf{s}))_j$, which corresponds to bucket $j$. We then extract the players in that bucket that have $r$ in their strategy by considering the binary representation of $(\mathbf{x}_r(\mathbf{s}))_j$. Once we extract the set of players using $r$, we aggregate their $k$-dimensional demand vectors and use it as the input to the $k$-dimensional cost function to obtain the desired cost.
\fi

\iffalse
\section{Conclusion}
In this paper, we have studied the computational complexities of $k$-DCGs and their variants. We have also designed algorithms for three types of cost functions: (1) general cost function (potentially non-monotonic), for which we have given a CSP framework for congestion games, (2) linear, and (3) exponential cost functions. For the latter two, we have given potential function-based algorithms. We have also studied approximate PSNE for general cost functions. There are multiple exciting avenues for the future, including complexity characterizations for $k$-DCGs, exploring the idea of isomorphism for different types of congestion games for the purpose of algorithm design, and improving the running times of the known algorithms.
\fi
