\section{Univariate Marginal Distribution Algorithm}\label{app:umda}

The Univariate Marginal Distribution Algorithm (UMDA) was proposed by Pelikan and M\"{u}hlenbein in 1999~\citep{umda} and is one of the best-known estimation of distribution algorithms (EDAs)~\citep{ceberio2012review}. The UMDA is a population-based algorithm: throughout the optimization process
it maintains a set of solutions (the population) and aims to improve their quality. Specifically, given a population of integer-valued vectors $x$ of length $n$, at every iteration a set of high-quality solutions $\mathcal{X}=\{x^1,x^2,\ldots,x^m\}$ is selected. Normally, this is done by truncation: the population is sorted by objective function value and a fixed percentage of the best solutions is kept. Once $\mathcal{X}$ is selected, the univariate marginal frequencies of its solutions are calculated by counting the number of times each item appears in each position across the solutions $x\in\mathcal{X}$. Under this model, the probability of a solution $x$ is calculated as
\begin{equation}
    P(x) = \prod_{i=1}^{n}  p(x(i)=j)
    \label{eq:umda}
\end{equation}
where $j$ denotes the item in the \mbox{$i$-th} position of $x$, and \mbox{$p(x(i)=j)$} is the probability of $j$ appearing in the \mbox{$i$-th} position of $x$. This is known as the first-order marginal probability.
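
As an illustrative sketch (the numbers are hypothetical and not taken from any experiment), suppose that $n=3$ and that the selected set contains $m=4$ vectors, $\mathcal{X}=\{(0,1,0),\,(0,2,1),\,(1,1,0),\,(0,1,1)\}$. Counting how often each item appears in each position and dividing by $m$ gives the marginals
\[
\begin{array}{c|ccc}
\text{item } j & \text{first position} & \text{second position} & \text{third position}\\ \hline
0 & 3/4 & 0 & 1/2\\
1 & 1/4 & 3/4 & 1/2\\
2 & 0 & 1/4 & 0
\end{array}
\]
Multiplying one entry per position, as in Eq.~\ref{eq:umda}, gives $P\big((0,1,1)\big)=\tfrac{3}{4}\cdot\tfrac{3}{4}\cdot\tfrac{1}{2}=\tfrac{9}{32}$. The vector $(1,2,0)$, which is not in $\mathcal{X}$, still receives probability $\tfrac{1}{4}\cdot\tfrac{1}{4}\cdot\tfrac{1}{2}=\tfrac{1}{32}$, because the model treats the positions as independent.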

Once the parameters of the UMDA are estimated, the next step consists of sampling solutions from this distribution, creating a new population, and returning to the selection step. The classical UMDA samples a new vector $x$ by randomly choosing $x(i)$ with probabilities $p(x(i))$ for each position $1\leq i\leq n$.
Note that in the case of permutations, this sampling method does not guarantee that the sampled vector is a permutation. Consequently, a common approach is to set the probability of choosing an already sampled item to zero and to renormalize the remaining probabilities. Although this guarantees that every sample is a valid permutation, the distribution actually sampled is no longer the estimated one; in fact, it is unknown. However, since we represent solutions as inversion vectors, which are not subject to the mutual exclusivity constraint, sampling can be carried out with the classical approach without modification.
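
To illustrate the renormalization step used for permutations (a step that is not needed for inversion vectors), consider a hypothetical position whose estimated marginals assign probabilities $0.5$, $0.3$, and $0.2$ to items $1$, $2$, and $3$, and assume that item $1$ has already been sampled in another position. Its probability is set to zero and the remaining two are rescaled by their sum, $0.3+0.2=0.5$:
\[
\big(0.5,\;0.3,\;0.2\big)\;\longrightarrow\;\Big(0,\;\tfrac{0.3}{0.5},\;\tfrac{0.2}{0.5}\Big)=\big(0,\;0.6,\;0.4\big).
\]
Since the rescaling depends on which items were drawn earlier, the probabilities actually used during sampling differ from the estimated marginals.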

\section{Permutation Problems}\label{app:problems}
This section describes the three combinatorial optimization problems considered: the Permutation Flowshop Scheduling Problem (PFSP), the Quadratic Assignment Problem (QAP), and the Linear Ordering Problem (LOP). These problems were selected because they are well known in the combinatorial optimization literature, are NP-hard, and have been studied for decades. Moreover, many other optimization problems are special cases of these.

\noindent\textbf{PFSP.} 
In this combinatorial problem, the goal is to schedule $n$ jobs on $m$ machines so as to minimize a specified performance criterion~\cite{pfsp}. Each job must proceed through each machine without interruption, and all jobs are available at time zero. A job can be processed on machine $j$ only if its operation on machine $j-1$ is complete and machine $j$ is available. Solutions are represented as sequences of length $n$, where $\sigma(i)$ denotes the job scheduled in position $i$. The processing time of job $i$ on machine $j$ is given by the entry $p_{ij}$ of the matrix $\mathbb{P}=[p_{ij}]_{n\times m}$.
Although different objective functions exist for the PFSP, in this work, the time that it takes to finish all the jobs, known as the makespan, is minimized. Given a sequence of jobs $\sigma$, the makespan is defined as,
\begin{equation}\label{eq:pfsp-objective}
    f(\sigma) = c_{\sigma(n), m}
\end{equation}
that is, the completion time of the last job in the sequence, $\sigma(n)$, on the last machine, $m$. It is calculated recursively as
\begin{equation}\label{eq:pfsp-makespan}
\resizebox{0.65\hsize}{!}{$
        c_{\sigma(i), j}= 
    \begin{cases} 
        p_{\sigma(i),j} & i=j=1\\
        p_{\sigma(i),j}+c_{\sigma({i-1}),j} & i > 1, j = 1\\
        p_{\sigma(i),j}+c_{\sigma(i), j-1} & i = 1, j > 1\\
        p_{\sigma(i),j}+\max\{c_{\sigma({i-1}),j}, c_{\sigma(i), j-1} \} & i > 1, j > 1
   \end{cases}
$}
\end{equation}
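
As a small worked example (the instance is hypothetical and chosen only for illustration), take $n=3$ jobs and $m=2$ machines with processing times
\[
\mathbb{P}=\begin{pmatrix} 3 & 2\\ 1 & 4\\ 2 & 1 \end{pmatrix},
\]
and the sequence $\sigma=(2,1,3)$, reading $\sigma(i)$ as the job in position $i$, as the indices in Eq.~\ref{eq:pfsp-makespan} require. Applying the recursion position by position gives
\[
\begin{array}{ll}
c_{\sigma(1),1}=p_{2,1}=1, & c_{\sigma(1),2}=p_{2,2}+c_{\sigma(1),1}=4+1=5,\\[2pt]
c_{\sigma(2),1}=p_{1,1}+c_{\sigma(1),1}=3+1=4, & c_{\sigma(2),2}=p_{1,2}+\max\{5,4\}=2+5=7,\\[2pt]
c_{\sigma(3),1}=p_{3,1}+c_{\sigma(2),1}=2+4=6, & c_{\sigma(3),2}=p_{3,2}+\max\{7,6\}=1+7=8,
\end{array}
\]
so the makespan of this sequence is $f(\sigma)=c_{\sigma(3),2}=8$.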

\noindent \textbf{QAP.}
The QAP~\cite{qap} is an assignment problem where $n$ facilities have to be allocated to $n$ different locations on a map while minimizing the cost function given by Eq.~\ref{eq:qap}. 
For each pair of locations $i$ and $j$, $d_{i,j}$ is the distance between them, and each pair of facilities $k$ and $l$ is associated with a flow $h_{k,l}$. The distances between locations and the flows between facilities are represented by two real-valued matrices $D=[d_{i,j}]_{n\times n}$ and $H=[h_{k,l}]_{n\times n}$, respectively. The objective value of any solution (allocation) $\sigma$ is calculated as:
\begin{equation}\label{eq:qap}
    f(\sigma) = \sum_{i=1}^{n}\sum_{j=1}^{n} d_{i,j}h_{\sigma(i), \sigma(j)}
\end{equation}
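
For illustration, consider a hypothetical symmetric instance with $n=3$,
\[
D=\begin{pmatrix} 0 & 2 & 3\\ 2 & 0 & 1\\ 3 & 1 & 0 \end{pmatrix},
\qquad
H=\begin{pmatrix} 0 & 5 & 2\\ 5 & 0 & 3\\ 2 & 3 & 0 \end{pmatrix},
\]
and the allocation $\sigma=(2,3,1)$, where $\sigma(i)$ is read as the facility placed at location $i$, in line with the indices of Eq.~\ref{eq:qap}. The diagonal terms vanish and, by symmetry, each unordered pair of locations contributes twice, so
\[
f(\sigma)=2\,\big(d_{1,2}h_{2,3}+d_{1,3}h_{2,1}+d_{2,3}h_{3,1}\big)=2\,(2\cdot 3+3\cdot 5+1\cdot 2)=46.
\]
For comparison, the identity allocation $(1,2,3)$ yields $2\,(2\cdot 5+3\cdot 2+1\cdot 3)=38$, which is the better of the two since the objective is minimized.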

\noindent\textbf{LOP.} 
Given a square integer matrix $\mathbb{B} = [b_{ij}]_{n\times n}$, the objective in the LOP is to find a simultaneous permutation of the rows and columns of $\mathbb{B}$ that maximizes the sum of the entries above the main diagonal~\cite{lop}. Solutions are represented as permutations of length $n$, where the element $\sigma(i)$ in the $i$-th position of the solution indicates that the $\sigma(i)$-th row and column of $\mathbb{B}$ are moved to the $i$-th row and column.
Accordingly, the objective function of the LOP is defined as:
\begin{equation}\label{eq:lop-objective}
    f(\sigma) = \sum_{i=1}^{n-1} \sum_{j=i+1}^n b_{\sigma(i),\sigma(j)}
\end{equation}
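
As a brief worked example on a hypothetical instance with $n=3$, let
\[
\mathbb{B}=\begin{pmatrix} 0 & 4 & 1\\ 2 & 0 & 3\\ 5 & 1 & 0 \end{pmatrix}
\quad\text{and}\quad \sigma=(3,1,2).
\]
Reordering the rows and columns of $\mathbb{B}$ as $3,1,2$ gives
\[
\begin{pmatrix} 0 & 5 & 1\\ 1 & 0 & 4\\ 3 & 2 & 0 \end{pmatrix},
\]
whose entries above the main diagonal sum to $f(\sigma)=b_{3,1}+b_{3,2}+b_{1,2}=5+1+4=10$, in agreement with Eq.~\ref{eq:lop-objective}. The identity permutation yields $b_{1,2}+b_{1,3}+b_{2,3}=4+1+3=8$, so $\sigma$ is the better of the two under maximization.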
