%\documentclass{uai2022} % for initial submission
\documentclass[accepted]{uai2022} % after acceptance, for a revised
                                    % version; also before submission to
                                    % see how the non-anonymous paper
                                    % would look like
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2022} % ptmx math instead of Computer
                                         % Modern (has noticable issues)
% \documentclass[mathfont=newtx]{uai2022} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams
\usetikzlibrary{arrows.meta}

\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{amssymb}

\usepackage{xfrac}

\usepackage[linesnumbered]{algorithm2e} % noend
\setitemize{noitemsep,topsep=0pt,parsep=0pt,partopsep=0pt}

%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example

\newtheorem{theorem}{Theorem}[section]
\newtheorem{definition}[theorem]{Definition}
\newtheorem{problem}[theorem]{Problem}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{fact}[theorem]{Fact}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{conjecture}{Conjecture}

\theoremstyle{definition}
\newtheorem{example}{Example}

\newcommand{\Pa}{\textit{Pa}} 
\newcommand{\Ch}{\textit{Ch}} 
\newcommand{\Ne}{\textit{Ne}} 
\newcommand{\Si}{\textit{Si}} 
\newcommand{\An}{\textit{An}} 
\newcommand{\De}{\textit{De}}
\newcommand{\Dis}{\textit{Dis}}
\newcommand{\Discr}{\textit{Discr}} 
\newcommand{\bidirected}{\leftrightarrow}

% Problems
\newenvironment{parameterizedproblem}%
{%
  \leavevmode\nobreak\par
  \begin{list}%
    {}%
    {%
      \def\labelstyle{\itshape}
      \setlength{\topsep}{0pt}%
      \renewcommand\makelabel[1]{%
        \mbox{\normalfont ##1}\hfil
      }%
      \settowidth{\labelwidth}{\labelstyle Parameter:}%
      \setlength{\leftmargin}{\labelwidth}%
      \addtolength{\leftmargin}{\labelsep}%
      \setlength{\itemsep}{0pt}%
      \setlength{\parsep}{0pt}%
    }%
      \def\instance{\item[\labelstyle Instance:]}%
      \def\parameter{\item[\labelstyle Parameter:]}%
      \def\question{\item[\labelstyle Question:]}%
      \def\result{\item[\labelstyle Result:]}%
    }%
    {%
  \end{list}%
}

\newcommand{\Lang}[1]{\text{\normalfont\textsc{#1}}}

% Advantage plots
\def\axis#1{\draw[semithick, |-|] (0,0) -- (#1,0);}
\def\instance#1#2#3{ \draw[semithick,color=gray, fill=lightgray] (#1-0.5,0) rectangle (#1+0.5,#2-#3);}
\def\leftcaption{%
  \node[rotate=90] at (-.75,0)    {\small\it Advantage of};
  \node[rotate=90] at (-.25,.75)  {\small\it \textsc{c-src}};
  \node[rotate=90] at (-.25,-.75) {\small\it \textsc{he}};
}
\def\topcaption#1#2{%
  \node[baseline, anchor=west] at (0,#1) {\small\it Random graphs with $n = $};
  \foreach \x/\n in {#2}{
    \node at (\x,#1-.5) {\small$\n$};
  }
}
\def\advantage#1#2#3{\node at (#1,-.25) {\tiny\pgfmathparse{#2-#3}\pgfmathresult s};}
\newcommand{\subexperiment}[2]{{\small\color{gray}\rule{0.2\linewidth}{1pt}\raisebox{-.5ex}{\parbox{.425\linewidth}{\centering\color{black} Experiments for \emph{#1} graphs generated with $k=#2$}}\rule{0.375\linewidth}{1pt}}\\[2ex]}

% various MAG edge types
\newcommand\stararrow[1][1.4em]{\tikz{\draw[shorten >= 1pt, shorten <= 1pt,{Rays[round,n=6]}-{Stealth[round,sep]}] (0,0) -- (#1,0);}}
\newcommand\arrowstar[1][1.4em]{\tikz{\draw[shorten >= 1pt, shorten <= 1pt,{Stealth[round,sep]}-{Rays[round,n=6]}] (0,0) -- (#1,0);}}
\renewcommand\rightarrow[1][1.4em]{\tikz[baseline=-0.5ex, shorten <=2pt, shorten >=2pt] \draw[-{Stealth[round,sep]}] (0,0) -- (#1,0);}
\renewcommand\leftarrow[1][1.4em]{\tikz[baseline=-0.5ex, shorten  <=2pt, shorten >=2pt] \draw[{Stealth[round,sep]}-] (0,0) -- (#1,0);}
\renewcommand\leftrightarrow[1][1.4em]{\tikz[baseline=-0.5ex, shorten  <=1pt, shorten >=1pt] \draw[{Stealth[round,sep]}-{Stealth[round,sep]}] (0,0) -- (#1,0);}


\makeatletter
\newcommand*{\indep}{%
  \mathbin{%
    \mathpalette{\@indep}{}%
  }%
}
\newcommand*{\nindep}{%
  \mathbin{%                   % The final symbol is a binary math operator
    %\mathpalette{\@indep}{\not}% \mathpalette helps for the adaptation
    \mathpalette{\@indep}{/}%
                               % of the symbol to the different math styles.
  }%
}
\newcommand*{\@indep}[2]{%
  % #1: math style
  % #2: empty or \not
  \sbox0{$#1\perp\m@th$}%        box 0 contains \perp symbol
  \sbox2{$#1=$}%                 box 2 for the height of =
  \sbox4{$#1\vcenter{}$}%        box 4 for the height of the math axis
  \rlap{\copy0}%                 first \perp
  \dimen@=\dimexpr\ht2-\ht4-.2pt\relax
      % The equals symbol is centered around the math axis.
      % The following equations are used to calculate the
      % right shift of the second \perp:
      % [1] ht(equals) - ht(math_axis) = line_width + 0.5 gap
      % [2] right_shift(second_perp) = line_width + gap
      % The line width is approximated by the default line width of 0.4pt
  \kern\dimen@
  \ifx\\#2\\%
  \else
    \hbox to \wd2{\hss$#1#2\m@th$\hss}%
    \kern-\wd2 %
  \fi
  \kern\dimen@
  \copy0 %                       second \perp
}
\makeatother

\title{A New Constructive Criterion for Markov Equivalence of MAGs}

% The standard author block has changed for UAI 2021 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
% TODO
\author[1]{\href{mailto:<wienoebst@tcs.uni-luebeck.de>?Subject=Your UAI 2022 paper}{Marcel Wienöbst}{}}
\author[1]{Max Bannach}
\author[1]{Maciej Li\'{s}kiewicz}
% Add affiliations after the authors
\affil[1]{%
  Institute for Theoretical Computer Science \\
  University of L\"{u}beck \\
  Germany
}
  
\begin{document}
\maketitle

\begin{abstract}
  Ancestral graphs are an important tool for encoding causal knowledge
  as they represent uncertainty about the presence of latent
  confounding and selection bias, and they can be inferred from
  data. As for other graphical models, several maximal
  ancestral graphs (MAGs) may encode the same statistical information
  in the form of conditional independencies.  Such MAGs are said to be
  \emph{Markov equivalent}. This work concerns graphical
  characterizations and computational aspects of Markov equivalence
  between MAGs. These issues have been studied in past years
  leading to several criteria and methods to test Markov
  equivalence. The state-of-the-art algorithm, provided by Hu and
  Evans [UAI 2020], runs in time $O(n^5)$ for instances with $n$
  vertices. We propose a new constructive graphical criterion for the
  Markov equivalence of MAGs, which allows us to develop a practically
  effective equivalence test with worst-case runtime
  $O(n^3)$. Additionally, our criterion is expressed in terms of
  natural graphical concepts, which is of independent value.
\end{abstract}



\section{Introduction}\label{sec:intro}
Graphical causal models represent random variables as vertices of a
graph and express causal effects of one variable on another with
edges. The graphical approach provides an intuitive formalism for
exploring complex causal
phenomena~\citep{spirtes2000causation,Pearl2009,koller2009probabilistic}.
Another strength of this approach is the ability to tackle causal
problems using algorithmic tools, paving the way towards automated
causal inference and data science.

A popular and commonly used model to encode causal knowledge,
which can be inferred from data, is a \emph{directed acyclic graph} (DAG).
A DAG can be learned from conditional independence (CI) 
statements, if one assumes faithfulness, that is, if the CIs among 
the variables are equal to those expressed through $d$-separations
in the DAG \cite{spirtes2000causation}.
However, multiple DAGs can imply the same CI statements. For
instance, if for the variables $X_a, X_b, X_c$ the only 
CI relationship is $X_a \indep X_c \mid X_b$, then there are
three DAGs $a\rightarrow b \rightarrow c$, $a\leftarrow b\rightarrow
c$, and $a\leftarrow b\leftarrow c$, which 
encode the CI. We say that such DAGs are \emph{Markov equivalent}
and that they belong to the same \emph{Markov equivalence class} (MEC). 
Markov equivalent DAGs encode the same conditional independencies via $d$-separations
and are, thus, indistinguishable 
on the basis of observational CIs alone.

Key results for these concepts are the
graphical criterion for two DAGs to be Markov equivalent 
\citep{verma1990equivalence,frydenberg1990chain}
and the graph-theoretic characterization of MECs as so-called 
CPDAGs \citep{andersson1997markov}. Subsequent work in this field
resulted in further achievements, e.\,g.,~regarding causal structure
identification from data
\citep{meek1997graphical,spirtes2000causation,chickering2002learning,chickering2002optimal}
or causal inference and analysis based on Markov equivalence
classes~\citep{maathuis2009estimating, van2016separators,wienobst2021polynomial}.
However, things are more complicated when latent and selection variables
are present, as is often the case in practice.
Useful in this setting are (maximal) ancestral graphs
(AGs, MAGs) introduced by \cite{richardson2002ancestral},
which can represent uncertainty about the presence of latent
confounding and selection bias, and which can be inferred from data.
%
\begin{figure}
  \begin{center}
  \begin{tikzpicture}[xscale=1.2, >={Stealth[round,sep]}]
    \node (G) at (-0.7,-0.5) {$G_1$};
    \node (a) at (0,0) {$a$};
    \node (b) at (1,0) {$b$};
    \node (c) at (0,-1) {$c$};
    \node (d) at (1,-1) {$d$};
    \node (u) at (0.3,-0.5) {$u$};

    \draw[->] (a) to (b);
    \draw[->] (c) to (d);
    \draw[->] (u) to (b);
    \draw[->] (u) to (d);
 
    \node (GG) at (-0.7+3.5,-0.5) {$G_2$};
    \node (aa) at (0+3.5,0) {$a$};
    \node (bb) at (1+3.5,0) {$b$};
    \node (cc) at (0+3.5,-1) {$c$};
    \node (dd) at (1+3.5,-1) {$d$};

    \draw[->] (aa) to (bb);
    \draw[->] (cc) to (dd);
    \draw[<->] (bb) to (dd);

  \end{tikzpicture}
  \end{center}
  \caption{DAG $G_1$ encodes
  the CIs  $X_a\indep \{ X_c, X_d\}$ and $X_c\indep \{ X_a, X_b\}$
  among observed variables ($u$ is a latent variable).  $G_2$ is a MAG which encodes the same CIs.
  The example is from~\citep[Fig.~10]{richardson2002ancestral}.}
    \label{fig:latent:variable}
\end{figure}
%
A variable is latent if it is not measured or recorded. 
For example, the DAG $G_1$ in Fig.~\ref{fig:latent:variable}  
%(the example is taken from~\citep{richardson2002ancestral})
shows a causal structure over four observed variables represented as 
vertices $a, b, c, d$ and a latent variable represented as $u$. 
$G_1$ implies the independence relations $X_a\indep \{ X_c, X_d\}$ and
$X_c\indep \{ X_a, X_b\}$ over the observed variables, i.\,e., after
marginalizing variable $X_u$ out.
However, there is no DAG representing
precisely these CIs, which shows that DAGs are not closed under
marginalization. One can represent the CIs using MAGs as shown by $G_2$  
in Fig.~\ref{fig:latent:variable}.
Additionally, DAGs are not expressive enough for selection variables,
which are unmeasured variables
determining whether a measured unit is included in the data.
%(variables which are conditioned on). 
Hence, DAGs are
not closed under conditioning. In contrast, the class of independence models associated
with AGs, i.\,e., the smallest class that contains the DAG independence
models, is closed under marginalizing and conditioning
(see \citep{richardson2002ancestral} for details). 

Despite many advances, a number of fundamental problems concerning
the properties and algorithmic aspects of this important
model class remain to be  explored. We investigate the
Markov equivalence of MAGs -- one of the basic problems in this field.
As for DAGs,  MAGs that encode the same
conditional independencies are said to be Markov equivalent.
In graphical language, we express CIs via \emph{$m$-separations},  %in AGs,
an extended form of $d$-separation in DAGs (formal definitions are
provided in Section~\ref{sec:prel}). % For Markov equivalent MAGs,
% see Example 2 and 3 in Fig.~\ref{fig:examples}. % I think these
% examples are hard to understand at this point

An effective polynomial-time algorithm to test whether two MAGs
are Markov equivalent has been the subject of intense research.
A na\"{i}ve implementation of the definition requires testing $m$-separation
relations over all pairs of vertices and all subsets of vertices, which takes
exponential time. % in  size $n$ of $V$.
The first graphical criterion was
given by~\citet{spirtes1996polynomial}: The \emph{Spirtes and Richardson Criterion (SRC)}
extends the conditions by \cite{verma1990equivalence} and \cite{frydenberg1990chain}
for DAGs and is based on the useful concept of \emph{discriminating paths}.
The SRC is intuitive and forms the basis of subsequent work.
However, testing the SRC naively requires exponential time since there
can be exponentially many discriminating paths, all of which have to be inspected.
\citet{zhao2005markov} proposed another characterization using the concept 
of minimal collider paths, which also did not lead to a polynomial-time algorithm.
%
The first criterion that can be checked in
polynomial time has been proposed by
\cite{ali2009markov}. The complexity of their method
is bounded by $O(n \cdot m^4)$ for MAGs with $n$ vertices and $m$ edges.
Recently, a criterion based on parametrizing sets was proposed
by~\citet{hu2020faster}. These sets can be generated in time
$O(n \cdot m^2)$ (for dense graphs with $m \in \Omega(n^2)$ this equates to $O(n^5)$)
leading to a faster algorithm. %However, the criterion is quite involved and verifying its satisfiability with pencil and paper, even for small instances, is not simple.

The main contribution of this paper is a new criterion for the Markov equivalence of MAGs.
It is a simple and constructive variant of the SRC
and allows us to develop an algorithm for equivalence testing in cubic time. This 
breaks the previous $O(n^5)$ worst-case time barrier.
%
Our criterion, coined \emph{constructive-SRC}, is based on
discriminating paths, but it avoids searching through exponentially many paths and
boils down to a simple
graphical condition. The constructive-SRC is intuitive and
checking it by hand is convenient. 
%
% Moreover, it is easy to implement and leads to a
% effective algorithm with worst-case time  $O(n^3)$.
For sparse graphs with maximum degree~$\Delta$, which are common 
in causal modeling, the running time is bounded by $O(n\cdot \Delta^2)$.
We compare our algorithm experimentally
with the algorithm by~\cite{hu2020faster} and show that the
theoretical improvements lead to better practical performance. 

Obtaining the cubic runtime raises the question of whether further
improvements are possible, e.\,g., whether a runtime of $O(n^2)$ can be attained. We
discuss this issue by relating it to the
Markov equivalence of DAGs, where such a runtime is achievable using the CPDAG
representation of Markov equivalence classes. We uncover obstacles in
translating this approach
towards the MAG setting, while also highlighting related open research
questions in this area.

% The paper is organized as follows. In Sec.~\ref{sec:prel}, we 
% give the graphical concepts considered in this
% work and Sec.~\ref{section:history} provides details on previous work. 
% In Sec.~\ref{sec:criterion} we give the constructive-SRC
% and Sec.~\ref{sec:algorithm} describes the algorithm based on the criterion.
% In Sec.~\ref{sec:different:approachs} we discuss an alternative approach to testing 
% Markov equivalence and in~\ref{section:related:problems} some related problems.
% Finally, Sec.~\ref{section:experiments} describes our experimental tests.



\section{Preliminaries}\label{sec:prel}
%\paragraph
%{\bf General Backgrounds.}
A mixed graph $G = (V, E)$ consists of a set of vertices 
and a set of
edges between pairs of vertices. We consider three different edge
types: directed edges $a\rightarrow b$ or $a \leftarrow b$,
bidirected edges $a \leftrightarrow b$, and undirected edges $a - b$.
Vertices linked by an edge of any type are called \emph{adjacent} or
\emph{neighbors}. The \emph{degree} of a vertex is the number of its
neighbors, and the maximum degree of a graph is the
maximum degree of any of its vertices. We call vertices connected by a bidirected edge
\emph{siblings}, and say that $u$ is a \emph{parent} of $v$ if
$u\rightarrow v$ (then $v$ is a \emph{child} of $u$).  A path $\pi$
between two vertices $v_1$ and $v_p$ in $G$ is a sequence of distinct
vertices $\pi=\langle v_1,\ldots,v_p\rangle$ with $p\ge 2$ such that each
vertex $v_i$ is adjacent to $v_{i+1}$ for $i=1,\ldots,p-1$.  A path
of the form $v_1\rightarrow v_2\rightarrow \ldots\rightarrow v_p$ is
directed or causal.  If there is a directed path from $u$ to $v$,
then $u$ is called an \emph{ancestor} of $v$ and $v$ a
\emph{descendant} of $u$.  For a vertex $v$, the set of all of its
ancestors is written as $\An_G(v)$.  The descendant set $\De_G(v)$ is
analogously defined. $\Dis_G(v)$ is the set of vertices in the same
district as $v$, i.\,e., the ones connected to $v$ via a path of bidirected edges.
Also, we denote by $\Pa_G(v)$, $\Ch_G(v)$,
$\Ne_G(v)$, $\Si_G(v)$ the set of parents, children, neighbors,
siblings of $v$ in $G$, respectively.\footnote{We note that $v \in
  \An_G(v)$, $v \in \De_G(v)$ and $v \in \Dis_G(v)$. This does, however, not
  hold for $\Pa_G$, $\Ch_G$, $\Ne_G$ and $\Si_G$.} If $G$ is clear from the
context, we omit it as subscript. These notations generalize to
sets of vertices in the natural way. We denote the subgraph induced by vertex
set $S$ as $G[S] = (S, E \cap (S \times S))$. A graph is
\emph{acyclic} if there is no
directed path from a vertex $u$ to $v$ with $v\rightarrow u$. An
acyclic graph with only directed edges is called a DAG.  The
\emph{skeleton} of  $G$ %graph
 is the graph obtained by replacing every
edge with an undirected one.
%
A \emph{$v$-structure}, also called an \emph{unshielded collider}, is
an ordered triple of vertices $(u,c,v)$ that induces the subgraph
$u\stararrow c \arrowstar v$. The $\ast$ indicates that any edge mark is possible. A vertex $c$ on a path $\pi$ is called
a \emph{collider} if two arrowheads of $\pi$ meet at $c$, e.\,g.~if
$\pi$ contains $ u\bidirected c \leftarrow v $. Two vertices are
\emph{collider connected} if there is a path (a \emph{collider
  path}) between them on which all internal vertices are
colliders; hence, adjacent vertices are collider connected. Vertices
are \emph{$m$-connected} by a set $Z$ if there is a path
$\pi$ between them on which every collider is in
$\An(Z)$ and every node that is not a collider is not in $Z$. Such a
$\pi$ is called an \emph{$m$-connecting path given $Z$}. If vertices $u,v$ are
not $m$-connected by $Z$, written as $(u \indep v \mid Z)_G$,
we say that~$Z$ \emph{$m$-separates} them. Two sets $X, Y$
are $m$-separated by~$Z$ if all their nodes are pairwise $m$-separated
by $Z$. In DAGs, $m$-separation is equivalent to
$d$-separation~\citep{Pearl2009}.
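
For illustration, consider the MAG $G_2$ in
Fig.~\ref{fig:latent:variable}, whose edges are $a \rightarrow b$,
$b \bidirected d$, and $c \rightarrow d$. The only path between $a$
and $d$ is $a \rightarrow b \bidirected d$, on which $b$ is a
collider. For $Z = \emptyset$ we have $b \notin \An(Z)$, so this path
is not $m$-connecting and $(a \indep d \mid \emptyset)_{G_2}$ holds.
For $Z = \{b\}$, the collider $b$ lies in $\An(Z)$ and the path
contains no non-collider in $Z$; hence, $a$ and $d$ are $m$-connected
by $\{b\}$. Similarly, both $b$ and $d$ are colliders on the only path
$a \rightarrow b \bidirected d \leftarrow c$ between $a$ and $c$, so
that $a$ and $c$ are $m$-separated by $\emptyset$, but $m$-connected
by $\{b,d\}$.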

{\bf Ancestral Graphs.} % and Markov Equivalence.}
A graph $G=(V,E)$ is called \emph{ancestral} (AG) if \textit{(i)} it is acyclic,
\textit{(ii)} for every bidirected edge $a \leftrightarrow b$ vertex $a$ is not an ancestor of $b$ (and vice versa), and \textit{(iii)}~for every undirected edge
$a - b$ neither $a$ nor $b$ has parents or siblings. %~\citep{richardson2002ancestral}. 
Consequently, ancestral graphs contain at most one edge type between two vertices. 
An AG is a \emph{maximal ancestral graph} (MAG) if, for every pair of
nonadjacent vertices $a$ and $b$, there exists a set $Z$ such that
$a$ and $b$ are
$m$-separated by $Z$. Every AG can be turned into a MAG by adding bidirected 
edges between vertices that cannot be $m$-separated. %~\citep{richardson2002ancestral}. 
Syntactically, all DAGs are MAGs and all AGs that contain only directed edges are DAGs.
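For instance, the graph $G_2$ in Fig.~\ref{fig:latent:variable} is a
MAG: it is acyclic, neither endpoint of its only bidirected edge
$b \bidirected d$ is an ancestor of the other, and each of the
nonadjacent pairs $\{a,c\}$, $\{a,d\}$, and $\{b,c\}$ is
$m$-separated by $Z = \emptyset$, because every path between the two
vertices contains a collider. In contrast, the graph with the edges
$a \rightarrow b$, $b \rightarrow c$, and $a \bidirected c$ is not
ancestral, as $a$ is an ancestor of $c$, which violates
condition~\textit{(ii)}.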
%
%
%%\paragraph

{\bf Markov Equivalence.}
Two AGs %ancestral graphs 
$G_1$ and $G_2$ with the same vertex set $V$ are said
to be \emph{Markov equivalent} if we have for all pairwise disjoint sets
$A,B,Z\subseteq V$ with $A\neq\emptyset$ and $B\neq\emptyset$ that
$A$ and $B$ are $m$-separated given $Z$ in $G_1$ if, and only if, 
$A$ and $B$ are $m$-separated given $Z$ in $G_2$. %~\citep{richardson2002ancestral}.  
The following definition is central
for the study of Markov equivalence of MAGs:
\begin{definition}[\citep{richardson2002ancestral}]
  A path $\pi=\langle x,q_1,\ldots,q_p,b,y\rangle$, $p\ge 1$, % between $x$ and $y$
   is called \emph{discriminating} for vertex $b$ in a MAG $G$  if
  \begin{itemize}
 % \item[(i)] $\pi$ includes at least three edges,
  %\item[(ii)] $b$ is adjacent to $y$ on $\pi$,
  \item[(i)] $x$ is not adjacent to $y$ and
  \item[(ii)] every $q_i$, $1\le i\le p$, is
    a collider on $\pi$ and a parent of $y$. 
  \end{itemize}
\end{definition}

A discriminating path is illustrated in
Fig.~\ref{fig:discriminating:path}.
For vertices $b$ and $y$ in $G$ denote by $\Discr_G(b,y)$ the set of all
discriminating paths $\pi=\langle x,q_1,\ldots,q_p,b,y\rangle$ for $b$.
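As a minimal illustration with $p = 1$, consider the graph on the
vertices $x, q, b, y$ with the edges
\[
  x \rightarrow q, \quad q \bidirected b, \quad b \bidirected y,
  \quad q \rightarrow y.
\]
This graph is a MAG, as the nonadjacent vertices $x$ and $b$ are
$m$-separated by $\emptyset$ and the nonadjacent vertices $x$ and $y$
are $m$-separated by $\{q\}$. The path
$\pi = \langle x, q, b, y\rangle$ is discriminating for $b$: the
vertices $x$ and $y$ are not adjacent, and $q$ is a collider on $\pi$
(the arrowheads of $x \rightarrow q$ and $q \bidirected b$ meet at
$q$) as well as a parent of $y$. Since $q$ is the only neighbor of
$b$ apart from $y$, and $x$ is not a parent of $y$, there is no other
discriminating path for $b$ ending in $y$, i.\,e.,
$\Discr(b,y) = \{\pi\}$.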
%
\begin{figure}
  \centering
  \begin{tikzpicture}[xscale=1, >={Stealth[round,sep]}]
    \node (x) at (0,0) {$x$};
    \node (i1) at (1,0) {};
    \node (i2) at (2.2,0) {$\cdots$};
    \node (b) at (3.4,0) {$b$};
    \node (y) at (4.4,0) {$y$};

    \draw[{Rays [n=7]}->] (x) to (i1);
    \draw[<->] (i1) to (i2);
    \draw[<-{Rays [n=7]}] (i2) to (b);
    \draw[{Rays [n=7]}->] (b) to (y);
    \draw[->] (i1) to [bend left] (y);
    \draw[->] (i2) to [bend left] (y);
  \end{tikzpicture}
  \caption{A discriminating path from $x$ to $y$ for $b$.  For the last
    three vertices,
    {\protect\tikz[baseline={(0,-0.05)},>={Stealth[round,sep]}]{
      \protect \node (w) at (0,0) {};
      \protect \node (b) at (1,0) {$b$};
      \protect \node (y) at (2,0) {$y$};
      \protect \draw[<->] (w) to (b);
      \protect \draw[<->] (b) to (y);
      \protect \draw[->] (w) to [bend left=25] (y);
    }},
  {\protect\tikz[baseline={(0,-0.05)},>={Stealth[round,sep]}]{
      \protect \node (w) at (0,0) {};
      \protect \node (b) at (1,0) {$b$};
      \protect \node (y) at (2,0) {$y$};
      \protect \draw[<->] (w) to (b);
      \protect \draw[->] (b) to (y);
      \protect \draw[->] (w) to [bend left=25] (y);
    }}, and
    {\protect\tikz[baseline={(0,-0.05)},>={Stealth[round,sep]}]{
      \protect \node (w) at (0,0) {};
      \protect \node (b) at (1,0) {$b$};
      \protect \node (y) at (2,0) {$y$};
      \protect \draw[<-] (w) to (b);
      \protect \draw[->] (b) to (y);
      \protect \draw[->] (w) to [bend left=25] (y);
    }}
  are possible configurations (see Fact~\ref{fact:byedge}). In the first one $b$ is a
  collider, in the other two a non-collider.}
  \label{fig:discriminating:path}
\end{figure}
%
Our focus lies on the computational complexity of
the following problem:\footnote{We first deal with the problem for MAGs
  \emph{without undirected edges}. We later discuss in
  Section~\ref{section:related:problems} how these can be included with
  minor modifications (our main theorem holds as is).}
\begin{problem}[\Lang{mag-equivalence}]
  \begin{parameterizedproblem}
    \instance Two MAGs $G_1$ and $G_2$.
    \question Are $G_1$ and $G_2$ Markov equivalent?
  \end{parameterizedproblem}
\end{problem}

% From a computational perspective, we are
% interested in efficient algorithm for determining the Markov
% equivalence of MAGs. But we also attach importance to an
% characterization that allows to check the Markov
% equivalence easily \emph{by hand} for smaller examples. To that end,
% the goal of this paper is a intuitive graphical criterion that can be checked efficiently by a computer and
% easy to  examine for a human. %naturally by an expert. 


% Ich denke wir brauchen das gar nicht, wir können diese Infos in
% Sec. 7 geben [MW]
% \paragraph{Markov Equivalence Class (MEC).}
% A MEC of MAGs (DAGs) consists of MAGs (DAGs) over 
% the same set of vertices which are Markov equivalent. Thus,
% every graph in a  MEC encodes the same set of CIs relations 
% among the variables.A MEC of DAGs can be represented by a CPDAG 
% (completed partially directed acyclic graph), which is the union graph
% of the DAGs in the equivalence class it represents
% \citep{andersson1997markov}.
% The class of Markov equivalent MAGs  can be uniquely described by
% a partial ancestral graph (PAG) \citep{zhang2008causal}.

\section{History}\label{section:history}
A graphical criterion for Markov
equivalence of DAGs was provided by 
\citet{verma1990equivalence} and \citet{frydenberg1990chain}:
\begin{theorem}[\citep{verma1990equivalence,frydenberg1990chain}]\label{theorem:classical:criterion:DAGs}
  Two DAGs $G_1$ and $G_2$ are Markov equivalent if, and only if,
  \begin{enumerate}[label=(\roman*)]
  \item $G_1$ and $G_2$ have the same adjacencies and 
  \item $G_1$ and $G_2$ have the same unshielded colliders.
  \end{enumerate}
\end{theorem}
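
For example, the DAGs $a \rightarrow b \rightarrow c$,
$a \leftarrow b \rightarrow c$, and $a \leftarrow b \leftarrow c$
from Section~\ref{sec:intro} share the adjacencies $\{a,b\}$ and
$\{b,c\}$ and contain no unshielded collider, so they are pairwise
Markov equivalent. The DAG $a \rightarrow b \leftarrow c$ has the
same adjacencies, but contains the unshielded collider $(a,b,c)$. It
is therefore not Markov equivalent to the other three; indeed, in
this DAG $a$ and $c$ are $d$-separated by $\emptyset$, but not by
$\{b\}$.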

The first graphical criterion for two MAGs to be Markov equivalent was
given by~\citet{spirtes1996polynomial}:
\begin{theorem}[\citeauthor{spirtes1996polynomial}  Criterion (SRC)]\label{theorem:classical:criterion}
  Two 
  MAGs $G_1$ and $G_2$ are Markov equivalent if, and only~if, 
  \begin{enumerate}[label=(\roman*)]
  \item $G_1$ and $G_2$ have the same adjacencies,
  \item $G_1$ and $G_2$ have the same unshielded colliders, and
  \item if $\pi$ forms a discriminating path for $b$ in $G_1$ and $G_2$, 
  then $b$ is a collider on the path $\pi$ in $G_1$ if, and only if, it is a collider on the path $\pi$ in $G_2$.     
  \end{enumerate}
\end{theorem}
% Using our notation we can rewrite the third condition as
% \begin{enumerate}[label=\it(\roman*)]
%     \setcounter{enumi}{2}
%     \item  for all $b,y$ and all $\pi$ in
%     $\Discr_{G_1}(b,y) \cap  \Discr_{G_2}(b,y)$ we have that
%     $b$ is a collider on $\pi$ in $G_1$  if, and only if, $b$ is a collider on $\pi$ in $G_2$. 
% \end{enumerate}
%
Note that it is indeed possible that $G_1$ contains a discriminating
path for $b$ and $y$, which is not present in $G_2$, even in the case of Markov
equivalence (see examples~2 and~3 in Fig.~\ref{fig:examples}). Therefore, testing property
\emph{(iii)} naively requires exponential time as one has
to consider all discriminating paths for vertex $b$ (which may be exponentially
many).\footnote{\citet{spirtes1996polynomial}
  claimed that the criterion is testable in time $n^{O(1)}$, which
  was later withdrawn~\citep{ali2009markov}.}
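
The role of condition~\emph{(iii)} can be seen on a small instance.
Let $G_1$ be the MAG with the edges $x \rightarrow q$,
$q \bidirected b$, $q \rightarrow y$, and $b \bidirected y$, and let
$G_2$ be obtained from $G_1$ by replacing $b \bidirected y$ with
$b \rightarrow y$. Both graphs are MAGs with the same adjacencies,
and in both the only unshielded collider is formed by the triple
$(x,q,b)$, so conditions~\emph{(i)} and~\emph{(ii)} are satisfied. The
path $\langle x,q,b,y\rangle$ is discriminating for $b$ in $G_1$ and
in $G_2$, but $b$ is a collider on it in $G_1$ and a non-collider in
$G_2$. Hence, condition~\emph{(iii)} is violated and the two graphs
are not Markov equivalent: $(x \indep y \mid \{q\})_{G_1}$ holds,
whereas in $G_2$ the path $x \rightarrow q \bidirected b \rightarrow
y$ is $m$-connecting given $\{q\}$.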

In the quest for a \emph{polynomial-time}-checkable criterion for the
Markov equivalence of MAGs, \citet{zhao2005markov} proposed the
following characterization:

\begin{theorem}[\citep{zhao2005markov}] \label{theorem:Zhao:criterion}
  Two MAGs $G_1$ and $G_2$ are Markov equivalent if, and only if,
  $G_1$ and $G_2$ have the same minimal collider 
  paths.\footnote{
  A collider path $\pi=\langle v_1, \ldots  , v_p\rangle$ 
  is minimal if there is no order-preserving subsequence 
  $\langle v_1=v_{i_1}, \ldots  , v_{i_t}=v_p\rangle$ with $t < p$ that forms a collider path. 
  % (single edges are trivially minimal collider paths).
   }
\end{theorem}

However, this characterization also does not lead to a polynomial-time
algorithm as there can be exponentially many minimal collider paths.
Subsequently, considerable effort has been made to develop an algorithm
that tests whether two MAGs are Markov equivalent and that runs in
polynomial time~\citep{ali2009markov,hu2020faster}. To
achieve this, the natural formulation in the style of
Theorem~\ref{theorem:classical:criterion} has been abandoned and more
involved criteria without an intuitive graphical interpretation were
introduced.

\citet{ali2009markov} used \emph{triples with order} (if the triple
forms a collider, it is called a \emph{collider with order}). The idea behind
this approach is to consider only the discriminating paths that are
present in any Markov equivalent MAG. While this was an important
contribution towards characterizing Markov equivalence classes of MAGs~\citep{ali2005towards}, the recursive definition of such
triples lacks the graphical intuitiveness of, e.\,g., the SRC.
With significant technical effort, the following criterion was
developed:

\begin{theorem}[Theorem 3.7 in~\citep{ali2009markov}]
  Two MAGs $G_1$ and $G_2$ are Markov equivalent if, and only
  if, they have the same adjacencies and the same colliders
  with order.
\end{theorem}

This  criterion 
led to the sought polynomial-time algorithm.
%as  the authors showed that it can be checked in polynomial time. 
However, its running time is $O(n \cdot m^4)$ for MAGs
with $n$ vertices and $m$ edges. %, which is $O(n^9)$ for dense graphs.
% \footnote{In theory, for dense graphs with 
% $m \in O(n^2)$ this becomes $O(n^9)$. However, we conjecture that this
% bound is not tight and the runtime is actually a bit better.}

Another criterion was proposed by~\citet{hu2020faster}
based on so-called \emph{parametrizing sets}. As we compare our algorithm
with this approach, we give a brief overview. 
For a vertex
set $W \subseteq V$, the \emph{barren subset} of $W$ 
is defined as $\mathrm{barren}(W) = \{w \in W \mid \De(w) \cap W = \{w\}\}$.
A set $H \subseteq V$ is called a \emph{head} if
$\mathrm{barren}(H) = H$ and $H$ is contained in a
single district in $G[\An(H)]$. Let $\mathcal{H}(G)$ be the set of heads of $G$
and define the \emph{tail} of a head $H$ as
  \[
    \mathrm{tail}(H) = (\Dis_{G[\An(H)]}(H) \setminus H) \cup \Pa_G(\Dis_{G[\An(H)]}(H)).
  \] The parametrizing set of MAG $G$ is
  defined as the set 
  $\mathcal{S}(G) = \{H \cup A \mid H \in \mathcal{H}(G) \text{
      and } %\emptyset \subseteq 
      A \subseteq \mathrm{tail}(H)\}$.
\citet{hu2020faster} showed that MAGs $G_1$ and $G_2$ are Markov
equivalent if, and only if, they have the same parametrizing
sets. However, generating these sets is costly as they may have
exponential size. Hence, they consider
$\tilde{\mathcal{S}}_3\subseteq \mathcal{S}$, which only includes sets $S$ of
cardinality 2 or~3 with one or two adjacencies among the vertices of $S$.
%Based on such collections of sets, \citet{hu2020faster} show that the following criterion holds:
\begin{theorem}[Corollary 3.2.1 in~\citep{hu2020faster}]
  Two MAGs $G_1$ and $G_2$ are Markov
  equivalent if, and only if, $\tilde{\mathcal{S}}_3(G_1) = \tilde{\mathcal{S}}_3(G_2)$.
\end{theorem}
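
To make these definitions concrete, consider again the MAG $G_2$ in
Fig.~\ref{fig:latent:variable} with the edges $a \rightarrow b$,
$b \bidirected d$, and $c \rightarrow d$. Evaluating the definitions
above by hand yields the following heads and tails:
\[
  \begin{array}{l|ccccc}
    H & \{a\} & \{b\} & \{c\} & \{d\} & \{b,d\} \\ \hline
    \mathrm{tail}(H) & \emptyset & \{a\} & \emptyset & \{c\} & \{a,c\}
  \end{array}
\]
For instance, $\{a,b\}$ is not a head because
$\mathrm{barren}(\{a,b\}) = \{b\}$, and $\{a,c\}$ is not a head
because $a$ and $c$ lie in different districts. Taking the union of
each head with every subset of its tail gives
\begin{align*}
  \mathcal{S}(G_2) = \bigl\{ & \{a\}, \{b\}, \{c\}, \{d\}, \{a,b\},
                               \{c,d\}, \{b,d\}, \\
                             & \{a,b,d\}, \{b,c,d\}, \{a,b,c,d\} \bigr\}.
\end{align*}
The sets of cardinality 2 or~3 among these are $\{a,b\}$, $\{c,d\}$,
$\{b,d\}$, $\{a,b,d\}$, and $\{b,c,d\}$. Each of them meets the
adjacency restriction, so $\tilde{\mathcal{S}}_3(G_2)$ consists of
exactly these five sets.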

The sets $\tilde{\mathcal{S}}_3(G)$ can be generated in time
$O(nm^2)$, which is significantly faster than the algorithm
by~\citet{ali2009markov}. However, the criterion in this form is quite
technical and does not lend itself easily to graphical characterizations of
Markov equivalent MAGs.

\section{A Simple Criterion for the Markov Equivalence of MAGs}\label{sec:criterion}
We propose a \emph{constructive} variant of the \citeauthor{spirtes1996polynomial} Criterion (SRC)
%simplification of the graphical criterion
for the Markov equivalence of MAGs. %given  (Theorem~\ref{theorem:classical:criterion}).
This allows us to develop an efficient
equivalence test, improving upon the previous $O(n^5)$ runtime
by~\cite{hu2020faster}.
Additionally, our criterion has a natural graphical
interpretation, which is of independent value. % for
% researchers in the field.
%
We begin with the following fact observed before in
Fig.~\ref{fig:discriminating:path}.

\begin{fact}\label{fact:byedge}
  Let $\pi=\langle x, \dots, q, b, y\rangle$ be a discriminating path
  in a MAG $G$. Then
  $b$ and $y$ are connected either via 
  $b \bidirected y$ or $b \rightarrow y$ and in the former case
  $b$ is a collider on $\pi$, in the latter a non-collider.
\end{fact}

\begin{proof}
  Recall that $q$ is a collider and a parent of $y$ and, thus, the edge
  $q\rightarrow y$ is present and the edge between $q$ and $b$ is
  either $q\leftarrow b$ or $q\leftrightarrow b$.
%
  To prove the claim, we first show that $b \leftarrow y$ cannot occur
  and distinguish the two ways the edge between $q$ and $b$ is
  oriented. If $q \leftarrow b$, then we have a directed cycle
  $q\rightarrow y\rightarrow b\rightarrow q$; if $q \bidirected b$ we
  would have $q$ as an ancestor of~$b$, which violates the
  ancestrality property. 

  For the second part, note that $b$ is always a non-collider if $b
  \rightarrow y$. In case of $b \bidirected y$,
  the edge $q \leftarrow b$ cannot occur as $b$ would be an ancestor
  of $y$, violating the ancestrality property. Hence, $b$ is a
  collider in this case.
\end{proof}

%We are ready to state our
%criterion for the Markov equivalence of MAGs, which we will refer to as constructive-SRC:

\begin{theorem}[Constructive-SRC]\label{theorem:simplified:criterion}
  Two MAGs $G_1$ and $G_2$ are Markov equivalent if, and only if, 
  \begin{enumerate}[label=(\Roman*)]
  \item $G_1$ and $G_2$ have the same adjacencies,
  \item $G_1$ and $G_2$ have the same unshielded colliders, and 
  \item for all edges $b\bidirected y\in G_1$ with $\Discr_{G_1}(b,y)
    \neq \emptyset$ we have $b \rightarrow y \not\in G_2$ and vice
    versa.
  \end{enumerate}
\end{theorem}

\begin{proof}
  We first show that %two directions: If 
  if $G_1$ and $G_2$ fulfill the % three
  conditions listed above, %in the constructive-SRC,
   then they
  are Markov equivalent %We show this 
  by arguing that in this case
  SRC is satisfied.
  %the three criteria in Theorem~\ref{theorem:classical:criterion} are satisfied.
%
  The first two conditions are identical in both criteria. Assume the third one
  holds for the constructive-SRC. Then there is
  no discriminating path for $(x,b,y)$ with
  $b \bidirected y$ in $G_1$ (for the argument we only consider $G_1$ w.l.o.g.) such
  that $b \rightarrow y$ in $G_2$. Hence, it cannot happen that we have
  a discriminating path $\pi$ in $G_1$ and $G_2$ such that $b$ is a
  collider in $G_1$ and a non-collider in $G_2$ (this is \textit{(iii)}
  in the SRC). This is because in
  that case $G_2$ would have $b \rightarrow y$ by Fact~\ref{fact:byedge}.

  For the second direction: Assume $G_1$ and $G_2$ violate one of the
  three conditions \textit{(I)}, \textit{(II)} or \textit{(III)}. We show that they are not Markov
  equivalent. By the SRC, this is
  obvious for \textit{(I)} and \textit{(II)}. Now
  consider that \textit{(III)} is violated but \textit{(I)} and \textit{(II)} are true. 
  Then, w.l.o.g., assume that for some  $b \bidirected y$ in $G_1$
  there is a discriminating path $\pi = \langle x, q_1, \dots, q_p, b, y\rangle$ in $G_1$
  %both graph with $b$ being w.l.o.g.\ a
  and the edge $b\rightarrow y\in G_2$.
%  collider in $G_1$ and a non-collider in $G_2$. It follows that we
%  have $b \bidirected y$ in $G_1$, whereas $G_2$ contains $b \rightarrow y$.
%  
  It follows by the maximality of~$G_1$ and the fact that $x$ and $y$ are
  non-adjacent that there is a set $Z$ such that $(x \indep
  y \mid Z)_{G_1}$. One can easily verify that
  $q_1, \dots, q_p \in Z$ and $b \not\in Z$. Due to the former
  observation it holds that $(x \nindep b \mid Z)_{G_1}$.
%
  On the other hand, %It is easy to
  one can see that $(x \indep y \mid Z)_{G_2}$ and $(x
  \nindep b \mid Z)_{G_2}$ cannot both hold in~$G_2$.
  This is due to the fact that $(x
  \nindep b \mid Z)_{G_2}$ immediately implies $(x \nindep y
  \mid Z)_{G_2}$ as $b$ is a non-collider not contained in $Z$. Hence, $G_1$
  and $G_2$ are not Markov equivalent.
\end{proof}

To illustrate the constructive-SRC, we give three
examples (see Fig.~\ref{fig:examples}) and discuss why or why not
Markov equivalence holds (as all considered pairs of graphs have the
same adjacencies and unshielded colliders, we focus on whether 
\textit{(iii)} of the SRC and  \textit{(III)} of the constructive-SRC are satisfied).


\begin{figure}
  \begin{center}
    \begin{tikzpicture}[xscale=1, >={Stealth[round,sep]}]
      \draw[rounded corners, thick, color=gray] (-0.5,-0.6) rectangle (7.5,.5);
      \node[draw=gray, rounded corners, thick, fill=white, inner sep=0pt, minimum height=0.4cm, baseline, minimum width=2cm] at (1.5,.5) {\small Example 1};
      \node (x) at (0,0) {$x$};
      \node (q) at (1,0) {$q$};
      \node (b) at (2,0) {$b$};
      \node (y) at (3,0) {$y$};
%      \node (l) at (0,0.6) {$G_1$:};
      
      \draw[<->] (x) to (q);
      \draw[<->] (q) to (b);
      \draw[<->] (b) to (y);
      \draw[->] (q) to[bend right] (y);
      
      \node (x) at (4,0) {$x$};
      \node (q) at (5,0) {$q$};
      \node (b) at (6,0) {$b$};
      \node (y) at (7,0) {$y$};
 %     \node (l) at (4,0.6) {$G_2$:};
      
      \draw[<->] (x) to (q);
      \draw[<->] (q) to (b);
      \draw[->] (b) to (y);
      \draw[->] (q) to[bend right] (y);
    \end{tikzpicture}
    \smallskip
    
    \begin{tikzpicture}[xscale=1, >={Stealth[round,sep]}]
      \draw[rounded corners, thick, color=gray] (-0.5,-0.6) rectangle (7.5,.75);
      \node[draw=gray, rounded corners, thick, fill=white, inner sep=0pt, minimum height=0.4cm, baseline, minimum width=2cm] at (1.5,.75) {\small Example 2};

      \node (x) at (0,0) {$x$};
      \node (q) at (1,0) {$q$};
      \node (b) at (2,0) {$b$};
      \node (y) at (3,0) {$y$};
%      \node (l) at (0,0.6) {$G'_1$:};
      
      \draw[->] (x) to (q);
      \draw[<->] (q) to (b);
      \draw[<->] (b) to (y);
      \draw[<->] (x) to[bend left] (b);
      \draw[->] (q) to[bend right] (y);
      
      \node (x) at (4,0) {$x$};
      \node (q) at (5,0) {$q$};
      \node (b) at (6,0) {$b$};
      \node (y) at (7,0) {$y$};
      %\node (l) at (4,0.6) {$G'_2$:};
      
      \draw[<-] (x) to (q);
      \draw[->] (q) to (b);
      \draw[<->] (b) to (y);
      \draw[<->] (x) to[bend left] (b);
      \draw[->] (q) to[bend right] (y);
    \end{tikzpicture}
    \smallskip
    
    \begin{tikzpicture}[xscale=1, >={Stealth[round,sep]}]
      \draw[rounded corners, thick, color=gray] (-0.5,-0.6) rectangle (7.5,.75);
      \node[draw=gray, rounded corners, thick, fill=white, inner sep=0pt, minimum height=0.4cm, baseline, minimum width=2cm] at (1.5,.75) {\small Example 3};

      \node (x) at (0,0) {$x$};
      \node (q) at (1,0) {$q$};
      \node (b) at (2,0) {$b$};
      \node (y) at (3,0) {$y$};
%      \node (l) at (0,0.6) {$G''_1$:};
      
      \draw[<-] (x) to (q);
      \draw[->] (q) to (b);
      \draw[<->] (b) to (y);
      \draw[<-] (x) to[bend left] (b);
      \draw[<->] (q) to[bend right] (y);
      
      \node (x) at (4,0) {$x$};
      \node (q) at (5,0) {$q$};
      \node (b) at (6,0) {$b$};
      \node (y) at (7,0) {$y$};
 %     \node (l) at (4,0.6) {$G''_2$:};
      
      \draw[<->] (x) to (q);
      \draw[<->] (q) to (b);
      \draw[->] (b) to (y);
      \draw[<-] (x) to[bend left] (b);
      \draw[->] (q) to[bend right] (y);
    \end{tikzpicture}
  \end{center}
  \caption{Three examples to illustrate
    the constructive-SRC.
    Example~2 is from~\citet{ali2009markov} and Example~3 is a
    modification of another example therein.}
  \label{fig:examples}
\end{figure}

\begin{example}[Fig.~\ref{fig:examples}]
  %In the first example of Fig.~\ref{fig:examples}, 
  The graphs are not Markov equivalent as the left one contains
  a discriminating path from $x$ to $y$ with $b \bidirected y$ and the right graph contains the edge $b
  \rightarrow y$, which violates condition \textit{(III)} of the  constructive-SRC. %from Theorem~\ref{theorem:simplified:criterion}. 
  % In terms of 
  % Theorem~\ref{theorem:classical:criterion}, 
  In the SRC, condition \textit{(iii)} is not satisfied as the
  discriminating path $\pi = \langle x, q, b, y\rangle$ exists in both graphs
  with $b$ being a collider in the left one and a non-collider in the right one.
\end{example}
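
The separating set used in the proof of Theorem~\ref{theorem:simplified:criterion} can be made explicit for Example~1 of Fig.~\ref{fig:examples} (a hand-checked illustration; the choice $Z = \{q\}$ is ours). In the left graph, the path $x \bidirected q \rightarrow y$ is blocked because the non-collider $q$ is in $Z$, and the path $x \bidirected q \bidirected b \bidirected y$ is blocked because the collider $b$ is not in $Z$ and has no descendant in $Z$. At the same time, $x \bidirected q \bidirected b$ is open because the collider $q$ is in $Z$. In the right graph, the path $x \bidirected q \bidirected b \rightarrow y$ is open, since $q \in Z$ is a collider and $b \notin Z$ is a non-collider. Writing $G_1$ for the left and $G_2$ for the right graph, we obtain
\[
  (x \indep y \mid \{q\})_{G_1}, \qquad
  (x \nindep b \mid \{q\})_{G_1}, \qquad
  (x \nindep y \mid \{q\})_{G_2}.
\]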

\begin{example}[Fig.~\ref{fig:examples}]
  The graphs % in the second example of Fig.~\ref{fig:examples} 
  are Markov equivalent. There is a
  discriminating path $\langle  x, q, b, y \rangle $ in the left graph and it includes the edge
  $b \bidirected y$, but the right graph also contains the edge $b \bidirected y$
  and, hence, \textit{(III)} does not apply. Accordingly, condition \textit{(iii)} of
  the SRC does not apply as 
  $\langle  x, q, b,y \rangle $ is not a discriminating path in the right graph (there, $q$ is not a collider). An
  advantage of the constructive-SRC is that
  one does not have to check for every discriminating path %~$\pi$ 
  whether it exists in both
  graphs. It is sufficient to check for the existence of such a
  path with collider $b$ in one graph, in combination
  with the edge $b \rightarrow y$ in the other graph.
\end{example}

\begin{example}[Fig.~\ref{fig:examples}]
  The graphs %in the third example of Fig.~\ref{fig:examples} 
  are Markov equivalent as well. There is no
  discriminating path in the left graph, but one in the right graph, namely 
  $\langle  x, q, b,y \rangle $. %This path 
  It contains %the edge 
  $b \rightarrow y$ and, hence, \textit{(III)} does not apply (here, the discriminating
  path needs to contain  %the edge 
  $b \bidirected y$ and the \emph{other}
  graph needs to contain $b \rightarrow y$). Also, \textit{(iii)}
  does not apply because, as stated above,
  there is no discriminating path in the left graph.
\end{example}
%
This third example is
interesting, because it highlights that~\textit{(III)} indeed only
refers to
discriminating paths with $b \bidirected y$. If then $b \rightarrow y$
in the other graph, one can conclude that Markov equivalence does not
hold. If we have a discriminating path with $b \rightarrow y$, even if
$b \bidirected y$ in the other graph, we cannot conclude the
same. However, as we have seen above, condition \textit{(III)} is not
only necessary for Markov
equivalence, it is, together with \textit{(I)} and \textit{(II)}, also
sufficient. This is because \textit{(iii)} in the SRC
could only be violated if we have a discriminating path with a
collider in one graph (hence $b \bidirected y$) and a non-collider in
the other (hence $b \rightarrow y$) and, consequently, \textit{(III)} would be
violated as well.
Hence, for the constructive-SRC, it is not necessary to consider
discriminating paths with non-colliders $b$. This entails a simplification,
which makes \textit{(III)} easier to check by hand compared to previous
formulations (we discuss the 
algorithmic advantages of the constructive-SRC
in the subsequent section) as one only has to look for
discriminating paths with collider~$b$. Moreover, it also allows us to
simplify the notion of a discriminating path to \emph{a collider path between non-adjacent $x$ and $y$ for
  which every vertex except $x$ and the one before $y$ is a parent of $y$}. 

We note that \textit{(III)} is a
generalization of the unshielded collider condition \textit{(ii)}. To see this, 
we reformulate the criterion for Markov equivalence of DAGs
(Theorem~\ref{theorem:classical:criterion:DAGs}):\footnote{There are even further formulations of \textit{(III)}, e.\,g., in terms of parametrizing sets, as pointed out by a reviewer: If there is a discriminating path for $\{x,b,y\}$ with non-collider $b$, then the set is parametrizing in both graphs.}

\begin{corollary}[\citep{verma1990equivalence,frydenberg1990chain}] \label{cor:classical:criterion:DAGs}
  Two DAGs $G_1$ and $G_2$ are Markov equivalent if, and only if,
  \begin{enumerate}[label=\textit{(\alph*)}]
  \item $G_1$ and $G_2$ have the same adjacencies and 
  \item if in $G_1$ there is %contains 
    an unshielded collider $x \rightarrow b \leftarrow y$, 
     then $G_2$ does not contain $b \rightarrow y$ and vice versa.
  \end{enumerate}
\end{corollary}
\begin{proof}
  We argue that \emph{(b)} is true if, and only if, $G_1$ and $G_2$
  have the same unshielded colliders (implying that Corollary~\ref{cor:classical:criterion:DAGs}
  is equivalent to
  Theorem~\ref{theorem:classical:criterion:DAGs}). The first direction
  is immediate: if one graph contains the unshielded collider
  $x \rightarrow b \leftarrow y$ while the other graph orients $b
  \rightarrow y$, then clearly $(x,b,y)$ is an unshielded collider in
  only one of them.

  For the other direction assume w.l.o.g.\ that $G_1$ contains an unshielded
  collider $(u,v,w)$, but $G_2$ does not. Then $G_2$ has either $u \leftarrow v$ or $v \rightarrow
  w$. In both cases \emph{(b)} is violated (set $b=v$ and either $x = w, y = u$ or $x = u, y = w$).
\end{proof}

% We can now formulate a Markov equivalence criterion for MAGs, which is
% a generalization of the DAG criterion.

\begin{corollary} \label{cor:simplified:criterion}
    Two MAGs $G_1$ and $G_2$ are Markov
    equivalent if, and only if,
  \begin{enumerate}[label=\textit{(\Alph*)}]
  \item $G_1$ and $G_2$ have the same adjacencies,
  \item if there is a collider path $\langle x, \dots, b, y \rangle$  between non-adjacent $x$ and $y$
    with every vertex but $x$, $b$ and $y$ being a parent of $y$ in $G_1$, then $G_2$ does not contain the edge $b  \rightarrow y$ and vice versa.
  \end{enumerate}
\end{corollary}
\begin{proof}
The collider path $\langle x, \dots, b, y\rangle$ may only consist of three
vertices, i.\,e., it could be an unshielded collider. If the
other graph were to contain the edge $b \rightarrow y$, then it would
not have that same collider, meaning the graphs are not Markov
equivalent by \textit{(II)}. If the collider
path consists of more than three vertices, the formulation equals \textit{(III)}.
\end{proof}

We remark that this corollary
applies only to MAGs without undirected edges
(in contrast to the constructive-SRC). However, only minor modifications are necessary to handle
undirected edges as well. We discuss these in Section~\ref{section:related:problems}.

\section{Testing Markov Equivalence of MAGs Algorithmically}\label{sec:algorithm}
In the previous section, we derived  a simple 
characterization of Markov equivalence for MAGs. In
this section, we deal with the computational side of the problem and
discuss how this new characterization can be 
tested. The algorithm we propose has a worst-case runtime
of $O(n^3)$, thus being significantly faster than previous
approaches.
Moreover, for sparse graphs of bounded degree, which are very common in
causal modeling, we even obtain expected linear time in the number of vertices.

We check the conditions \textit{(I)} and
\textit{(II)} naively. For
checking the third condition \textit{(III)}, we need to test for each $b \bidirected
y$ in $G_k$ with $k \in \{1,2\}$, for which  $b \rightarrow y$ is an
edge in the other graph $G_{k'}$ (with $k' = 3-k$), whether
there is a discriminating path for $b$ and $y$.
We do this by considering every choice of $y$ consecutively, computing
for each the bidirected connected components of its parents (we call
these the \emph{parent districts}) that support our computations.

\begin{definition}
  Given a MAG $G = (V,E)$ and a vertex $y$, the bidirected connected
  components of $G[\Pa(y)]$ are termed the parent districts of $y$ and
  denoted as $\mathcal{D}(y)$.
\end{definition}

This notion is useful as the middle part of a
discriminating path consists solely of such vertices $q_1, \dots, q_p$
in a single parent district of $y$. Once the parent districts have
been computed, one can
check if, for a certain district $D \in \mathcal{D}(y)$, there is a
vertex $x$ non-adjacent to $y$ and a parent or
sibling of $D$, which can function as the start of the discriminating
path. If this is the case, it remains to consider all vertices
$b$ which are siblings of $D$ and $y$. For these, we can conclude that
they are part of a discriminating path $x \stararrow q_1 \bidirected
\dots \bidirected q_p \bidirected b \bidirected y$. If $b
\rightarrow y$ in the other graph, the graphs are not Markov equivalent.
Figure~\ref{fig:alg:example} illustrates this approach and
Algorithm~\ref{alg:checking} gives an implementation.
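
As a hand trace of this procedure (an illustration on the graphs of Fig.~\ref{fig:examples}, not an additional result), consider Example~1 with $G_1$ the left and $G_2$ the right graph. In $G_1$, only $y$ has a parent, so the single parent district is $D = \{q\}$, and
\[
  \Pa_{G_1}(D) = \emptyset, \quad
  \Si_{G_1}(D) = \{x, b\}, \quad
  \Ne_{G_1}(y) = \{q, b\}, \quad
  (\Pa_{G_1}(D) \cup \Si_{G_1}(D)) \setminus \Ne_{G_1}(y) = \{x\}.
\]
A start vertex $x$ therefore exists, and the candidates are $\Si_{G_1}(D) \cap \Si_{G_1}(y) = \{b\}$. Since $b \rightarrow y$ in $G_2$, the graphs are reported as not Markov equivalent. In the left graph of Example~2, the parent district $\{q\}$ of $y$ again has the start vertex $x$ and the candidate $b$, but the right graph contains $b \bidirected y$ rather than $b \rightarrow y$; the only other parent district, $\{x\}$ of $q$, has no admissible start vertex, as its only sibling $b$ is adjacent to $q$. In the right graph of Example~2, every parent district equals $\{q\}$, and $q$ has neither parents nor siblings. Hence, no violation of \textit{(III)} is found in Example~2.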

% for caption, move to preambel later
\setlength\fboxrule{0.8pt}
\setlength\fboxsep{0.8mm}
% ---
\begin{figure}[htbp]
  \centering
  \begin{tikzpicture}[scale=1, >={Stealth[round,sep]}]
    \node (x) at (0,0) {$x$};
    \node (q1) at (1,1) {$q_1$};
    \node (q2) at (1,0) {$q_2$};
    \node (q3) at (1,-1) {$q_3$};
    \node (b1) at (2,0.5) {$b_1$};
    \node (b2) at (2,-0.5) {$b_2$};
    \node[rectangle, draw, line width=0.8pt] (y) at (3,0) {$y$};
    \node (g1) at (0,1) {$G_1$};
    
    \draw[->] (x) to (q2);
    \draw[->] (q3) to (x);
    \draw[<->] (q1) to (q2);
    \draw[<->] (q1) to (b1);
    \draw[<->] (q3) to (b2);
    \draw[<->] (b1) to (y);
    \draw[<->] (b2) to (y);
    \draw[->] (q1) to[bend left] (y);
    \draw[->] (q2) to (y);
    \draw[->] (q3) to[bend right] (y);

    \filldraw[orange, opacity = 0.2, rounded corners] (0.8,1.3)
    rectangle (1.2,-0.3);
    \filldraw[teal, opacity = 0.2] (1,-1) circle (0.2);
    

    \node (x) at (4,0) {$x$};
    \node (q1) at (5,1) {$q_1$};
    \node (q2) at (5,0) {$q_2$};
    \node (q3) at (5,-1) {$q_3$};
    \node (b1) at (6,0.5) {$b_1$};
    \node (b2) at (6,-0.5) {$b_2$};
    \node (y) at (7,0) {$y$};
    \node (g2) at (4,1) {$G_2$};
    
    \draw[<->] (x) to (q2);
    \draw[<-] (q3) to (x);
    \draw[<->] (q1) to (q2);
    \draw[<->] (q1) to (b1);
    \draw[->] (q3) to (b2);
    \draw[->] (b1) to (y);
    \draw[->] (b2) to (y);
    \draw[<->] (q1) to[bend left] (y);
    \draw[->] (q2) to (y);
    \draw[->] (q3) to[bend right] (y);
  \end{tikzpicture}
  \caption{Algorithm~\ref{alg:checking} checking vertex \raisebox{0.5mm}{\fbox{$y$}} in
    $G_1$ (line~\ref{line:yiter}) with $\mathcal{D}(y) =
    \{$ \colorbox{teal!20}{$\{q_3\}$} $,$ \colorbox{orange!20}{$\{q_1,
      q_2\}$} $\}$. For \colorbox{teal!20}{$D = \{q_3\}$},
    the set $(\Pa_{G_1}(D) \cup \Si_{G_1}(D)) \setminus \Ne_{G_1}(y)$ is
    empty as $x$ is a child of $D$. Hence,
    Algorithm~\ref{alg:checking} does not consider $D$
    further (line~\ref{line:xexists}).
    For \colorbox{orange!20}{$D = \{q_1, q_2\}$}, the set
    $(\Pa_{G_1}(D) \cup \Si_{G_1}(D)) \setminus \Ne_{G_1}(y)$ 
    contains $x$, which is a parent of $D$ but not a neighbor of~$y$. Moreover, $b_1$ is a sibling of both $D$ and $y$. Hence, we
    obtain the discriminating path $x \rightarrow q_2 \bidirected q_1
    \bidirected b_1 \bidirected y$. As $b_1 \rightarrow y$ in $G_2$,
    the algorithm reports that the graphs are not Markov
    equivalent. Note that for the SRC, \textit{(iii)} is violated due to
    the discriminating path $x, q_2, q_1, y$ in both graphs with $G_1$
    containing the non-collider $q_2 \bidirected q_1 \rightarrow y$ and
    $G_2$ containing the collider $q_2 \bidirected q_1 \bidirected
    y$. The discriminating path for $q_1 \bidirected y$ in
    $G_2$ and the corresponding edge $q_1 \rightarrow y$ in $G_1$ would also
    be detected by Algorithm~\ref{alg:checking}. Note that here $\{q_2\}$ is a
    parent district of $y$ ($q_1$ is not part of this district as it
    is not a parent of $y$ in $G_2$).}
  \label{fig:alg:example}
\end{figure}


\begin{theorem}\label{theorem:algorithm}
  Algorithm~\ref{alg:checking} checks whether two MAGs are Markov
  equivalent in time $O(n^3)$ for general graphs and expected time
  $O(n \cdot \Delta^2)$ for graphs with maximal degree~$\Delta$.
\end{theorem}

\begin{proof}
  For the correctness of Algorithm~\ref{alg:checking}, we need to show
  that \textit{(III)} of the constructive-SRC is
  correctly checked. If the algorithm returns \emph{Not Markov
    equivalent} in line~\ref{line:not:eq}, then there exists a $b$ and
  $y$ such that $b \bidirected y$ in one graph and $b \rightarrow
  y$ in the other. Moreover, in the former graph there exists a parent
  district $D \in \mathcal{D}(y)$ such that there is an $x \in
  (\Pa_{G_k}(D) \cup \Si_{G_k}(D)) \setminus  \Ne_{G_k}(y)$ (this set is
  non-empty in line~\ref{line:xexists}) and it is guaranteed that $b$
  is not only a sibling of $y$, but also of $D$. Hence, there is a
  discriminating path $x \stararrow q_1 \bidirected \dots \bidirected
  q_p \bidirected b \bidirected y$, with $q_1, \dots, q_p \in D$ and
  $q_1$ being the sibling/child of $x$ and $q_p$ being the sibling of
  $b$. The collider path from $q_1$ to $q_p$ exists by the definition
  of~$D$. For the same reason, we have that $q_1, \dots, q_p$ are
  parents of $y$. Note, in particular, that $b \neq x$ and both are
  not in $D$. Thus, \textit{(III)} is violated and the output is
  correct.

  For the other direction, if one of the graphs contains a violation
  of \textit{(III)}, then there is a discriminating path for $b \bidirected
  y$ (while the other graph contains $b \rightarrow y$). The existence of
  such a path is detected as all discriminating
  paths for $b \bidirected y$ have the form $x \stararrow q_1
  \bidirected \dots \bidirected q_p \bidirected b \bidirected y$ with
  $q_1, \dots, q_p$ being parents of $y$. Thus, there is a parent
  district of $y$, which has $x$ as parent/sibling and $b$ as a
  sibling. Hence, the algorithm outputs \emph{Not Markov
    equivalent} in line~\ref{line:not:eq}.
  %If \textit{(I)},
  %\textit{(II)} and \textit{(III)} are not violated, the algorithm
  %correctly outputs \emph{Markov equivalent}.

  Regarding the runtime, note that checking \textit{(I)} and
  \textit{(II)} in line~\ref{line:naivecheck} is possible in time $O(n^2)$, resp.\
  $O(n^3)$. If the graph is sparse, the runtimes $O(n \Delta)$, resp.\
  $O(n \Delta^2)$, follow (for the latter case,
  consider for each vertex all pairs of its parents and siblings and test
  whether they are adjacent\footnote{\label{footnote:adjcheck} We can perform adjacency tests in $O(1)$ by storing
    the graph as an adjacency matrix. For sparse graphs, we may avoid
    $O(n^2)$ space by using hash tables, which yields expected time $O(1)$.}).
  %
  For checking \textit{(III)}, there are $n$ vertices $y$ considered per graph
  at line~\ref{line:yiter}. Computing the parent districts for one $y$
  can be done in time $O(n^2)$ or $O(\Delta^2)$ if $\Delta$ is the
  maximal degree of the graph, as finding the connected components of a
  (sub)graph with $s$ vertices takes $O(s^2)$ time in the
  worst case. Hence, this step can be performed in time $O(n^3)$ or
  $O(n \cdot \Delta^2)$. The neighbors/parents/siblings of $y$
  and of all its parent districts may be precomputed in this phase as well.

  Further, there are two nested for loops, one over the parent
  districts (there are at most $O(n)$ or $O(\Delta)$ many) and one
  over $b$ (again there are $O(n)$ or $O(\Delta)$ choices for
  $b$)\footnote{Forming $\Si_{G_k}(D)
    \cap \Si_{G_k}(y)$ can be done in $O(n)$. To do
    it in expected time $O(\Delta)$ we may again use
    hash tables.}. Finally, line~\ref{line:othergraphcheck} can be
  performed in (expected) time $O(1)$ (see
 footnote~\ref{footnote:adjcheck}), yielding again $O(n^3)$ or $O(n \cdot
 \Delta^2)$, which proves the claim.
\end{proof}
\begin{algorithm}[htbp]
%  \caption{The Clique-Picking algorithm computes the number of
%    acyclic moral orientations of a UCCG $G$.}
  \caption{%Algorithm for 
  Checking the constructive-SRC.}
  \label{alg:checking}
  \SetKwInOut{Input}{input}\SetKwInOut{Output}{output}
  \DontPrintSemicolon
  \Input{Two MAGs $G_1 = (V_1,E_1)$, $G_2 = (V_2, E_2)$.}
  \Output{Whether $G_1$ and $G_2$ are Markov equivalent.}
  \vspace*{0.2cm}
  \If{\textit{(I)} or \textit{(II)} of the constructive-SRC is
    violated}{\label{line:naivecheck}
    \KwRet Not Markov equivalent. \;
  }
  \ForEach{$G_k = (V_k, E_k)$ with $k \in \{1,2\}$}{
    \ForEach{$y \in V_k$}{ \label{line:yiter}
      \ForEach{$D \in \mathcal{D}(y)$}{
        Compute $\Pa_{G_k}(D)$ and $\Si_{G_k}(D)$. \;
        \If{$(\Pa_{G_k}(D) \cup \Si_{G_k}(D)) \setminus
          \Ne_{G_k}(y) \not= \emptyset$}{ \label{line:xexists}
          \ForEach{$b \in \Si_{G_k}(D) \cap \Si_{G_k}(y)$}{
            Let $G_{k'}$ be the other graph, i.\,e., $k' = 3 - k$. \;
            \If{$b \rightarrow y$ in $G_{k'}$}{\label{line:othergraphcheck}
            \KwRet Not Markov
              equivalent.} \label{line:not:eq}
          } 
        }
      }
    }
  }
  \KwRet Markov equivalent. \; \label{line:eq}  
\end{algorithm}


We conclude that for graphs with maximal degree~$\Delta$ the expected runtime can be written as $O(n \cdot \Delta^2)$, which
is linear in the number of vertices for a constant $\Delta$. We note
that this is a significant
improvement over~\citet{hu2020faster}, who reported time $O(m^2) = O(n^2)$ for
sparse random graphs.
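
The bound can be retraced by a rough accounting (a sketch that only restates the proof of Theorem~\ref{theorem:algorithm}; the symbol $T_{\mathrm{III}}$ is introduced here for illustration). The parent districts of $y$ are disjoint non-empty subsets of $\Pa(y)$, so $|\mathcal{D}(y)| \le \Delta$, and for each district there are at most $\Delta$ candidates $b \in \Si(y)$. Summing the cost of checking \textit{(III)} over both graphs and all vertices $y$ gives
\[
  T_{\mathrm{III}} \;\le\; \sum_{k \in \{1,2\}} \sum_{y \in V_k}
  \Bigl( \underbrace{O(\Delta^2)}_{\text{parent districts}}
  + \underbrace{\Delta \cdot O(\Delta)}_{\text{loops over } D \text{ and } b} \Bigr)
  \;=\; O(n \cdot \Delta^2),
\]
in expectation, as each test in line~\ref{line:othergraphcheck} takes expected time $O(1)$.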
% --------- Remove this?
% A main difference (also in practice, see
% Section~\ref{section:experiments}) between those algorithm is
% that, with the constructive-SRC, it is possible to consider only the
% parents of $y$ (and their parents/siblings),
% while~\citet{hu2020faster} need to consider all ancestors of a pair (or
% a triple) of vertices at certain points of their algorithm.
% -----------------



\section{A Different Approach to Markov Equivalence Testing}\label{sec:different:approachs}
The continued improvement of algorithms for testing the Markov
equivalence of MAGs from exponential time (SRC) over $O(n^9)$ \citep{ali2009markov}
and $O(n^5)$ \citep{hu2020faster} to
$O(n^3)$ raises the
question of what the best achievable runtime is. Is it possible to test Markov
equivalence of MAGs in $O(n^2)$?
%\footnote{This would be the
%   asympotically optimal runtime as every graph can consist of $n^2$
%   edges, i.\,e., might have size $O(n^2)$.}
%
%Currently, it is at least not possible to rule
%this out.
A natural comparison is the one to the Markov equivalence of
DAGs. Here, the \emph{na\"{i}ve} test of Theorem~\ref{theorem:classical:criterion:DAGs}
can be done in $O(n^3)$: List all triples that are
unshielded colliders.
%\footnote{Checking condition (a), i.\,e., whether the graphs have the
%same adjacencies is trivial}.
This approach cannot lead to faster algorithms, as we may
have $\Omega(n^3)$ unshielded colliders (this obstacle
exists for MAGs as well). This indicates that a whole new approach is necessary.

For DAGs, such an approach is possible by utilizing \emph{completed partially
  directed acyclic graphs} (CPDAGs)~\citep{andersson1997markov}. % was
                                % recently given by~\citet{WienobstExtendability2021}.
A CPDAG is a compact and unique
representation of a Markov equivalence class. To test whether two
DAGs are Markov equivalent, one may compute the
corresponding CPDAGs $C_1$ and $C_2$ and  check
whether $C_1=C_2$.
The complexity of this approach hinges on the complexity of converting
DAGs to CPDAGs. There are two algorithmic strategies for
this task: The first one imitates the PC algorithm for learning the
  CPDAG from observational
  data~\citep{spirtes2000causation}. %,kalisch2007estimating}. 
  First,
  initialize $C$ as the skeleton of
  $D$. Second, set all v-structures of $D$ in $C$. Third, orient
  further edges by repeated application of the first three Meek
  rules~\citep{Meek1995}. The second strategy  constructs the CPDAG from $D$
  based on a topological ordering of $D$ while utilizing
  characterizations of CPDAGs and Markov equivalence classes of
  DAGs~\citep{andersson1997markov}. This approach was used by~\citet{chickering1995transformational}, who proposed a clever linear-time (i.\,e.,
  $O(n+m)$) algorithm for the DAG-to-CPDAG task.

Hence, based on the second approach, testing Markov equivalence of
DAGs can be done in linear time
$O(n+m)$.\footnote{The runtime of
  the first approach depends on the complexity of
  orienting the graph with the Meek rules. \citet{WienobstExtendability2021} showed that it is
   possible to perform this step in $O(n^3)$.}
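
As a small illustration of the CPDAG-based test (a standard textbook example, stated here in our own notation), consider the three DAGs
\[
  D_1\colon a \rightarrow b \rightarrow c, \qquad
  D_2\colon a \leftarrow b \leftarrow c, \qquad
  D_3\colon a \rightarrow b \leftarrow c.
\]
$D_1$ and $D_2$ have the same skeleton and no v-structure, so both have the CPDAG $a - b - c$ and are Markov equivalent. $D_3$ contains the v-structure $a \rightarrow b \leftarrow c$, so its CPDAG is $a \rightarrow b \leftarrow c$, which differs from $a - b - c$; hence, $D_3$ is not Markov equivalent to $D_1$ or $D_2$.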

Coming back to MAGs, we note that the
first approach for DAGs can be used as
well. For a MAG $G$, one can imitate the FCI
algorithm~\citep{spirtes2000causation}, which is
the counterpart of the PC algorithm under latent confounding/selection
bias, to obtain its corresponding \emph{partial ancestral graph}
(PAG)~\citep{zhang2008causal}, which is, analogously to the CPDAG for
DAGs, a compact and unique representation of an equivalence class. 
This is done by first initializing $P$ as the skeleton of $G$, setting the unshielded colliders according to
$G$ and, finally, applying the 10 completion rules given
by~\citet{zhang2008completeness} (see also~\cite{ali2005towards}).
This approach yields a polynomial-time algorithm for
testing Markov equivalence of MAGs, but with a rather
large polynomial:\footnote{The time is a
  polynomial of 
  order roughly $O(m^3\cdot n)$ as for every undirected
  edge we have to check whether global conditions hold
  (\citet{zhang2008completeness} briefly discuss the runtime,
  mentioning $O(n\cdot m)$ for checking the fourth rule for a single edge).} One can compute the PAGs $P_1$, $P_2$ for the given
MAGs $G_1$, $G_2$ and check whether they are
identical\footnote{This strategy has some parallels
  to~\cite{ali2009markov}, due to the fact that colliders with order
  also play a key role in the completeness of the FCI rules~\cite{ali2005towards}.}. 

The second strategy currently cannot be translated to MAGs as there is no counterpart for the DAG-to-CPDAG algorithm to
directly transform a MAG into a PAG. Hence, a better understanding of
PAGs might be needed for further progress and we deem this an important
topic for future research.

%
\begin{figure*}[!h]
  \subexperiment{sparse}{3n}
  \begin{minipage}{0.49\textwidth}
    \centering\small
    \begin{tabular}{cccccc}
          & \hbox to 0pt{\hss\hspace{2cm} Algorithm \textsc{he}\hss} &             & \hbox to 0pt{\hss\hspace{1.75cm} Algorithm \textsc{c-src}\hss} &             \\[0.25ex]
      $n$ & Avg.\ Time                                               & Std.\ Dev.\ & Avg.\ Time                                                   & Std.\ Dev.\ \\[1ex]
      250  & 0.0487s                                                  & 0.0101      & 0.0015s                                                      & 0.0009      \\
      500  & 0.1058s                                                  & 0.0388      & 0.0032s                                                      & 0.0051      \\
      750  & 0.1605s                                                  & 0.0279      & 0.0049s                                                      & 0.0065      \\
      1000 & 0.2587s                                                  & 0.0594      & 0.0062s                                                      & 0.0058      \\
      1250 & 0.3579s                                                  & 0.0684      & 0.0085s                                                      & 0.0081      \\
      1500 & 0.4629s                                                  & 0.0789      & 0.0091s                                                      & 0.0058      \\
      1750 & 0.5373s                                                  & 0.0626      & 0.0106s                                                      & 0.0021      \\
      2000 & 0.6794s                                                  & 0.0778      & 0.0119s                                                      & 0.0024     \\
    \end{tabular}
  \end{minipage}
  \begin{minipage}{0.49\textwidth}
    \centering
    \begin{tikzpicture}[xscale=0.8,yscale=1]
      \instance{1}{0.04869104013600004}{0.001498422928000001}
      \instance{2}{0.10575651010000002}{0.003174151063999999}
      \instance{3}{0.16045731381599992}{0.004898826199999999}
      \instance{4}{0.258739845244}{0.006242116452000001}
      \instance{5}{0.3578947155760001}{0.008520572143999998}
      \instance{6}{0.46292616947200016}{0.009160666527999996}
      \instance{7}{0.5372881295360001}{0.010644908571999996}
      \instance{8}{0.6793816137919997}{0.011866126704}              
      
      \advantage{1}{0.04869104013600004}{0.001498422928000001} 
      \advantage{2}{0.10575651010000002}{0.003174151063999999} 
      \advantage{3}{0.16045731381599992}{0.004898826199999999} 
      \advantage{4}{0.258739845244}{0.006242116452000001}      
      \advantage{5}{0.3578947155760001}{0.008520572143999998}  
      \advantage{6}{0.46292616947200016}{0.009160666527999996} 
      \advantage{7}{0.5372881295360001}{0.010644908571999996}  
      \advantage{8}{0.6793816137919997}{0.011866126704}        
    
      \leftcaption
      \topcaption{3}{1/250, 2/500, 3/750, 4/1000, 5/1250, 6/1500, 7/1750, 8/2000}    
      \axis{9}
    \end{tikzpicture}
  \end{minipage}
  
 \bigskip \bigskip

  \subexperiment{dense}{10n}
  \begin{minipage}{0.49\textwidth}
    \centering\small
    \begin{tabular}{cccccc}
          & \hbox to 0pt{\hss\hspace{2cm} Algorithm \textsc{he}\hss} &             & \hbox to 0pt{\hss\hspace{1.75cm} Algorithm \textsc{c-src}\hss} &             \\[0.25ex]
      $n$ & Avg.\ Time                                               & Std.\ Dev.\ & Avg.\ Time                                                     & Std.\ Dev.\ \\[1ex]
      25  & 0.0011s                                                   & 0.0007      & 0.0004s                                                         & 0.0004      \\
      50  & 0.0169s                                                   & 0.0028      & 0.0028s                                                         & 0.0007      \\
      75  & 0.0912s                                                   & 0.0246      & 0.0092s                                                         & 0.0014      \\
      100 & 0.4004s                                                   & 0.1037      & 0.0263s                                                         & 0.0129      \\
      125 & 1.0339s                                                   & 0.2283      & 0.0466s                                                         & 0.0091      \\
      150 & 2.3356s                                                   & 0.4349      & 0.0808s                                                         & 0.0084      \\
      175 & 4.7182s                                                   & 0.8741      & 0.1303s                                                         & 0.0106      \\
      200 & 8.9285s                                                   & 1.5242      & 0.2033s                                                         & 0.0129      \\
    \end{tabular}
  \end{minipage}
  \begin{minipage}{0.49\textwidth}
    \begin{tikzpicture}[xscale=0.8,yscale=1]
      \begin{scope}[yscale=0.225]
        \instance{1}{0.001084481276}{0.00041002779600000005}
        \instance{2}{0.016904815084}{0.0027677236519999993}
        \instance{3}{0.09179040754000001}{0.009204533775999998}
        \instance{4}{0.40043582177599985}{0.02625787479199999}
        \instance{5}{1.0338860302719999}{0.04664694562399997}
        \instance{6}{2.3356289036999986}{0.08077340449200003}
        \instance{7}{4.718184675896001}{0.13029921138000003}
        \instance{8}{8.92854898946001}{0.20326768029999992}           
      \end{scope}

      \advantage{1}{0.001084481276}{0.00041002779600000005}     
      \advantage{2}{0.016904815084}{0.0027677236519999993}      
      \advantage{3}{0.09179040754000001}{0.009204533775999998}  
      \advantage{4}{0.40043582177599985}{0.02625787479199999}   
      \advantage{5}{1.0338860302719999}{0.04664694562399997}    
      \advantage{6}{2.3356289036999986}{0.08077340449200003}    
      \advantage{7}{4.718184675896001}{0.13029921138000003}     
      \advantage{8}{8.92854898946001}{0.20326768029999992}      
      
      \leftcaption
      \topcaption{3}{1/25, 2/50, 3/75, 4/100, 5/125, 6/150, 7/175, 8/200}    
      \axis{9}
    \end{tikzpicture}
  \end{minipage}

  
  \caption{\emph{Advantage plots} that compare our implementation
    (\textsc{c-src}) with the algorithm by Hu and Evans
    (\textsc{he}). Each bar corresponds to an experiment on random
    graphs with $n$ vertices (denoted above the bars) and $k = 3n$
    (top image) or $k = 10n$ (bottom image) edges, respectively. For each
    experiment we measured the average time needed by both
    algorithms over 250 instances. If \textsc{c-src} uses $t_1$ seconds and \textsc{he} uses
    $t_2$ seconds, then the \emph{advantage} of \textsc{c-src} over
    \textsc{he} is defined by $t_2-t_1$ (i.\,e., the advantage is
    positive iff \textsc{c-src} is faster). The advantage (in seconds)
    is shown below the bars.}
  \label{figure:experiments}
\end{figure*}

\section{Related Problems}
\label{section:related:problems}
So far, we have focused on the problem of testing Markov
equivalence of MAGs without undirected
edges. In this section, we discuss the connection to more general
formulations of the problem.
%
First, we note that the constructive-SRC
and Algorithm~\ref{alg:checking} also work for MAGs with undirected
edges. This is because
the SRC also holds in this setting
and because there cannot be an undirected edge in a discriminating path
(in particular, the edge between $b$ and $y$ cannot be undirected).
%
For Corollary~\ref{cor:simplified:criterion} a modification is
necessary: condition \textit{(II)} has to be changed to ``If there is
a collider path $x, \dots, b, y$ between non-adjacent $x$ and $y$ with
every vertex but $x$, $b$ and $y$ being a parent of~$y$ in one graph, then
the other graph contains neither the edge $b \rightarrow y$ \emph{nor}
the edge $b - y$.'' This is necessary as a collider
$u \stararrow v \arrowstar w$ in one graph might correspond to a
non-collider $u - v - w$ in the other graph~--~and these graphs are, of
course, not Markov equivalent.


Further related problems are obtained by removing the maximality or
the ancestrality requirement (or both). In that case, we deal with
general \emph{acyclic directed mixed graphs} (ADMGs). These are graphs
that may contain directed and bidirected edges with
the only requirement that there is no directed cycle.
The SRC and constructive-SRC do not apply for ADMGs as they
explicitly use the maximality and ancestrality properties. However, one can transform ADMGs
into equivalent MAGs and, thus, test the Markov equivalence of ADMGs
using the algorithms for MAGs. As it turns out, the currently fastest
algorithm for the ADMG-to-MAG transformation (Algorithm~2
in~\citet{hu2020faster}) requires time $O(n^4)$ and is, thus, the
bottleneck in this approach (testing the equivalence of MAGs is in
$O(n^3)$ by Theorem~\ref{theorem:algorithm}).


It is unclear to us whether this transformation can be performed in
$O(n^3)$. A central part of it involves the computation of so-called \emph{inducing
  paths}, 
where it has to be checked for every pair of vertices $(x,y)$
whether there is a collider path between $x$ and $y$ via vertices in
$\An(x,y)$. 
Since we have $\Theta(n^2)$ such pairs and
since, further,
graph traversal is in $\Omega(n+m)$, this direct approach necessarily
produces a workload of $\Omega(m\cdot n^2)$. We believe that it will be
central for the development of faster ADMG equivalence tests to better
understand the complexity of ADMG-to-MAG and
consider this an interesting question for further work.
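
To make the counting argument above explicit, the following back-of-the-envelope bound may help. It assumes (as an idealization of the direct approach, not as a statement about Algorithm~2 of \citet{hu2020faster}) that every pair $(x,y)$ triggers its own graph traversal costing at least $c\,(n+m)$ steps for some constant $c > 0$:
\[
  \binom{n}{2} \cdot c\,(n+m) = \frac{c\,n(n-1)}{2}\,(n+m) \in \Omega\bigl(n^2 (n+m)\bigr).
\]
For dense graphs with $m \in \Theta(n^2)$, this amounts to $\Omega(n^4)$, so under this assumption the direct approach cannot improve on the $O(n^4)$ bound mentioned above.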


\section{Experiments} \label{section:experiments}
To emphasize the practical effectiveness of the constructive-SRC and,
in particular, Algorithm~\ref{alg:checking}, we compare it experimentally
with the algorithm proposed
by~\citet{hu2020faster} on synthetic data. Both algorithms were implemented in the Julia
programming language~\citep{bezanson2017julia} and we ran the
experiments on a desktop computer with an Intel(R) Core(TM) i7-8565U
CPU and 16\,GB of RAM.\footnote{The code is available at \url{https://github.com/mwien/magequivalence}.}
Synthetic MAGs were generated with the process described
by~\citet{hu2020faster}: fix a
topological ordering $\tau$ of the vertices, then add $k$ edges uniformly at
random, and finally direct each edge with probability $1/2$ according to~$\tau$. 
Replacing remaining undirected edges with bidirected edges yields an ADMG,
which can in turn be transformed into a MAG, as discussed in Section~\ref{section:related:problems}.
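
As a quick sanity check of this process (a worked calculation added here, not an additional experiment): since each of the $k$ edges is directed with probability $1/2$, the expected edge counts of the generated ADMG are
\[
  \mathbb{E}[\#\rightarrow] = \mathbb{E}[\#\leftrightarrow] = \frac{k}{2},
  \qquad \text{e.\,g., } \frac{3 \cdot 250}{2} = 375 \text{ for } n = 250,\ k = 3n.
\]
This is in line with the averages of $373.844$ directed and $376.156$ bidirected edges reported for this setting in Table~\ref{table:average-edges}.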


For a fair comparison with the experiments of~\citet{hu2020faster}, we
run a modified version of
Algorithm~\ref{alg:checking}. It generates, for \emph{a single MAG}, a set
of all adjacencies $A$ (for checking \textit{I}), a set of all v-structures $V$ (for
checking \textit{II}), and the set $C$ of all $b \bidirected y$ that are part of
a discriminating path, as well as the set $N$ of all $b \rightarrow y$ (for checking \textit{III}). Clearly, if one were to generate these
sets for two MAGs $G_1$ and $G_2$, testing %Markov 
equivalence would
reduce to %the simple task of 
checking whether $A_1 = A_2$, $V_1 =
V_2$, $C_1 \cap N_2 = \emptyset$, and $C_2 \cap N_1 = \emptyset$.
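
Written as a single condition, and identifying each edge with the ordered pair $(b,y)$ of its endpoints (an encoding assumed here for illustration), the test reads
\[
  A_1 = A_2 \;\wedge\; V_1 = V_2 \;\wedge\; C_1 \cap N_2 = \emptyset \;\wedge\; C_2 \cap N_1 = \emptyset.
\]
As a hypothetical toy instance, suppose $A_1 = A_2$ and $V_1 = V_2$, but $C_1 = \{(b,y)\}$ and $N_2 = \{(a,b), (b,y)\}$. Then $C_1 \cap N_2 = \{(b,y)\} \neq \emptyset$, so the two graphs would be reported as not equivalent: $b \leftrightarrow y$ lies on a discriminating path in $G_1$, while $G_2$ contains $b \rightarrow y$.
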
\begin{table} [t] %tpb]
  \caption{Distribution of directed edges (\!$\rightarrow$\!) and
    bidirected edges (\!$\leftrightarrow$\!) in the randomly generated ADMGs
    and in the corresponding MAGs. For every row we generated
    250 random ADMGs with $n$ vertices and $k$ edges,
    and show the average number of directed and bidirected edges they contain.}
  \label{table:average-edges}
  \scriptsize
  \setlength\tabcolsep{3.25pt}
  \begin{tabular}{rr rrr rrr}
               &           &                     & \hbox to 0pt{\hss\small\hspace{-.5cm} ADMG\hss} &                                         &                     & \hbox to 0pt{\hss\small\hspace{-.5cm} MAG\hss} &                                        \\
    \small $n$ & \small$k$ & \multicolumn{1}{c}{\small$\rightarrow$} &  \multicolumn{1}{c}{\small$\leftrightarrow$}                         &  \multicolumn{1}{r}{\normalsize$\sfrac{\small\rightarrow\!}{\!\small\leftrightarrow\!}$} &  \multicolumn{1}{c}{\small$\rightarrow$} &  \multicolumn{1}{c}{\small$\leftrightarrow$ }                       &  \multicolumn{1}{r}{\normalsize$\sfrac{\small\rightarrow\!}{\!\small\leftrightarrow\!}$} \\[1ex]
    250        & 3$n$        & 373.844             & 376.156                                         & 0.9962                                  & 394.38              & 401.88                                         & 0.9840                                 \\
    500        & 3$n$        & 752.536             & 747.464                                         & 1.0081                                  & 772.148             & 771.456                                        & 1.0022                                 \\
    750        & 3$n$        & 1125.812            & 1124.188                                        & 1.0023                                  & 1146.012            & 1147.944                                       & 0.9991                                 \\
    1000       & 3$n$        & 1500.308            & 1499.692                                        & 1.0010                                  & 1519.616            & 1523.628                                       & 0.9980                                 \\
    1250       & 3$n$        & 1873.564            & 1876.436                                        & 0.9990                                  & 1892.58             & 1899.968                                       & 0.9966                                 \\
    1500       & 3$n$        & 2251.344            & 2248.656                                        & 1.0016                                  & 2270.228            & 2272.872                                       & 0.9992                                 \\
    1750       & 3$n$        & 2627.564            & 2622.436                                        & 1.0023                                  & 2647.156            & 2646.568                                       & 1.0005                                 \\
    2000       & 3$n$        & 3002.328            & 2997.672                                        & 1.0018                                  & 3021.66             & 3021.5                                         & 1.0003                                 \\
    25         & 10$n$       & 124.392             & 125.608                                         & 0.9984                                  & 246.432             & 49.544                                         & 5.4013                                 \\
    50         & 10$n$       & 250.912             & 249.088                                         & 1.0113                                  & 824.804             & 306.212                                        & 2.7741                                 \\
    75         & 10$n$       & 374.312             & 375.688                                         & 0.9989                                  & 1639.868            & 796.848                                        & 2.0995                                 \\
    100        & 10$n$       & 498.536             & 501.464                                         & 0.9962                                  & 2684.468            & 1501.252                                       & 1.8099                                 \\
    125        & 10$n$       & 625.692             & 624.308                                         & 1.0038                                  & 3902.696            & 2456.212                                       & 1.6067                                 \\
    150        & 10$n$       & 749.684             & 750.316                                         & 1.0006                                  & 5314.2              & 3632.024                                       & 1.4762                                 \\
    175        & 10$n$       & 874.788             & 875.212                                         & 1.0007                                  & 6926.444            & 4999.7                                         & 1.3952                                 \\
    200        & 10$n$       & 1002.088            & 997.912                                         & 1.0051                                  & 8662.34             & 6582.824                                       & 1.3236                                 \\
  \end{tabular}
\end{table}


This approach provides a finer control over the experiments as it avoids
the possibility of an ``early stopping'' at line~\ref{line:naivecheck}
or line~\ref{line:not:eq}
of Algorithm~\ref{alg:checking} (which can happen if the given MAGs are not equivalent and
would give an unfair advantage to our algorithm).
It also enables us to consider
single MAGs, which are simpler to generate randomly than, e.\,g., two
random Markov equivalent MAGs. The reported runtimes can be viewed,
for our algorithm as well as for that of~\citet{hu2020faster}, as essentially half the
time required when two Markov equivalent MAGs are compared (because
these steps have to be performed for both graphs).

For the choice of the parameter $k$ (the number of edges), we follow,
on the one hand, \citet{hu2020faster} and set it to $k = 3n$ (see
the top part of Fig.~\ref{figure:experiments}) and, on the other hand, also consider denser
graphs with $k = 10n$ (bottom part of~Fig.~\ref{figure:experiments}). Note that $k$ is the number of edges
in the generated ADMG and not in the MAGs on which the algorithms
run. The transformation of an ADMG into a MAG might generate new edges, see
Table~\ref{table:average-edges}. For $k = 3n$, one usually only sees a small increase, while a significant
number of edges is added for $k=10n$. The proportion of directed and
bidirected edges also changes in the latter case: the graphs
usually contain more directed edges than bidirected ones.
%
For our experiments, we ran both algorithms on the same 250 randomly
generated graphs for each choice of parameters and report the average
time they used in Fig.~\ref{figure:experiments}.
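
To quantify the edge increase described above, one can add up the MAG columns of Table~\ref{table:average-edges} (a derived calculation from the reported averages, with ratios rounded to two decimals):
\begin{align*}
  n = 250,\ k = 3n = 750: &\quad 394.38 + 401.88 = 796.26 \approx 1.06\,k, \\
  n = 200,\ k = 10n = 2000: &\quad 8662.34 + 6582.824 = 15245.164 \approx 7.62\,k.
\end{align*}
Hence, in these two settings the MAGs contain on average about $6\%$ more edges than the ADMGs in the sparse case, but more than seven times as many in the dense case.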






It can be seen that Algorithm~\ref{alg:checking} is faster for all
choices of $n$ and $k$. We can also observe that for ever larger graphs, the
advantage %over the algorithm by~\citet{hu2020faster}
increases~--~which suggests that the algorithm indeed has a better
asymptotic behavior. This phenomenon becomes even more pronounced
on the dense (and, thus, more difficult) instances. Finally, the
absolute runtime of Algorithm~\ref{alg:checking} is generally
extremely low (for the considered inputs only fractions of a
second\footnote{Generating significantly harder instances is not a
  trivial task as the random generation process relies on the
  ADMG-to-MAG task, which currently cannot be performed faster than in
  $O(n^4)$.}).
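
For concreteness, the advantage $t_2 - t_1$ can be recomputed from the tables in Fig.~\ref{figure:experiments} for the smallest and largest instances (a worked calculation based on the rounded table entries; the ratios $t_2/t_1$ are derived here and are not part of the figure):
\begin{align*}
  k = 3n,\ n = 250: &\quad 0.0487 - 0.0015 = 0.0472\,\text{s}, & t_2/t_1 &\approx 32, \\
  k = 3n,\ n = 2000: &\quad 0.6794 - 0.0119 = 0.6675\,\text{s}, & t_2/t_1 &\approx 57, \\
  k = 10n,\ n = 25: &\quad 0.0011 - 0.0004 = 0.0007\,\text{s}, & t_2/t_1 &\approx 3, \\
  k = 10n,\ n = 200: &\quad 8.9285 - 0.2033 = 8.7252\,\text{s}, & t_2/t_1 &\approx 44.
\end{align*}
In both settings, the absolute advantage as well as the ratio grow from the smallest to the largest instance.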





\section{Conclusions}
We proposed the constructive-SRC -- a new %graphical
criterion for~the Markov
equivalence of MAGs. It is expressed in terms of natural graphical concepts, 
can easily be tested %checked 
by hand for smaller graphs,
and leads to the first cubic-time algorithm for this problem.
%Markov equivalence test for MAGs. 

For further work, it remains an open problem whether the
runtime can be reduced to $O(n^2)$, as is possible for DAGs. We
argued that a different approach is necessary, as any
approach that explicitly considers all unshielded colliders has a
complexity of $\Omega(n^3)$. 
Generally, a better understanding of Markov equivalence classes of
MAGs may facilitate the translation of further research from the DAG
setting, e.\,g., regarding active learning~\citep{hauser2012characterization} or the question of computing
the size of Markov equivalence
classes~\citep{wienobst2021polynomial}, which could add to recent results
in this
direction~\citep{kocaoglu2019characterization,wang2021actively}. 
%
Finally, due to the improved runtime for equivalence testing of
MAGs, the ADMG-to-MAG transformation is currently the
bottleneck for the problem on acyclic mixed graphs, making the design
of a faster transformation algorithm an
important task for further work.

\newpage
\bibliography{wienobst_482}

% TODO: acknowledgements?
% \begin{acknowledgements}
% ...
% \end{acknowledgements}

\end{document}
