%\documentclass{uai2022} % for initial submission
\documentclass[accepted]{uai2022} % after acceptance, for a revised
% version; also before submission to
% see how the non-anonymous paper
% would look like
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2022} % ptmx math instead of Computer
% Modern (has noticable issues)
% \documentclass[mathfont=newtx]{uai2022} % newtx fonts (improves upon
% ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
\bibliographystyle{plainnat}
\renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams


%\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}

\usepackage{nicefrac}

\usepackage{enumitem}


\newtheorem{lemma}{Lemma}
\newtheorem{proposition}{Proposition}
\newtheorem{corollary}{Corollary}
\newtheorem{theorem}{Theorem}
\newtheorem{sketch}{Sketch}


\theoremstyle{definition}
\newtheorem{definition}{Definition}
\newtheorem{assumption}{Assumption}


%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\reals}{\mathbb{R}}
\newcommand{\realsnonneg}{\mathbb{R}_{\geq0}}
\newcommand{\realspos}{\mathbb{R}_{>0}}
\newcommand{\nats}{\mathbb{N}}
\newcommand{\natswith}{\mathbb{N}_0}

\providecommand{\coloneqq}{:=}

\newcommand{\states}{\mathcal{X}}
\newcommand{\gamblesX}{\smash{\reals^\states}}
\newcommand{\gamblesY}[1]{\smash{\reals^{#1}}}
\newcommand{\gamblesAc}{\smash{\reals^{A^c}}}

\newcommand{\ones}{\mathbf{1}}

\newcommand{\abs}[1]{\left\lvert #1 \right\rvert}
\newcommand{\norm}[1]{\left\lVert #1 \right\rVert}
\newcommand{\inftynorm}[1]{\left\lVert #1 \right\rVert_\infty}

\newcommand{\resAc}{\vert_{A^c}}
\newcommand{\upX}{\!\!\uparrow_\mathcal{X}}

\newcommand{\subgen}{G}
\newcommand{\lsubgen}{\underline{G}}
\newcommand{\usubgen}{\overline{G}}
\newcommand{\rate}{Q}
\newcommand{\rateset}{\mathcal{Q}}
\newcommand{\lrate}{\underline{Q}}
\newcommand{\ltrans}{\underline{P}}
\newcommand{\urate}{\overline{Q}}

\newcommand{\ind}[1]{\mathbb{I}_{#1}}

\newcommand{\derivt}{\frac{\mathrm{d}}{\mathrm{d}\,t}}

\newcommand{\rateforp}{{}^P\!Q}


\title{Hitting Times for Continuous-Time Imprecise-Markov Chains}

% The standard author block has changed for UAI 2022 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is automatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
\author[1]{\href{mailto:<t.e.krak@tue.nl>?Subject=Your UAI 2022 paper}{Thomas~Krak}{}}
% Add affiliations after the authors
\affil[1]{%
	Department of Mathematics and Computer Science\\
	Eindhoven University of Technology\\
	Eindhoven, The Netherlands
}

\begin{document}
\maketitle

\begin{abstract}
	We study the problem of characterizing the expected hitting times for a robust generalization of continuous-time Markov chains. This generalization is based on the theory of imprecise probabilities, and the models with which we work essentially constitute sets of stochastic processes. Their inferences are tight lower- and upper bounds with respect to variation within these sets. 
	
	We consider three distinct types of these models, corresponding to different levels of generality and structural independence assumptions on the constituent processes. 
	
	Our main results are twofold. First, we demonstrate that the expected hitting times for all three types are equivalent. Second, we show that these inferences are described by a straightforward generalization of a well-known linear system of equations that characterizes expected hitting times for traditional time-homogeneous continuous-time Markov chains.
\end{abstract}

\section{Introduction}\label{sec:intro}

We consider the problem of characterizing the \emph{expected hitting times} for continuous-time \emph{imprecise-Markov chains}~\citep{skulj2015efficient,krak2017imprecise,krak2021phd,erreygers2021phd}. These are \emph{robust}, set-valued generalizations of (traditional) Markov chains~\citep{norris1998markov}, based on the theory of \emph{imprecise probabilities}~\citep{Walley:1991vk,augustin2013:itip}. From a sensitivity-analysis perspective, we may interpret these sets as hedging against model-uncertainties with respect to a model's numerical parameters and/or structural (independence) assumptions.

The inference problem of hitting times essentially deals with the question of how long it will take the underlying system to reach some particular subset of its states. This is a common and important problem in such fields as, e.g., reliability analysis, where it can capture the expected time-to-failure of a system; and epidemiology, to model the expected time-until-extinction of an epidemic. For imprecise-Markov chains, then, we are interested in evaluating these quantities in a manner that is robust against, and conservative with respect to, any variation that is compatible with one's uncertainty about the model specification.

\citet{erreygers2021phd} has recently obtained some partial results towards characterizing such inferences, but has not been able to give a complete characterization and has largely studied the finite-time horizon case. The problem of hitting times for \emph{discrete-time} imprecise-Markov chains was previously studied by~\citet{krak2019hitting,krak2020computing}. In the present work, we largely emulate their results and extend them to the continuous-time setting.

We will be concerned with three different types of imprecise-Markov chains. These are all sets of stochastic processes that are in a specific sense compatible with a given set of numerical parameters, but the three types differ in the independence properties of their elements. In particular, they correspond to (i) a set of (\emph{time-})\emph{homogeneous} Markov chains, (ii) a set of (not-necessarily homogeneous) Markov chains, and (iii) a set of general---not-necessarily homogeneous nor Markovian---stochastic processes. It is known (and perhaps not very surprising) that inferences with respect to these three models do not in general agree; see e.g.~\citep{krak2021phd} for a detailed analysis of their differences.

However, our first main result in this work is that the expected hitting time is \emph{the same} for these three different types of models. Besides being of theoretical interest, we want to emphasize the power of this result: it means that even if a practitioner using Markov chains would be uncertain whether the system they are studying is truly homogeneous and/or Markovian, relaxing these assumptions would not influence inferences about the hitting times in this sense. Purely pragmatically, it also means that we can use computational methods tailored to any one of these types of models, to compute these inferences. 

Our second main result is that these hitting times are characterized by a generalization of a well-known system of equations that holds for continuous-time homogeneous Markov chains; see Proposition~\ref{prop:precise_cont_system} for this linear system.

The remainder of this paper is structured as follows. In Section~\ref{sec:prelim} we introduce the basic required concepts that we will use throughout, formalizing the notion of stochastic processes and defining the inference problem of interest. In Section~\ref{sec:imp_markov}, we define the various types of imprecise-Markov chains that we use throughout this work. We spend some effort in Section~\ref{subsec:subspace_dynamics} to study the transition dynamics of these models, from a perspective that is particularly relevant for the inference problem of hitting times. In Section~\ref{sec:hits_as_limits} we explain and sketch the proofs of our main results, and we give a summary in Section~\ref{sec:summary}. 


Because we have quite a lot of conceptual material to cover before we can explain our main results, we are not able to fit any real proofs in the main body of this work. Instead, these---together with a number of technical lemmas---have largely been relegated to the supplementary material.

\section{Preliminaries}\label{sec:prelim}

Throughout, we consider a fixed, finite \emph{state space} $\states$ with at least two elements. This set contains all possible values for some abstract underlying process. An element of $\mathcal{X}$ is called a \emph{state}, and is usually generically denoted as $x\in\mathcal{X}$.

We use $\reals, \realsnonneg$, and $\realspos$ to denote the reals, the non-negative reals, and the positive reals, respectively. $\nats$ denotes the natural numbers \emph{without} zero, and we let $\natswith\coloneqq\nats\cup\{0\}$. 

For any $\mathcal{Y}\subseteq\states$, we use $\gamblesY{\mathcal{Y}}$ to denote the vector space of real-valued functions on $\mathcal{Y}$; in particular, $\gamblesX$ denotes the space of all real functions on $\states$. We use $\norm{\cdot}$ to denote the supremum norm on any such space; for any $f\in\gamblesY{\mathcal{Y}}$ we let $\norm{f}\coloneqq \max\{\abs{f(x)}\,:\,x\in\mathcal{Y}\}$. Throughout, we make extensive use of \emph{indicator functions}, which are defined for all $A\subseteq \mathcal{Y}$ as $\ind{A}(x)\coloneqq 1$ if $x\in A$ and $\ind{A}(x)\coloneqq 0$, otherwise. We use the shorthand $\ind{y}\coloneqq\ind{\{y\}}$. Let $\ones$ denote the function that is identically equal to $1$; its dimensionality is to be understood from context. 

A map $M:\gamblesY{\mathcal{Y}}\to\gamblesY{\mathcal{Y}}$ is also called an \emph{operator}, and we denote its evaluation in $f\in\gamblesY{\mathcal{Y}}$ as $Mf$. If it holds for all $f\in\gamblesY{\mathcal{Y}}$ and all $\lambda\in\realsnonneg$ that $M(\lambda f)=\lambda Mf$, then $M$ is called \emph{non-negatively homogeneous}. For any non-negatively homogeneous operator $M$ on $\gamblesY{\mathcal{Y}}$, we define the induced operator norm $\norm{M}\coloneqq \sup\{\norm{Mf}\,:\,f\in\gamblesY{\mathcal{Y}},\norm{f}=1\}$. We reserve the symbol $I$ to denote the identity operator on any space; the domain is to be understood from context.

Note that any \emph{linear} operator is also non-negatively homogeneous. Moreover, if $M$ is linear it can be represented as an $\abs{\mathcal{Y}}\times\abs{\mathcal{Y}}$ matrix by arbitrarily fixing an ordering on $\mathcal{Y}$. However, without fixing such an ordering, we simply use $M(x,y)\coloneqq M\ind{y}(x)$ to denote the entry in the $x$-row and $y$-column of such a matrix, for any $x,y\in\mathcal{Y}$. For any $f\in\gamblesY{\mathcal{Y}}$ and $x\in\mathcal{Y}$ we then have $Mf(x)=\sum_{y\in\mathcal{Y}}M(x,y)f(y)$, so that $Mf$ simply represents the usual matrix-vector product of $M$ with the (column) vector $f$. In the sequel, we interchangeably refer to linear operators also as matrices. We note the well-known equality $\norm{M}=\max_{x\in\mathcal{Y}}\sum_{y\in\mathcal{Y}}\abs{M(x,y)}$ for the induced matrix norm.

\subsection{Processes \& Markov Chains}

We now turn to stochastic processes, which are fundamentally the subject of this work. The typical (measure-theoretic) way to define a stochastic process is simply as a family $(X_i)_{i\in\mathcal{I}}$ of random variables with index set $\mathcal{I}$. This index set represents the time domain of the stochastic process. The random variables are understood to be taken with respect to some underlying probability space $(\Omega_{\mathcal{I}},\mathcal{F}_{\mathcal{I}},P)$, where $\Omega_\mathcal{I}$ is a set of \emph{sample paths}, which are functions from $\mathcal{I}$ to $\states$ representing possible realizations of the evolution of the underlying process through $\states$. The random variables $X_i$, $i\in\mathcal{I}$ are canonically the maps $X_i:\omega\mapsto \omega(i)$ on $\Omega_{\mathcal{I}}$. 

However, for our purposes it will be more convenient to instead refer to the \emph{probability measure} $P$ as the stochastic process. Different processes $P$ may then be taken over the same measurable space $(\Omega_\mathcal{I},\mathcal{F}_\mathcal{I})$, using the same canonical variables $(X_i)_{i\in\mathcal{I}}$ for all these processes.

In this work we will use both \emph{discrete}- and \emph{continuous}-time stochastic processes, which corresponds to choosing $\mathcal{I}=\natswith$ or $\mathcal{I}=\realsnonneg$, respectively. In both cases we take $\mathcal{F}_\mathcal{I}$ to be the $\sigma$-algebra generated by the cylinder sets; this ensures that all functions that we consider are measurable. 

In the discrete-time case, we let $\Omega_{\natswith}$ be the set of \emph{all} functions from $\natswith$ to $\states$. A discrete-time stochastic process $P$ is then simply a probability measure on $(\Omega_{\natswith},\mathcal{F}_{\natswith})$. Moreover, $P$ is said to be a \emph{Markov chain} if it satisfies the (discrete-time) \emph{Markov property}, meaning that
\begin{align*}
	P(X_{n+1}=x_{n+1}\,&\vert\,X_0=x_0,\ldots,X_{n}=x_n) \\
	&= P(X_{n+1}=x_{n+1}\,\vert\,X_n=x_{n})\,,
\end{align*}
for all $x_0,\ldots,x_{n+1}\in\states$ and $n\in\natswith$. If, additionally, it holds for all $x,y\in\states$ and $n\in\natswith$ that
\begin{equation*}
	P(X_{n+1}=y\,\vert\,X_n=x)=P(X_1=y\,\vert\,X_0=x)\,,
\end{equation*}
then $P$ is said to be a \mbox{(\emph{time-})\emph{homogeneous}} Markov chain. We use $\mathbb{P}_{\natswith}, \mathbb{P}_{\natswith}^{\mathrm{M}}$, and $\mathbb{P}_{\natswith}^{\mathrm{HM}}$ to denote, respectively, the set of \emph{all} discrete-time stochastic processes; the set of all discrete-time Markov chains; and the set of all discrete-time homogeneous Markov chains.


In the continuous-time case, we let $\Omega_{\realsnonneg}$ be the set of all \emph{cadlag} functions from $\realsnonneg$ to $\states$. A continuous-time stochastic process $P$ is a probability measure on $(\Omega_{\realsnonneg},\mathcal{F}_{\realsnonneg})$. The process $P$ is said to be a Markov chain if it satisfies the (continuous-time) Markov property,
\begin{align*}
	P(X_{t_{n+1}}=x_{t_{n+1}}\,&\vert\,X_{t_0}=x_{t_0},\ldots,X_{t_n}=x_{t_n}) \\
	&= P(X_{t_{n+1}}=x_{t_{n+1}}\,\vert\,X_{t_n}=x_{t_n})
\end{align*}
for all $x_{t_0},\ldots,x_{t_{n+1}}\in\states$, $t_0<\cdots< t_n\leq t_{n+1}\in\realsnonneg$, and all $n\in\natswith$. If, additionally, it holds that
\begin{equation*}
	P(X_s=y\,\vert\,X_t=x) = P(X_{s-t}=y\,\vert\,X_0=x)
\end{equation*}
for all $x,y\in\states$ and all $t,s\in\realsnonneg$ with $t\leq s$, then $P$ is said to be a \mbox{(time-)homogeneous} Markov chain. 
We use $\mathbb{P}_{\realsnonneg}, \mathbb{P}_{\realsnonneg}^{\mathrm{M}}$, and $\mathbb{P}_{\realsnonneg}^{\mathrm{HM}}$ to denote, respectively, the set of \emph{all} continuous-time stochastic processes; the set of all continuous-time Markov chains; and the set of all continuous-time homogeneous Markov chains.

We refer to~\citep{norris1998markov} for an excellent further introduction to discrete-time and continuous-time Markov chains.


\subsection{Transition Dynamics}\label{subsec:precise_dynamics}

Throughout this work, we make extensive use of operator-theoretic representations of the behavior of stochastic processes, and Markov chains in particular. The first reason for this is that such operators serve as a way to parameterize Markov chains. Moreover, they are also useful as a \emph{computational} tool, since they can often be used to express inferences of interest; see, e.g., Propositions~\ref{prop:precise_discr_system} and~\ref{prop:precise_cont_system} further on. We introduce the basic concepts below, and refer to e.g.~\citep{norris1998markov} for details.

A \emph{transition matrix} $T$ is a linear operator on $\gamblesX$ such that, for all $x\in\states$, it holds that $T(x,y)\geq 0$ for all $y\in\states$, and $\sum_{y\in\states}T(x,y)=1$. There is an important and well-known connection between Markov chains and transition matrices; for any discrete-time homogeneous Markov chain $P$, we can define the \emph{corresponding transition matrix} ${}^{P}T$ as
\begin{equation*}
	{}^{P}T(x,y)\coloneqq P(X_{1}=y\,\vert\,X_0=x)\quad\text{for all $x,y\in\states$.}
\end{equation*}
Since $P$ is a probability measure, we clearly have that ${}^{P}T$ is a transition matrix. Conversely, a given transition matrix $T$ uniquely determines a discrete-time homogeneous Markov chain $P$ with ${}^{P}T=T$, up to the specification of the initial distribution $P(X_0)$. For this reason, transition matrices are often taken as a crucial parameter to specify (discrete-time, homogeneous) Markov chains.

Analogously, for a (non-homogeneous) discrete-time Markov chain $P$, we might define a family $({}^PT_n)_{n\in\natswith}$ of \emph{time-dependent} corresponding transition matrices, with
\begin{equation*}
	{}^{P}T_n(x,y)\coloneqq P(X_{n+1}=y\,\vert\,X_n=x)\,,
\end{equation*}
for all $x,y\in\states$ and $n\in\natswith$. Conversely, any family $(T_n)_{n\in\natswith}$ of transition matrices uniquely determines a discrete-time Markov chain $P$ with ${}^{P}T_n=T_n$ for all $n\in\natswith$, again up to the specification of $P(X_0)$.

In the continuous-time setting, transition matrices are also of great importance. However, it will be instructive to first introduce rate matrices.
A \emph{rate matrix} $Q$ is a linear operator on $\gamblesX$ such that, for all $x\in\states$, it holds that $Q(x,y)\geq 0$ for all $y\in\states$ with $x\neq y$, and $\sum_{y\in\states}Q(x,y)=0$.

For any rate matrix $Q$ and any $t\in\realsnonneg$, the \emph{matrix exponential} $e^{Qt}$ of $Qt$ can be defined as~\citep{van2006study}
\begin{equation*}
	e^{Qt}\coloneqq \lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}Q\bigr)^n\,.
\end{equation*}
An alternative characterization is as the (unique) solution to the matrix ordinary differential equation~\citep{van2006study}
\begin{equation}\label{eq:matrix_exp_differential}
	\frac{\mathrm{d}}{\mathrm{d}\,s}e^{Qs} = Qe^{Qs}=e^{Qs}Q,\quad\text{with $e^{Q0}=I$.}
\end{equation}
For any $t,s\in\realsnonneg$ it holds that $e^{Q(t+s)}=e^{Qt}e^{Qs}$, and we immediately have $e^{Q0}=I$. The family $(e^{Qt})_{t\in\realsnonneg}$ is therefore called the \emph{semigroup} generated by~$Q$, and $Q$ is called the \emph{generator} of this semigroup. Moreover, for any rate matrix $Q$ and any $t\in\realsnonneg$, $e^{Qt}$ is a transition matrix~\citep[Thm 2.1.2]{norris1998markov}.
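
As a small illustration of these notions (an example of our own, not part of the results that follow), take $\states=\{a,b\}$ and, ordering the states as $(a,b)$, consider for some $\lambda,\mu\in\realsnonneg$ with $\lambda+\mu>0$ the rate matrix
\begin{equation*}
	Q = \begin{pmatrix} -\lambda & \lambda \\ \mu & -\mu \end{pmatrix}\,.
\end{equation*}
Writing $r\coloneqq\lambda+\mu$, one can verify directly that
\begin{equation*}
	e^{Qt} = \frac{1}{r}\begin{pmatrix} \mu+\lambda e^{-rt} & \lambda-\lambda e^{-rt} \\ \mu-\mu e^{-rt} & \lambda+\mu e^{-rt} \end{pmatrix}
\end{equation*}
solves Equation~\eqref{eq:matrix_exp_differential}: it reduces to $I$ at $t=0$, and differentiating, e.g., the top-left entry gives $-\lambda e^{-rt}$, which agrees with the corresponding entry of $Qe^{Qt}$. Since $0<e^{-rt}\leq 1$, all entries are non-negative and each row sums to one, so $e^{Qt}$ is indeed a transition matrix in this example.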

Now let us consider a continuous-time homogeneous Markov chain $P$, and define the corresponding transition matrix\footnote{Note that in continuous time, a transition matrix is always specified with respect to a transition-time interval, here $[0,t]$.} ${}^PT_t$ for all $t\in\realsnonneg$ and $x,y\in\states$ as
\begin{equation}\label{eq:trans_mat_continuous}
	{}^PT_t(x,y) \coloneqq P(X_t = y\,\vert\,X_0=x)\,.
\end{equation}
It turns out that there is then a unique rate matrix $\rateforp$ associated with $P$ such that ${}^PT_t=e^{\rateforp t}$ for all $t\in\realsnonneg$. By combining Equations~\eqref{eq:matrix_exp_differential} and~\eqref{eq:trans_mat_continuous}, we can identify $\rateforp$ as
\begin{equation*}
	\rateforp = \Bigl(\frac{\mathrm{d}}{\mathrm{d}\,t}{}^PT_t\Bigr) \bigg\vert_{t=0}\,.
\end{equation*}
As before, in the other direction we have that any fixed rate matrix $Q$ uniquely determines a continuous-time homogeneous Markov chain $P$ with $\rateforp=Q$, up to the specification of $P(X_0)$. For this reason, rate matrices are often used to specify (continuous-time, homogeneous) Markov chains.

Let us finally consider a (not-necessarily homogeneous) continuous-time Markov chain $P$. For any $t,s\in\realsnonneg$ with $t\leq s$, we can then define a transition matrix ${}^PT_t^s$ with, for all $x,y\in\states$, ${}^PT_t^s(x,y) \coloneqq P(X_s=y\,\vert\,X_t=x)$.
Under appropriate assumptions of differentiability, this induces a family $(\rateforp_t)_{t\in\realsnonneg}$ of rate matrices $\rateforp_t$, as
\begin{equation}\label{eq:time_dependent_rate}
	\rateforp_t = \Bigl(\frac{\mathrm{d}}{\mathrm{d}\,s}{}^PT_t^s\Bigr) \bigg\vert_{s=t}\,.
\end{equation}
In the converse direction we might try to reconstruct the transition matrices of $P$ by solving the matrix ordinary differential equation(s)
\begin{equation}\label{eq:nonhomogen_diffential_form}
	\frac{\mathrm{d}}{\mathrm{d}\,s}{}^PT_t^s = {}^PT_t^s\rateforp_s,\quad\text{with ${}^PT_t^t=I$.}
\end{equation}
By comparing with Equation~\eqref{eq:matrix_exp_differential}, we see that in the special case where $\rateforp_s$ does not depend on $s$---that is, where $P$ is homogeneous with $\rateforp_s=\rateforp$, say---we indeed obtain ${}^PT_t^s=e^{\rateforp(s-t)}$. However, in general the \emph{non-autonomous} system~\eqref{eq:nonhomogen_diffential_form} does not have such a closed-form solution, and we cannot move beyond this implicit characterization.



\subsection{Hitting Times}

We now have all the pieces to introduce the inference problem that is the subject of this work, \emph{viz}. the \emph{expected hitting times} of some non-empty set of states $A\subset\states$ with respect to a particular stochastic process. We take this set $A$ to be fixed for the remainder of this work.

In the discrete-time case, we consider the (extended real-valued)\footnote{We agree that $0(+\infty)=0$; $(+\infty) + (+\infty) = +\infty$; and, for any $c\in\reals$, $(+\infty)+c=+\infty$ and $c(+\infty)=+\infty$ if $c>0$.} function $\tau_{\natswith}:\Omega_{\natswith}\to\realsnonneg\cup\{+\infty\}$ given by
\begin{equation*}
	\tau_{\natswith}(\omega) \coloneqq \inf\bigl\{n\in\natswith\,:\,\omega(n)\in A\bigr\}\quad\text{for all $\omega\in\Omega_{\natswith}$.}
\end{equation*}
This captures the number of steps before a process $P$ ``hits'' any state in $A$.
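
For instance (an illustration of our own), suppose that $A=\{b\}$ and $a\in A^c$. If $\omega\in\Omega_{\natswith}$ satisfies $\omega(0)=\omega(1)=a$ and $\omega(2)=b$, then $\tau_{\natswith}(\omega)=2$. If instead $\omega(n)\notin A$ for all $n\in\natswith$, then the infimum is taken over the empty set and, under the usual convention $\inf\emptyset=+\infty$, we obtain $\tau_{\natswith}(\omega)=+\infty$.
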
The expected hitting time for a discrete-time process $P$ starting in $x\in\states$ is then defined as
\begin{equation*}
	\mathbb{E}_P\bigl[\tau_{\natswith}\,\vert\,X_0=x\bigr] \coloneqq \int_{\Omega_{\natswith}} \tau_{\natswith}(\omega)\,\mathrm{d}P(\omega\,\vert\,X_0=x)\,.
\end{equation*}
We use $\mathbb{E}_P\bigl[\tau_{\natswith}\,\vert\,X_0\bigr]$ to denote the extended real-valued function on $\states$ given by $x\mapsto \mathbb{E}_P\bigl[\tau_{\natswith}\,\vert\,X_0=x\bigr]$. When dealing with homogeneous Markov chains, this quantity has the following simple characterization:
\begin{proposition}{\citep[Thm 1.3.5]{norris1998markov}}\label{prop:precise_discr_system}
	Let $P$ be a discrete-time homogeneous Markov chain with corresponding transition matrix ${}^{P}T$. Then $h\coloneqq \mathbb{E}_P\bigl[\tau_{\natswith}\,\vert\,X_0\bigr]$ is the minimal non-negative solution to the linear system\footnote{Throughout, for any $f,g\in\gamblesX$, the quantity $fg$ is understood as the pointwise product between the functions $f$ and $g$.}\footnote{Strictly speaking this requires extending the domain of ${}^PT$ to extended-real valued functions, but we will shortly introduce some assumptions that obviate such an exposition.}
	\begin{equation*}
		h = \ind{A^c} + \ind{A^c} {}^{P}Th\,.
	\end{equation*}
\end{proposition}
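
To illustrate Proposition~\ref{prop:precise_discr_system} with a minimal example of our own, let $\states=\{a,b\}$ and $A=\{b\}$, and suppose that ${}^{P}T(a,b)=p$ and ${}^{P}T(a,a)=1-p$ for some $p\in(0,1]$. On $A$ the system reads $h(b)=0$, and on $A^c$ it reads
\begin{equation*}
	h(a) = 1 + (1-p)h(a) + p\,h(b) = 1 + (1-p)h(a)\,,
\end{equation*}
so that $h(a)=\nicefrac{1}{p}$. This is the mean of a geometric distribution with success probability $p$, as one would expect.
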
In the continuous-time case, the definition is analogous; we introduce a function $\tau_{\realsnonneg}:\Omega_{\realsnonneg}\to\realsnonneg\cup\{+\infty\}$ as
\begin{equation*}
	\tau_{\realsnonneg}(\omega) \coloneqq \inf\bigl\{t\in\realsnonneg\,:\,\omega(t)\in A\bigr\}\,\,\text{for all $\omega\in\Omega_{\realsnonneg}$.}
\end{equation*}
This function measures the time until a process ``hits'' any state in $A$ on a given sample path. The expected hitting time for a continuous-time process $P$ starting in $x\in\states$ is
\begin{equation*}
	\mathbb{E}_P\bigl[\tau_{\realsnonneg}\,\vert\,X_0=x\bigr] \coloneqq \int_{\Omega_{\realsnonneg}} \tau_{\realsnonneg}(\omega)\,\mathrm{d}P(\omega\,\vert\,X_0=x)\,.
\end{equation*}
We again use $\smash{\mathbb{E}_P\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr]}$ to denote the extended-real valued function on $\states$ given by $x\mapsto \smash{\mathbb{E}_P\bigl[\tau_{\realsnonneg}\,\vert\,X_0=x\bigr]}$. Also in this case, the characterization for homogeneous Markov chains is particularly simple:
\begin{proposition}{\citep[Thm 3.3.3]{norris1998markov}}\label{prop:precise_cont_system}
	Let $P$ be a continuous-time homogeneous Markov chain with rate matrix $\smash{\rateforp}$ such that $\smash{\rateforp}(x,x)\neq 0$ for all $x\in A^c$. Then $h\coloneqq \smash{\mathbb{E}_P\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr]}$ is the minimal non-negative solution to
	\begin{equation}\label{eq:prop:precise_cont_system}
		\ind{A}h = \ind{A^c} + \ind{A^c}\smash{\rateforp} h\,.
	\end{equation}
\end{proposition}
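
As a sketch of how Equation~\eqref{eq:prop:precise_cont_system} is used (the example is ours and purely illustrative), let $\states=\{a,b,c\}$ and $A=\{c\}$, and suppose that, with the states ordered as $(a,b,c)$, the rows of $\rateforp$ corresponding to $a$ and $b$ are $(-\lambda,\lambda,0)$ and $(\mu,-(\mu+\nu),\nu)$, with $\lambda,\nu\in\realspos$ and $\mu\in\realsnonneg$. On $A$, the system gives $h(c)=0$. On $A^c$ the left-hand side vanishes, which yields
\begin{align*}
	0 &= 1 - \lambda h(a) + \lambda h(b)\,, \\
	0 &= 1 + \mu h(a) - (\mu+\nu) h(b)\,.
\end{align*}
The first equation gives $h(a)=h(b)+\nicefrac{1}{\lambda}$, and substituting this into the second one yields
\begin{equation*}
	h(b) = \frac{\lambda+\mu}{\lambda\nu}
	\quad\text{and}\quad
	h(a) = \frac{\lambda+\mu+\nu}{\lambda\nu}\,.
\end{equation*}
Since $\lambda\nu>0$, this is the unique solution of these two equations, and it is non-negative. For $\mu=0$ it reduces to $h(a)=\nicefrac{1}{\lambda}+\nicefrac{1}{\nu}$, the sum of the expected holding times in $a$ and $b$.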

\section{Imprecise-Markov Chains}\label{sec:imp_markov}

Let us now introduce \emph{imprecise-Markov chains}~\citep{itip:stochasticprocesses,skulj2015efficient,krak2017imprecise}, which are the stochastic processes that we aim to study in this work. Their characterization is based on the theory of \emph{imprecise probabilities}~\citep{Walley:1991vk,augustin2013:itip}. 

We here adopt the ``sensitivity analysis'' interpretation of imprecise probabilities. This means that we represent an imprecise-Markov chain simply as a \emph{set} $\mathcal{P}$ of stochastic processes. Intuitively, the idea is that we collect in $\mathcal{P}$ all (traditional, ``precise'') stochastic processes that we deem to plausibly capture the dynamics of the underlying system of interest. Inferences with respect to $\mathcal{P}$ are defined using \emph{lower-} and \emph{upper} expectations, given respectively as
\begin{equation*}
	\underline{\mathbb{E}}_\mathcal{P}[\cdot\,\vert\,\cdot]\coloneqq \inf_{P\in\mathcal{P}}\mathbb{E}_P[\cdot\,\vert\,\cdot]
	\quad\text{and}\quad
	\overline{\mathbb{E}}_\mathcal{P}[\cdot\,\vert\,\cdot]\coloneqq \sup_{P\in\mathcal{P}}\mathbb{E}_P[\cdot\,\vert\,\cdot]\,.
\end{equation*}
So, their inferences represent \emph{robust}---i.e. conservative---and \emph{tight} lower- and upper bounds on inferences with respect to \emph{all} stochastic processes that we deem to be plausible.


\subsection{Sets of Processes \& Types}\label{subsec:imc_sets_types}

We already mentioned that an imprecise-Markov chain is essentially simply a set $\mathcal{P}$ of stochastic processes. Let us now consider how to define such sets.

We start by considering the discrete-time case; then, clearly, $\mathcal{P}$ will be a set of discrete-time processes. We will parameterize such a set with some non-empty set $\mathcal{T}$ of transition matrices. Our aim is then to include in $\mathcal{P}$ all processes that are in some sense ``compatible'' with $\mathcal{T}$.\footnote{We will not constrain the initial models $P(X_0)$ of the elements of $\mathcal{P}$, since in any case such a choice would not influence the inferences that we study in this work.} However, at this point we are faced with a choice about which \emph{type} of processes to include in this set, and these different choices lead to \emph{different types of imprecise-Markov chains}.

Arguably the conceptually simplest model is $\mathcal{P}^{\mathrm{HM}}_\mathcal{T}$, which contains all homogeneous Markov chains $P$ whose corresponding transition matrix is included in $\mathcal{T}$:
\begin{equation*}
	\mathcal{P}^{\mathrm{HM}}_\mathcal{T}\coloneqq \bigl\{ P\in\mathbb{P}^{\mathrm{HM}}_{\natswith}\,:\, {}^{P}T\in\mathcal{T} \bigr\}\,.
\end{equation*} 
However, we could instead consider $\mathcal{P}^\mathrm{M}_{\mathcal{T}}$, which is the set of all (not-necessarily homogeneous) Markov chains whose time-dependent transition matrices are contained in $\mathcal{T}$:
\begin{equation*}
	\mathcal{P}^{\mathrm{M}}_\mathcal{T}\coloneqq \bigl\{ P\in\mathbb{P}^{\mathrm{M}}_{\natswith}\,:\, {}^{P}T_n\in\mathcal{T}\,\text{for all $n\in\natswith$} \bigr\}\,.
\end{equation*} 
The last choice that we consider here is the set $\mathcal{P}^{\mathrm{I}}_{\mathcal{T}}$, which essentially contains \emph{all} discrete-time processes whose single-step transition dynamics are described by $\mathcal{T}$. Its characterization is more cumbersome since we have not expressed these general processes in terms of transition matrices, but we can say that it is the set of all $P\in\mathbb{P}_{\natswith}$ such that for all $n\in\natswith$ and all $x_0,\ldots,x_{n}\in\states$, there is some $T\in\mathcal{T}$ such that for all $y\in\states$ it holds that
\begin{equation*}
	P(X_{n+1}=y\,\vert\,X_0=x_0,\ldots,X_n=x_n) = T(x_n,y)\,.
\end{equation*}
This last type is called an imprecise-Markov chain under \emph{epistemic irrelevance}, whence the superscript `$\mathrm{I}$'.

Note that the three types $\mathcal{P}^\mathrm{HM}_{\mathcal{T}}, \mathcal{P}^\mathrm{M}_{\mathcal{T}}$, and $\mathcal{P}^\mathrm{I}_{\mathcal{T}}$ capture not only ``plausible'' variation in terms of parameter uncertainty---expressed through the set $\mathcal{T}$---but also variation in terms of the structural independence conditions that we consider! So, from an applied perspective, if someone is not sure whether the underlying system that they are studying is truly Markovian and/or time-homogeneous, they might choose to use different such sets in their analysis.

In the continuous-time case, we again proceed analogously. First, we fix a non-empty set $\rateset$ of rate matrices, which will be the parameter for our models. We then first consider the set $\mathcal{P}^{\mathrm{HM}}_\rateset$ of all homogeneous Markov chains whose rate matrix is included in $\rateset$:
\begin{equation*}
	\mathcal{P}^{\mathrm{HM}}_\rateset \coloneqq \bigl\{ P\in\mathbb{P}_{\realsnonneg}^{\mathrm{HM}}\,:\, \rateforp\in\rateset\bigr\}\,.
\end{equation*}
The other two types are constructed in analogy to the discrete-time case, but unfortunately we do not have the space for a complete exposition of their characterization. Instead we refer the interested reader to~\citep{krak2017imprecise,krak2021phd} for an in-depth study of these different types and comparisons between them; in what follows we limit ourselves to a largely intuitive specification. 

The model $\mathcal{P}^\mathrm{M}_\rateset$ is the set of all continuous-time (not-necessarily homogeneous) Markov chains whose transition dynamics are compatible with $\rateset$ at every point in time. This includes in particular all Markov chains $P$ satisfying the appropriate differentiability assumptions to meaningfully say that the time-dependent rate matrices $\rateforp_t$---as in Equation~\eqref{eq:time_dependent_rate}---are included in $\rateset$ for all $t\in\realsnonneg$. However, $\mathcal{P}^\mathrm{M}_\rateset$ also contains other processes that are not (everywhere) differentiable; see e.g.~\citep[Sec 4.6 and 5.2]{krak2021phd} for the technical details.

The most involved model to explain is again $\mathcal{P}^\mathrm{I}_\rateset$, which includes \emph{all} continuous-time processes whose time- and history-dependent transition dynamics can be described using elements of $\rateset$. It includes, but is not limited to, appropriately differentiable processes $P$ such that for all $n\in\natswith$, all $t_0<\cdots<t_n\in\realsnonneg$, and all $x_{t_0},\ldots,x_{t_{n}}\in\states$, there is some $Q\in\rateset$ such that for all $y\in\states$ it holds that
\begin{align*}
	\biggl(\frac{\mathrm{d}}{\mathrm{d}\,s}P(X_s=y\,\vert\,X_{t_0}=x_{t_0},\ldots,&X_{t_n}=x_{t_n})\biggr)\bigg\vert_{s=t_n} \\
	&= Q(x_{t_n},y)\,.
\end{align*}
We again refer to~\citep[Sec 4.6 and 5.2]{krak2021phd} for the technical details involving the additional elements of $\mathcal{P}^\mathrm{I}_\rateset$ that are not appropriately differentiable. Importantly, we note the nested structure~\citep[Prop 5.9]{krak2021phd}
\begin{equation*}
	\mathcal{P}^\mathrm{HM}_\rateset \subseteq \mathcal{P}^\mathrm{M}_\rateset \subseteq \mathcal{P}^\mathrm{I}_\rateset\,,
\end{equation*}
where the inclusions are strict provided $\rateset$ is not trivial.


For notational convenience, we will use identical sub- and superscripts to denote the corresponding lower- and upper expectations for any of these imprecise-Markov chains; e.g., we let $\underline{\mathbb{E}}^\mathrm{HM}_\mathcal{T}[\cdot\,\vert\,\cdot] \coloneqq \underline{\mathbb{E}}_{\mathcal{P}^\mathrm{HM}_\mathcal{T}}[\cdot\,\vert\,\cdot]$.


\subsection{Imprecise Transition Dynamics}\label{subsec:imprecise_dynamics}

Let us now introduce some machinery to describe the dynamics of imprecise-Markov chains. In particular, we here move from the set-valued parameters $\mathcal{T}$ and $\rateset$ used in Section~\ref{subsec:imc_sets_types}, to their dual representations; these are operators that can serve as computational tools. 

In Section~\ref{subsec:imc_sets_types}, we described discrete-time imprecise-Markov chains using non-empty sets $\mathcal{T}$ of transition matrices. With any such set, we can associate the corresponding \emph{lower-} and \emph{upper transition operators} $\underline{T}$ and $\overline{T}$ on $\gamblesX$, defined respectively as
\begin{equation*}
	\underline{T}f\coloneqq \inf_{T\in\mathcal{T}}Tf
	\quad\text{and}\quad
	\overline{T}f\coloneqq \sup_{T\in\mathcal{T}}Tf
	\quad\text{for all $f\in\gamblesX$.}
\end{equation*}
More generally, any operator $\underline{T}$ (resp. $\overline{T}$) on $\gamblesX$ is a \emph{lower} (resp. \emph{upper}) \emph{transition operator} if for all $f,g\in\gamblesX$, all $\lambda\in\realsnonneg$, and all $x\in\states$, it holds that~\citep{de2017limit}
\begin{enumerate}
	\item $\min_{y\in\states}f(y)\leq\underline{T}f(x)$ and $\overline{T}f(x)\leq \max_{y\in\states}f(y)$
	\item $\underline{T}f + \underline{T}g \leq \underline{T}(f+g)$ and $\overline{T}(f+g)\leq \overline{T}f+\overline{T}g$
	\item $\underline{T}(\lambda f)=\lambda\underline{T}f$ and $\overline{T}(\lambda f)=\lambda\overline{T}f$.
\end{enumerate} 
It should be noted that lower- and upper transition operators are conjugate, in that any $\underline{T}$ induces a corresponding upper transition operator $\overline{T}(\cdot)=-\underline{T}(-\cdot)$, and \emph{vice versa}. Moreover, any transition matrix $T$ is also a lower---and, by its linearity, upper---transition operator.
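
As a small illustration of these definitions (a hypothetical example introduced here for exposition, not taken from the cited references), suppose that $\states=\{a,b\}$ and consider the set $\mathcal{T}=\{T_p\,:\,p\in[0.1,0.3]\}$, where $T_p(a,a)=1-p$, $T_p(a,b)=p$, $T_p(b,a)=0$, and $T_p(b,b)=1$. For $f=\ind{b}$ we have $T_pf(a)=p$ and $T_pf(b)=1$, and therefore
\begin{equation*}
	\underline{T}\ind{b}(a)=\inf_{p\in[0.1,0.3]}p=0.1
	\quad\text{and}\quad
	\overline{T}\ind{b}(a)=\sup_{p\in[0.1,0.3]}p=0.3\,,
\end{equation*}
while $\underline{T}\ind{b}(b)=\overline{T}\ind{b}(b)=1$. Conjugacy can be checked directly in this example, since $-\underline{T}(-\ind{b})(a)=-\inf_{p\in[0.1,0.3]}(-p)=0.3=\overline{T}\ind{b}(a)$.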

It is easily verified that the lower- and upper transition operators corresponding to a given non-empty set $\mathcal{T}$ are, indeed, lower- and upper transition operators. Conversely, with a given lower transition operator $\underline{T}$, we can associate the set of transition matrices that \emph{dominate} it, in the sense that
\begin{equation*}
	\mathcal{T}_{\underline{T}}\coloneqq \bigl\{ T\,:\, \text{$T$ a trans. mat.},\,Tf\geq \underline{T}f\,\text{for all $f\in\gamblesX$}\bigr\}\,.
\end{equation*}
This set satisfies the following important properties:
\begin{proposition}{\citep[Sec 3.4]{krak2021phd}}\label{prop:duality_trans}
	Let $\underline{T}$ be a lower transition operator with conjugate upper transition operator $\overline{T}(\cdot)=-\underline{T}(-\cdot)$ and dominating set of transition matrices $\mathcal{T}_{\underline{T}}$. Then $\mathcal{T}_{\underline{T}}$ is a non-empty, closed, and convex set of transition matrices that has separately specified rows,\footnote{A set $\mathcal{M}$ of matrices is said to have \emph{separately specified rows} if, intuitively, it is closed under the row-wise recombination of its elements; see e.g.~\citep{itip:stochasticprocesses} for details.} and for all $f\in\gamblesX$ it holds that $\underline{T}f = \inf_{T\in\mathcal{T}_{\underline{T}}} Tf$ and $\overline{T}f = \sup_{T\in\mathcal{T}_{\underline{T}}} Tf$.
	%\begin{equation*}
	%	\underline{T}f = \inf_{T\in\mathcal{T}_{\underline{T}}} Tf
	%	\quad\text{and}\quad
	%	\overline{T}f = \sup_{T\in\mathcal{T}_{\underline{T}}} Tf\,.
	%\end{equation*}
	Moreover, for all $f\in\gamblesX$ there is some $T\in\mathcal{T}_{\underline{T}}$ such that $Tf=\underline{T}f$, and there is some---possibly different---$T\in\mathcal{T}_{\underline{T}}$ such that $Tf=\overline{T}f$.	
\end{proposition}
Notably, there is a one-to-one relation between non-empty sets of transition matrices that are closed and convex and have separately specified rows, and lower (or upper) transition operators: if $\underline{T}$ is the lower transition operator for the set $\mathcal{T}$, and if $\mathcal{T}$ satisfies these properties, then $\mathcal{T}=\mathcal{T}_{\underline{T}}$~\citep[Cor 3.38]{krak2021phd}. Hence these objects may serve as dual representations for each other. 

One reason that this is important is the use of $\underline{T}$ as a computational tool; under the conditions of this duality it holds that for any function $f\in\gamblesX$ and any $n\in\natswith$, we can write~\citep{itip:stochasticprocesses}
\begin{equation*}
	\underline{\mathbb{E}}^\mathrm{I}_{\mathcal{T}}[f(X_n)\vert X_0=x] = \underline{\mathbb{E}}^\mathrm{M}_{\mathcal{T}}[f(X_n)\vert X_0=x] = \underline{T}^nf(x)\,,
\end{equation*}
where $\underline{T}$ is the lower transition operator for $\mathcal{T}$. This reduces the problem of computing such inferences for the imprecise-Markov chains $\mathcal{P}^\mathrm{M}_\mathcal{T}$ and $\mathcal{P}^\mathrm{I}_\mathcal{T}$ to solving $n$ independent \emph{linear} optimization problems over $\mathcal{T}$: first compute $f_1\coloneqq\underline{T}f$, then compute $f_2\coloneqq\underline{T}\,f_1=\underline{T}^2f$, and so forth. Note that this method in general only yields a conservative bound on the corresponding inference for $\mathcal{P}^\mathrm{HM}_\mathcal{T}$, as the minimizers $T_k$ that attain $T_kf_{k-1}=\underline{T} f_{k-1}$ may be different at each step.
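
To make this recursion concrete, consider the following two-state illustration, which is an assumption made here for exposition and not an example from the cited works. Let $\states=\{a,b\}$ and let $\mathcal{T}$ consist of all transition matrices $T_p$ with rows $T_p(a,\cdot)=(1-p,\,p)$ and $T_p(b,\cdot)=(0,\,1)$ for $p\in[0.1,0.3]$; this set is non-empty, closed, and convex, and has separately specified rows. Starting from $f=\ind{b}$, the first step gives $f_1(a)=\inf_{p}p=0.1$ and $f_1(b)=1$. The second step gives $f_2(b)=1$ and
\begin{equation*}
	f_2(a)=\inf_{p\in[0.1,0.3]}\bigl((1-p)\,0.1+p\bigr)=\inf_{p\in[0.1,0.3]}\bigl(0.1+0.9\,p\bigr)=0.19\,,
\end{equation*}
so that $\underline{\mathbb{E}}^\mathrm{M}_{\mathcal{T}}[\ind{b}(X_2)\vert X_0=a]=\underline{T}^2\ind{b}(a)=0.19$. In this particular example the infimum is attained at $p=0.1$ in both steps, so the value coincides with $1-0.9^2=0.19$ for the homogeneous Markov chain with transition matrix $T_{0.1}$; in general no such coincidence is guaranteed.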

We next consider the dynamics in the continuous-time setting. We proceed analogously to the above: we first consider a non-empty and bounded\footnote{In the induced operator norm.} set $\rateset$ of rate matrices. With this set, we then associate the corresponding \emph{lower-} and \emph{upper rate operators} $\lrate$ and $\urate$ on $\gamblesX$, defined as
\begin{equation*}
	\lrate f\coloneqq \inf_{Q\in\rateset}Qf
	\quad\text{and}\quad
	\urate f\coloneqq \sup_{Q\in\rateset}Qf
	\quad\text{for all $f\in\gamblesX$.}
\end{equation*}
More generally, any operator $\lrate$ (resp. $\urate$) on $\gamblesX$ is a \emph{lower} (resp. \emph{upper}) \emph{rate operator} if for all $f,g\in\gamblesX$, all $\lambda\in\realsnonneg$ and $\mu\in\reals$, and all $x,y\in\states$ with $y\neq x$, it holds that~\citep{de2017limit}
\begin{enumerate}
	\item $\lrate(\mu\ones)(x)=0$ and $\urate(\mu\ones)(x)=0$
	\item $\lrate\ind{y}(x)\geq 0$ and $\urate\ind{y}(x)\geq 0$
	\item $\lrate f + \lrate g \leq \lrate(f+g)$ and $\urate(f+g)\leq \urate f+\urate g$
	\item $\lrate(\lambda f)=\lambda\lrate f$ and $\urate(\lambda f)=\lambda\urate f$
\end{enumerate}
As before, such objects are conjugate, in that if $\lrate$ is a lower rate operator, then $\smash{\urate}(\cdot)=-\smash{\lrate}(-\cdot)$ is an upper rate operator. Moreover, any rate matrix $Q$ is also a lower (and upper) rate operator.
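
As a hypothetical illustration (an assumption introduced here purely for exposition), let $\states=\{a,b\}$ and $\rateset=\{Q_r\,:\,r\in[1,2]\}$, where $Q_r(a,a)=-r$, $Q_r(a,b)=r$, and $Q_r(b,a)=Q_r(b,b)=0$. Then $Q_rf(a)=r\bigl(f(b)-f(a)\bigr)$ and $Q_rf(b)=0$, so that
\begin{equation*}
	\lrate f(a)=\min\bigl\{f(b)-f(a),\,2\bigl(f(b)-f(a)\bigr)\bigr\}
	\quad\text{and}\quad
	\urate f(a)=\max\bigl\{f(b)-f(a),\,2\bigl(f(b)-f(a)\bigr)\bigr\}\,,
\end{equation*}
while $\lrate f(b)=\urate f(b)=0$. In particular, $\lrate(\mu\ones)=0$ and $\lrate\ind{b}(a)=1\geq 0$, in line with the first two properties above. The example also shows that $\lrate$ is not linear, since $\lrate(-f)(a)\neq-\lrate f(a)$ whenever $f(a)\neq f(b)$.
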
There is again a duality between lower (or upper) rate operators, and sets of rate matrices. For fixed $\lrate$ and with the dominating set of rate matrices $\rateset_{\lrate}$ defined as
\begin{equation*}
	\rateset_{\lrate}\coloneqq \bigl\{ Q\,:\, \text{$Q$ a rate mat.},\,Qf\geq \lrate f\,\text{for all $f\in\gamblesX$}\bigr\}\,,
\end{equation*}
we have the following result:
\begin{proposition}{\citep[Sec 6.2]{krak2021phd}}\label{prop:duality_rate}
	Let $\underline{Q}$ be a lower rate operator with conjugate upper rate operator $\overline{Q}(\cdot)=-\underline{Q}(-\cdot)$ and dominating set of rate matrices $\mathcal{Q}_{\underline{Q}}$. Then $\mathcal{Q}_{\underline{Q}}$ is a non-empty, compact, and convex set of rate matrices that has separately specified rows, and for all $f\in\gamblesX$ it holds that $\underline{Q}f = \inf_{Q\in\mathcal{Q}_{\underline{Q}}} Qf$ and $\overline{Q}f = \sup_{Q\in\mathcal{Q}_{\underline{Q}}} Qf$.
	%\begin{equation*}
	%	\underline{Q}f = \inf_{Q\in\mathcal{Q}_{\underline{Q}}} Qf
	%	\quad\text{and}\quad
	%	\overline{Q}f = \sup_{Q\in\mathcal{Q}_{\underline{Q}}} Qf\,.
	%\end{equation*}
	Moreover, for all $f\in\gamblesX$ there is some $Q\in\mathcal{Q}_{\underline{Q}}$ such that $Qf=\underline{Q}f$, and there is some---possibly different---$Q\in\mathcal{Q}_{\underline{Q}}$ such that $Qf=\overline{Q}f$.	
\end{proposition}
Now fix any lower rate operator $\lrate$ and any $t\in\realsnonneg$, and let
\begin{equation}\label{eq:lower_rate_limit}
	e^{\lrate t}\coloneqq \lim_{n\to+\infty}\bigl(I+\nicefrac{t}{n}\lrate\bigr)^n\,.
\end{equation}
The operator $e^{\lrate t}$ is then a lower transition operator~\citep{de2017limit}, and the family $(e^{\lrate t})_{t\in\realsnonneg}$ is a semigroup of lower transition operators; it satisfies $e^{\lrate(t+s)}=e^{\lrate t}e^{\lrate s}$ for all $t,s\in\realsnonneg$, and $e^{\lrate 0}=I$. The analogous construction with an upper rate operator $\urate$ instead generates a semigroup $(e^{\urate t})_{t\in\realsnonneg}$ of upper transition operators.
When $\lrate$ and $\urate$ are taken with respect to the same set $\rateset$, these semigroups satisfy, for all $t\in\realsnonneg$, $f\in\gamblesX$, and $Q\in\rateset$,
\begin{equation}\label{eq:semigroup_domination}
	e^{\lrate t}f \leq e^{Qt}f \leq e^{\urate t}f\,.
\end{equation}
Here the importance again derives from the use as a computational tool; under the conditions of duality between $\rateset$ and $\lrate$, we have for any $f\in\gamblesX$ and any $t\in\realsnonneg$ that~\citep{skulj2015efficient,krak2017imprecise}
\begin{equation*}
	\underline{\mathbb{E}}^\mathrm{I}_{\mathcal{Q}}[f(X_t)\vert X_0=x] = \underline{\mathbb{E}}^\mathrm{M}_{\mathcal{Q}}[f(X_t)\vert X_0=x] = e^{\lrate t}f(x)\,.
\end{equation*}
Hence such inferences can be numerically computed by approximating~\eqref{eq:lower_rate_limit} with a finite choice of $n$, and then solving $n$ independent linear optimization problems over $\rateset$. Error bounds for this scheme are available in the literature~\citep{skulj2015efficient,krak2017imprecise,erreygers2021phd}.
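
The following sketch illustrates this approximation on a hypothetical two-state model; the model is an assumption made here and does not appear in the cited references. Let $\states=\{a,b\}$ and let $\rateset$ consist of the rate matrices $Q_r$ with $Q_r(a,a)=-r$, $Q_r(a,b)=r$, and $Q_r(b,a)=Q_r(b,b)=0$, for $r\in[1,2]$. For any $g$ with $g(b)=1$ and $g(a)\leq 1$ we have $\lrate g(a)=\inf_{r\in[1,2]}r\bigl(1-g(a)\bigr)=1-g(a)$ and $\lrate g(b)=0$. Hence, for $n\geq t$, every application of $I+\nicefrac{t}{n}\lrate$ preserves this form and multiplies $1-g(a)$ by $1-\nicefrac{t}{n}$. Starting from $g=\ind{b}$ we find
\begin{equation*}
	\bigl(I+\nicefrac{t}{n}\lrate\bigr)^n\ind{b}(a)=1-\bigl(1-\nicefrac{t}{n}\bigr)^n
	\quad\text{and hence}\quad
	e^{\lrate t}\ind{b}(a)=1-e^{-t}\,.
\end{equation*}
For $t=1$, the approximation with $n=2$ yields $0.75$ and the one with $n=10$ yields approximately $0.651$, whereas the limit is $1-e^{-1}\approx 0.632$.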


\subsection{Class Structure}

Let us now fix a set $\rateset$ of rate matrices that we will use in the remainder of this work. Throughout, let $\lrate$ and $\urate$ denote the lower- and upper rate operators associated with $\rateset$. We impose several standard regularity conditions on this set: we assume that $\rateset$ is non-empty, compact, convex, and that it has separately specified rows. These are common assumptions that are imposed to ensure the duality between $\rateset$ and $\lrate$, which in turn guarantees that inferences with the induced imprecise-Markov chains remain well-behaved, as well as analytically (and, often, computationally) tractable.

We now have all the pieces to start studying the inference problem that is the subject of this work: the \emph{lower-} and \emph{upper expected hitting times} of the set $A\subset\states$ for \emph{continuous-time imprecise-Markov chains described by} $\rateset$.

Before we begin, let us impose two additional conditions on the dynamics of the system. 
\begin{assumption}\label{ass:absorbing}
	We assume that all states in $A$ are \emph{absorbing}, which is equivalent to requiring that $Q(x,x)=0$ for all $Q\in\rateset$ and all $x\in A$.
\end{assumption}
Note that this does not influence the inferences in which we are interested, since those only deal with behavior at times \emph{before} states in $A$ are reached. However, imposing this explicitly substantially simplifies the analysis.

Next, we assume that the set $A$ is \emph{lower reachable} from any state $x\in A^c$~\citep{de2017limit}. This means that we can construct a sequence $x_1,\ldots,x_{n+1}\in\states$ starting in any $x_1\in A^c$ and ending in some $x_{n+1}\in A$ such that, for all $k=1,\ldots,n$, it holds that $\lrate\ind{x_{k+1}}(x_k)>0$. This is equivalent~\citep{de2017limit} to
\begin{assumption}\label{ass:reachable}
	We assume $e^{\lrate t}\ind{A}(x)>0$ for all $t\in\realspos$ and all $x\in A^c$.
\end{assumption} 
Essentially, this means that for all elements of our imprecise-probabilistic models the probability of eventually hitting $A$ is bounded away from zero.
This ensures that the expected hitting times remain bounded for all $P\in\mathcal{P}^{\mathrm{I}}_{\rateset}$, so that we can ignore any extended real-valued analysis.  It also implies that for all $Q\in\rateset$ we have that $Q(x,x)\neq 0$ for all $x\in A^c$, which is relevant to meet the precondition of Proposition~\ref{prop:precise_cont_system}.
As a practical point, \citet{de2017limit} gives an algorithm to check whether a given set $\rateset$ satisfies this condition. 

On a technical level, Assumption~\ref{ass:reachable} is the crucial one for our results, and---unlike with Assumption~\ref{ass:absorbing}---it cannot really be ignored in practice. However, based on earlier work by~\citet{krak2019hitting} in the discrete-time setting, we hope in the future to strengthen our results to hold without this assumption.


\section{Subspace Dynamics}\label{subsec:subspace_dynamics}

In the context of hitting times, the interesting behavior of a process actually occurs \emph{before} it has reached a target state in~$A$. Hence it will be useful to introduce some machinery to study the transition dynamics as it relates to the states $A^c$.

To introduce the notation in a general way, choose any non-empty $\mathcal{Y}\subset\states$. Then for any $f\in\gamblesX$, let $f\vert_{\mathcal{Y}}\in\gamblesY{\mathcal{Y}}$ denote the restriction of $f$ to $\mathcal{Y}$. Conversely, for any $f\in\gamblesY{\mathcal{Y}}$, let $f\upX\in\gamblesX$ denote the unique extension of $f$ to $\states$ that satisfies $f(x)=0$ for all $x\in\states\setminus\mathcal{Y}$. Moreover, for any operator $M$ on $\gamblesX$, we define the operator $M\vert_{\mathcal{Y}}$ on $\gamblesY{\mathcal{Y}}$ as
\begin{equation*}
	M\vert_{\mathcal{Y}}f \coloneqq \bigl(M(f\upX)\bigr)\vert_{\mathcal{Y}}\quad\quad\text{for all $f\in\gamblesY{\mathcal{Y}}$.}
\end{equation*}
This somewhat verbose notation is perhaps most easily understood when $M$ is a linear operator, i.e. a matrix. In that case, $M\vert_{\mathcal{Y}}$ is simply the $\abs{\mathcal{Y}}\times\abs{\mathcal{Y}}$ sub-matrix of $M$ on the coordinates in $\mathcal{Y}$. The definition above allows us to extend this notion also to non-linear operators, and to lower- and upper transition and rate operators, specifically.
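
For instance, under the purely illustrative assumption that $\states=\{a,b,c\}$ and $\mathcal{Y}=\{a,b\}$, a function $f$ on $\mathcal{Y}$ is extended by setting $f\upX(c)=0$. For a matrix $M$ and any $x\in\mathcal{Y}$ we then get $M\vert_{\mathcal{Y}}f(x)=M(x,a)f(a)+M(x,b)f(b)$, that is,
\begin{equation*}
	M=\begin{pmatrix}
		M(a,a) & M(a,b) & M(a,c)\\
		M(b,a) & M(b,b) & M(b,c)\\
		M(c,a) & M(c,b) & M(c,c)
	\end{pmatrix}
	\quad\text{yields}\quad
	M\vert_{\mathcal{Y}}=\begin{pmatrix}
		M(a,a) & M(a,b)\\
		M(b,a) & M(b,b)
	\end{pmatrix}.
\end{equation*}
The zero-extension removes the contribution of the column indexed by $c$, and the restriction discards the row indexed by $c$.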




Now for any rate matrix $Q\in\rateset$, we call $G\coloneqq Q\resAc$ its corresponding \emph{subgenerator}. For any $t\in\realsnonneg$, we then define $e^{Gt}\coloneqq e^{Qt}\resAc$. We have the following result:
\begin{proposition}\label{prop:subsemigroup_precise}
	Fix $Q\in\rateset$ and let $G$ be its subgenerator. Then $e^{Gt} = \lim_{n\to+\infty} \bigl(I+\nicefrac{t}{n}G\bigr)^n$ for all $t\in\realsnonneg$.
	Moreover, the family $(e^{Gt})_{t\in\realsnonneg}$ is a semigroup.
\end{proposition}

Analogously, we define $\lsubgen\coloneqq\lrate\resAc$ and $\usubgen\coloneqq \urate\resAc$ to be the \emph{lower-} and \emph{upper subgenerators} corresponding to $\lrate$ and $\urate$, respectively. We also let $e^{\lsubgen t}\coloneqq e^{\lrate t}\resAc$ and $e^{\usubgen t}\coloneqq e^{\urate t}\resAc$. Perhaps unsurprisingly, we then have:
\begin{proposition}\label{prop:subsemigroup_imprecise}
	It holds that $e^{\lsubgen t} = \lim_{n\to+\infty} \bigl(I+\nicefrac{t}{n}\lsubgen\bigr)^n$ and $e^{\usubgen t} = \lim_{n\to+\infty} \bigl(I+\nicefrac{t}{n}\usubgen\bigr)^n$ for all $t\in\realsnonneg$. Moreover, the families $(e^{\lsubgen t})_{t\in\realsnonneg}$, $(e^{\usubgen t})_{t\in\realsnonneg}$ are semigroups.
\end{proposition}

Our Assumption~\ref{ass:reachable} implies the norm bound:
\begin{proposition}\label{prop:upper_subsemigroup_contractive}
	For any $t>0$, it holds that $\norm{\smash{e^{\usubgen t}}}<1$.
\end{proposition}

It is a straightforward consequence of the use of the supremum norm, together with Equation~\eqref{eq:semigroup_domination} and the fact that $e^{Qt}$ and $e^{\urate t}$ are (upper) transition operators, that also $\norm{\smash{e^{Gt}}} \leq \norm{\smash{e^{\usubgen t}}} <1$ for all $t\in\realspos$. Hence by the semigroup property we immediately have that $\lim_{t\to+\infty}\norm{\smash{e^{Gt}}}=0$. This also implies the following well-known result.
\begin{proposition}{\citep[Thm IV.1.4]{taylor1958introduction}}\label{prop:resolvent_existence}
	For any $Q\in\rateset$ with subgenerator $G$, and all $t>0$, the inverse operator $(I-e^{G t})^{-1}$ exists, and $(I-e^{G t})^{-1}=\sum_{k=0}^{+\infty}e^{Gtk}$.
\end{proposition}
This allows us to characterize hitting times for discrete-time homogeneous Markov chains whose transition matrix is given by $e^{Qt}$, as follows.
\begin{proposition}\label{prop:discrete_precise_by_inverse}
	Choose any $Q\in\rateset$, let $G$ be its subgenerator, and fix any $\Delta>0$. Let $P\in\mathbb{P}^{\mathrm{HM}}_{\natswith}$ be such that ${}^{P}T=e^{Q\Delta}$. Then the expected hitting times $h\coloneqq \mathbb{E}_P[\tau_{\natswith}\,\vert\,X_0]$ satisfy $h\resAc = (I-e^{G\Delta})^{-1}\ones$ and $h(x)=0$ for all $x\in A$.
\end{proposition}
\begin{proof}
	By Proposition~\ref{prop:precise_discr_system}, in $x\in A^c$ we have that
	\begin{equation*}
		h(x) = \ind{A^c}(x) + \ind{A^c}(x) e^{Q\Delta}h(x) = 1 + e^{Q\Delta}h(x)\,.
	\end{equation*}
	Conversely, it is immediate from the definition that $h(x)=0$ for all $x\in A$. This implies that $h=(h\resAc)\upX$, and hence
	\begin{equation*}
		h\resAc = \ones + \bigl(e^{Q\Delta}(h\resAc)\upX\bigr)\resAc = \ones + e^{G\Delta}h\resAc\,.
	\end{equation*}
	Re-ordering terms we have $(I - e^{G\Delta})h\resAc = \ones$. Now use Proposition~\ref{prop:resolvent_existence} and multiply with $(I-e^{G\Delta})^{-1}$.
\end{proof}
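
As a sanity check on a hypothetical example (an assumption introduced here for illustration), suppose that $\states=\{a,b\}$, $A=\{b\}$, and that $\rateset$ contains the rate matrix $Q$ with $Q(a,a)=-r$, $Q(a,b)=r$, and $Q(b,a)=Q(b,b)=0$ for some $r>0$. Then $G=-r$ is a $1\times 1$ matrix and $e^{G\Delta}=e^{Q\Delta}(a,a)=e^{-r\Delta}$, so Propositions~\ref{prop:resolvent_existence} and~\ref{prop:discrete_precise_by_inverse} give
\begin{equation*}
	h(a)=\bigl(1-e^{-r\Delta}\bigr)^{-1}=\sum_{k=0}^{+\infty}e^{-r\Delta k}\,.
\end{equation*}
This agrees with the mean of a geometric distribution with success probability $1-e^{-r\Delta}$, which is the probability that the discretized chain leaves $a$ in any given step.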

We need the following observation:
\begin{lemma}\label{lemma:subgen_negative_eigen}
	Consider any $Q\in\rateset$ with subgenerator $G$, and let $\sigma(G)$ be the set of eigenvalues of $G$. Then $\mathrm{Re}\,\lambda < 0$ for all $\lambda\in\sigma(G)$.
\end{lemma}

This implies that $0\notin\sigma(G)$, and so we have:
\begin{corollary}\label{cor:subgen_inverse}
	For any $Q\in\rateset$ with subgenerator $G$, the inverse operator $G^{-1}$ exists.
\end{corollary}


This allows us to characterize hitting times for continuous-time homogeneous Markov chains:
\begin{proposition}\label{prop:cont_precise_by_inverse}
	Choose any $Q\in\rateset$, let $G$ be its subgenerator, and let $P\in\mathbb{P}_{\realsnonneg}^{\mathrm{HM}}$ with $\rateforp=Q$. Then the expected hitting times $h\coloneqq \mathbb{E}_P[\tau_{\realsnonneg}\,\vert\,X_0]$ satisfy $h\resAc = -G^{-1}\ones$ and $h(x)=0$ for all $x\in A$.
\end{proposition}
\begin{proof}
	By Proposition~\ref{prop:precise_cont_system}, in $x\in A^c$ we have that
	\begin{equation*}
		-1 = -\ind{A^c}(x) = \ind{A^c}(x) Qh(x) = Qh(x)\,.
	\end{equation*}
	Conversely, it is immediate from the definition that $h(x)=0$ for all $x\in A$. This implies $h = (h\resAc)\upX$, and hence
	\begin{equation*}
		Gh\resAc = \bigl(Q(h\resAc)\upX\bigr)\resAc = (Qh)\resAc = -\ones\,.
	\end{equation*}
	Now use Corollary~\ref{cor:subgen_inverse} and multiply with $G^{-1}$.
\end{proof}
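
The following worked example illustrates this characterization; the specific rate matrix is a hypothetical choice made here and is not taken from the cited references. Suppose that $\states=\{a,b,c\}$, $A=\{c\}$, and that $\rateset$ contains
\begin{equation*}
	Q=\begin{pmatrix}
		-2 & 1 & 1\\
		1 & -3 & 2\\
		0 & 0 & 0
	\end{pmatrix},
	\quad\text{so that}\quad
	G=\begin{pmatrix}
		-2 & 1\\
		1 & -3
	\end{pmatrix}
	\quad\text{and}\quad
	-G^{-1}=\frac{1}{5}\begin{pmatrix}
		3 & 1\\
		1 & 2
	\end{pmatrix}.
\end{equation*}
Proposition~\ref{prop:cont_precise_by_inverse} then yields $h(a)=\nicefrac{4}{5}$, $h(b)=\nicefrac{3}{5}$, and $h(c)=0$. One can verify that $Gh\resAc=-\ones$, since $-2\cdot\nicefrac{4}{5}+\nicefrac{3}{5}=-1$ and $\nicefrac{4}{5}-3\cdot\nicefrac{3}{5}=-1$. The eigenvalues of $G$ are $\nicefrac{(-5\pm\sqrt{5})}{2}$, which are both negative, in line with Lemma~\ref{lemma:subgen_negative_eigen}.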

\subsection{Quasicontractivity of Subspace Dynamics}\label{sec:quasicontractive}

We already know from Proposition~\ref{prop:upper_subsemigroup_contractive} that $\norm{\smash{e^{\usubgen t}}}<1$ for all $t\in\realspos$. Since $\smash{e^{\usubgen 0}}=I$ (because it is a semigroup), it follows that $\norm{\smash{e^{\usubgen t}}}\leq 1$ for all $t\in\realsnonneg$. A semigroup that satisfies this property is said to be \emph{contractive}. Moreover, Proposition~\ref{prop:upper_subsemigroup_contractive} together with the semigroup property implies that $\lim_{t\to+\infty}\norm{\smash{e^{\usubgen t}}}=0$. A semigroup that satisfies this property is said to be \emph{uniformly exponentially stable}, and in such a case the following result holds:
\begin{proposition}\label{prop:subsemigroup_ues}
	There are $M\geq 1$ and $\xi>0$ such that $\norm{\smash{e^{\usubgen t}}} \leq M e^{-\xi t}$ for all $t\in\realsnonneg$.
\end{proposition}
This result means that the norm $\norm{\smash{e^{\usubgen t}}}$ decays exponentially as $t$ grows. However, for technical reasons we require an exponentially decaying norm bound with $M=1$; if this holds the semigroup is said to be \emph{quasicontractive}.

It is not clear that obtaining such a bound is possible when $\norm{\smash{e^{\usubgen t}}}$ is induced by the supremum norm $\norm{\cdot}$ on $\gamblesAc$. However, we can get it by defining a \emph{different} norm $\norm{\cdot}_*$ on $\gamblesAc$. We then obtain the quasicontractivity with respect to the induced operator norm $\norm{\cdot}_*$. Because $\gamblesAc$ is finite-dimensional these norms are equivalent, and such a result suffices for our purposes. This re-norming trick is originally due to~\citet{feller1953generation}, and an analogous construction is commonly used for semigroups of linear operators; see e.g.~\citep[Thm 12.21]{renardyrogers2004intropde}.

So, consider the $\xi>0$ from Proposition~\ref{prop:subsemigroup_ues}, and let
\begin{equation}\label{eq:alternative_norm}
	\norm{f}_* \coloneqq \sup_{t\in\realsnonneg} \norm{e^{\xi t}e^{\usubgen t}\abs{f}}\quad\text{for all $f\in\gamblesAc$,}
\end{equation} 
where $\abs{f}$ denotes the elementwise-absolute value of $f$.

\begin{proposition}\label{prop:newnorm_is_norm}
	The map $f\mapsto\norm{f}_*$ is a norm on $\gamblesAc$.
\end{proposition}
Moreover, we have the desired result:
\begin{proposition}\label{prop:renormed_quasicontractive}
	We have $\norm{\smash{e^{\usubgen t}}}_* \leq e^{-\xi t}$ for all $t\in\realsnonneg$.
\end{proposition}
Finally, the same bound holds for precise models:
\begin{proposition}\label{prop:precise_quasicontractive}
	For any $Q\in\rateset$ with subgenerator $G$ it holds that $\norm{e^{Gt}}_*\leq e^{-\xi t}$ for all $t\in\realsnonneg$.
\end{proposition}


\section{Hitting Times as Limits}\label{sec:hits_as_limits}

We now have all the pieces to explain the proof of our main results. The trick will be to establish a connection between hitting times for continuous-time imprecise-Markov chains, and hitting times for \emph{discrete}-time imprecise-Markov chains, for which analogous results were previously established by~\citet{krak2019hitting}. 

We essentially just look at a discretized continuous-time Markov chain taking steps of some fixed size $\Delta>0$, derive the expected hitting time for this discrete-time Markov chain, and then take the limit $\Delta\to 0^+$. The main difficulty is in establishing that this converges uniformly for all elements in our sets of processes; this is why we went through the trouble of establishing quasicontractivity in Section~\ref{sec:quasicontractive}. 

To start, for any $Q\in\rateset$ and $\Delta>0$, let $h^Q_\Delta$ be the minimal non-negative solution to the linear system\footnote{Note the re-scaled term $\Delta\ind{A^c}$ on the right-hand side, which distinguishes this from the system in Proposition~\ref{prop:precise_discr_system}; this is required since the hitting times for discrete-time Markov chains are expressed in the \emph{number} of steps, and to pass to continuous-time we need to measure the size of these steps.}
\begin{equation}
	h^Q_\Delta = \Delta \ind{A^c} + \ind{A^c} e^{Q\Delta}h^Q_\Delta\,,
\end{equation}
and let $h^Q$ be the minimal non-negative solution to
\begin{equation}
	\ind{A}h^Q = \ind{A^c} + \ind{A^c}Qh^Q\,.
\end{equation}
Then we know from Propositions~\ref{prop:precise_discr_system} and~\ref{prop:precise_cont_system} that $\nicefrac{1}{\Delta}h^Q_\Delta$ represents the expected hitting times of a discrete-time homogeneous Markov chain with transition matrix $e^{Q\Delta}$, and that $h^Q$ does the same for a continuous-time homogeneous Markov chain with rate matrix $Q$. We now have the following result:
\begin{proposition}\label{prop:precise_uniform_limit}
	There are $\delta>0$ and $L>0$ such that $\norm{h^Q_\Delta - h^Q}<\Delta L\norm{h^Q}$ for all $\Delta\in(0,\delta)$ and all $Q\in\rateset$.
\end{proposition}

Since $\norm{h^Q}$ is bounded due to Proposition~\ref{prop:cont_precise_by_inverse}, we immediately obtain:
\begin{corollary}\label{cor:precise_discretisation_converges}
	We have $\lim_{\Delta\to 0^+}h^Q_\Delta=h^Q$ for all $Q\in\rateset$.
\end{corollary}
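
To illustrate this convergence on a hypothetical example (an assumption made here for exposition), suppose that $\states=\{a,b\}$, $A=\{b\}$, and that $Q\in\rateset$ satisfies $Q(a,a)=-r$, $Q(a,b)=r$, and $Q(b,a)=Q(b,b)=0$ for some $r>0$. Both systems force the solution to vanish in $b$. In $a$, the continuous-time system reads $0=1-r\,h^Q(a)$ and the discretized system reads $h^Q_\Delta(a)=\Delta+e^{-r\Delta}h^Q_\Delta(a)$, whence
\begin{equation*}
	h^Q(a)=\frac{1}{r}
	\quad\text{and}\quad
	h^Q_\Delta(a)=\frac{\Delta}{1-e^{-r\Delta}}=\frac{1}{r}+\frac{\Delta}{2}+O(\Delta^2)\,.
\end{equation*}
The discretization error in this example thus vanishes linearly in $\Delta$, which is consistent with the rate in Proposition~\ref{prop:precise_uniform_limit}. For $r=1$ and $\Delta=0.1$, for instance, $h^Q_\Delta(a)\approx 1.051$.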

We will now set up the analogous results for imprecise-Markov chains. First, let
\begin{equation}
	\underline{h}\coloneqq \inf_{Q\in\rateset} h^Q 
	\quad\text{and}\quad
	\overline{h}\coloneqq \sup_{Q\in\rateset} h^Q\,.
\end{equation}
Clearly, it follows from Proposition~\ref{prop:precise_cont_system} and the definition of lower- and upper expectations that these quantities represent the lower- and upper expected hitting times for the imprecise-Markov chain $\mathcal{P}_\rateset^{\mathrm{HM}}$, i.e. it holds that
\begin{equation*}
	\underline{h} = \underline{\mathbb{E}}_\rateset^{\mathrm{HM}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr]
	\quad\text{and}\quad
	\overline{h} = \overline{\mathbb{E}}_\rateset^{\mathrm{HM}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr]\,.
\end{equation*}
Now for any $\Delta>0$, let $\underline{h}_\Delta$ and $\overline{h}_\Delta$ denote the minimal non-negative solutions to the \emph{non-linear} systems
\begin{equation}\label{eq:lower_discretizes_system}
	\underline{h}_\Delta = \Delta\ind{A^c} + \ind{A^c} e^{\lrate \Delta}\underline{h}_\Delta
\end{equation}
and
\begin{equation}\label{eq:upper_discretizes_system}
	\overline{h}_\Delta = \Delta\ind{A^c} + \ind{A^c} e^{\urate \Delta}\overline{h}_\Delta\,.
\end{equation}
It was previously shown by~\citet{krak2019hitting} that, up to re-scaling with $\nicefrac{1}{\Delta}$, the quantities $\underline{h}_\Delta$ and $\overline{h}_\Delta$ represent the lower (resp. upper) expected hitting times of, identically, the discrete-time imprecise-Markov chains $\mathcal{P}_{\mathcal{T}_\Delta}^{\mathrm{HM}}$, $\mathcal{P}_{\mathcal{T}_\Delta}^{\mathrm{M}}$, and $\mathcal{P}_{\mathcal{T}_\Delta}^{\mathrm{I}}$ parameterized by the set $\mathcal{T}_{\Delta}$ of transition matrices that dominate $e^{\lrate\Delta}$. We now set out to prove an analogous result for continuous-time imprecise-Markov chains. We start with the following:
\begin{proposition}\label{prop:imprecise_limit}
	It holds that $\lim_{\Delta\to 0^+}\underline{h}_\Delta =  \underline{h}$ and $\lim_{\Delta\to 0^+}\overline{h}_\Delta = \overline{h}$.
\end{proposition}
This property allows us to leverage recent results by~\citet{erreygers2021phd} and~\citet{krak2021phd} regarding discrete and finite approximations of lower- and upper expectations in continuous-time imprecise-Markov chains, to obtain our first main result:
\begin{theorem}\label{thm:hitting_times_invariant}
	It holds that
	\begin{equation*}
		\underline{h} = \underline{\mathbb{E}}_\rateset^{\mathrm{HM}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr] = \underline{\mathbb{E}}_\rateset^{\mathrm{M}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr] = \underline{\mathbb{E}}_\rateset^{\mathrm{I}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr]\,,
	\end{equation*}
	and, moreover, that
	\begin{equation*}
		\overline{h} = \overline{\mathbb{E}}_\rateset^{\mathrm{HM}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr] = \overline{\mathbb{E}}_\rateset^{\mathrm{M}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr] = \overline{\mathbb{E}}_\rateset^{\mathrm{I}}\bigl[\tau_{\realsnonneg}\,\vert\,X_0\bigr]\,.
	\end{equation*}
\end{theorem}
Moreover, it follows relatively straightforwardly from Proposition~\ref{prop:imprecise_limit} that the lower- and upper expected hitting times for continuous-time imprecise-Markov chains satisfy an immediate generalization of the system that characterizes the expected hitting times for (precise) continuous-time homogeneous Markov chains. This is our second main result:
\begin{theorem}\label{thm:lower_upper_hitting_system}
	Let $\underline{h}$ and $\overline{h}$ denote the lower- and upper expected hitting times for any one of $\mathcal{P}^\mathrm{HM}_\rateset$, $\mathcal{P}^\mathrm{M}_\rateset$, or $\mathcal{P}^\mathrm{I}_\rateset$. Then $\underline{h}$ is the minimal non-negative solution to the non-linear system $\ind{A}\underline{h}=\ind{A^c} + \ind{A^c}\lrate\,\underline{\vphantom{Q}h}$, and $\overline{h}$ is the minimal non-negative solution to the non-linear system $\ind{A}\overline{h}=\ind{A^c} + \ind{A^c}\urate\,\overline{h}$.
\end{theorem}
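
The two systems can be checked by hand on a hypothetical example, which is an assumption introduced here for illustration only. Suppose that $\states=\{a,b\}$, $A=\{b\}$, and that $\rateset$ consists of the rate matrices $Q_r$ with $Q_r(a,a)=-r$, $Q_r(a,b)=r$, and $Q_r(b,a)=Q_r(b,b)=0$, for $r\in[1,2]$. This set is non-empty, compact, and convex, has separately specified rows, and satisfies Assumptions~\ref{ass:absorbing} and~\ref{ass:reachable}. By Proposition~\ref{prop:cont_precise_by_inverse} we have $h^{Q_r}(a)=\nicefrac{1}{r}$, so $\underline{h}(a)=\nicefrac{1}{2}$ and $\overline{h}(a)=1$, with $\underline{h}(b)=\overline{h}(b)=0$. Indeed,
\begin{equation*}
	\lrate\,\underline{\vphantom{Q}h}(a)=\inf_{r\in[1,2]}r\bigl(0-\nicefrac{1}{2}\bigr)=-1
	\quad\text{and}\quad
	\urate\,\overline{h}(a)=\sup_{r\in[1,2]}r\,(0-1)=-1\,,
\end{equation*}
so that both systems reduce to $0=1-1$ in $a$, and to $0=0$ in $b$. Note that the lower expected hitting time corresponds to the fastest rate $r=2$, and the upper expected hitting time to the slowest rate $r=1$.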

\section{Summary \& Conclusion}\label{sec:summary}

We have investigated the problem of characterizing expected hitting times for continuous-time imprecise-Markov chains. We have shown that under two relatively mild assumptions on the system's class structure (\emph{viz.} that the target states are absorbing, and can be reached from any non-target state) the corresponding lower (resp. upper) expected hitting time is the same for all three types of imprecise-Markov chains.

We have also demonstrated that these lower- and upper expected hitting times $\underline{h}$ and $\overline{h}$ satisfy the non-linear systems
\begin{equation*}
	\ind{A}\underline{h}=\ind{A^c} + \ind{A^c}\lrate\,\underline{\vphantom{Q}h}
	\quad\text{and}\quad
	\ind{A}\overline{h} = \ind{A^c} + \ind{A^c}\urate\,\overline{h}\,,
\end{equation*}
in analogy with the precise linear system~\eqref{eq:prop:precise_cont_system}. 
Indeed, we conclude that the lower- and upper expected hitting times for any of these three types of imprecise-Markov chains can be fully characterized as the unique \emph{minimal} non-negative solutions to these respective systems.

We aim to strengthen these results in future work to hold with fewer assumptions on the system's class structure.
%
%\begin{contributions} % will be removed in pdf for initial submission,
%	% so you can already fill it to test with the
%	% ‘accepted’ class option
%	I did the thing.
%\end{contributions}

\begin{acknowledgements} % will be removed in pdf for initial submission,
	% so you can already fill it to test with the
	% ‘accepted’ class option
	
	We would like to sincerely thank Jasper De Bock for many stimulating discussions on the subject of imprecise-Markov chains, and for pointing out a technical error in an earlier draft of this work. We are also grateful for the constructive feedback of three anonymous reviewers.
\end{acknowledgements}


\bibliography{krak_660}


\end{document}
