% \documentclass{uai2022} % for initial submission
\documentclass[accepted]{uai2022} % after acceptance, for a revised
                                    % version; also before submission to
                                    % see how the non-anonymous paper
                                    % would look like
%% There is a class option to choose the math font
% \documentclass[mathfont=ptmx]{uai2022} % ptmx math instead of Computer
                                         % Modern (has noticable issues)
% \documentclass[mathfont=newtx]{uai2022} % newtx fonts (improves upon
                                          % ptmx; less tested, no support)
% NOTE: Only keep *one* line above as appropriate, as it will be replaced
%       automatically for papers to be published. Do not make any other
%       change above this note for an accepted version.

%% Choose your variant of English; be consistent
\usepackage[american]{babel}
% \usepackage[british]{babel}

%% Some suggested packages, as needed:
\usepackage{natbib} % has a nice set of citation styles and commands
    \bibliographystyle{plainnat}
    \renewcommand{\bibsection}{\subsubsection*{References}}
\usepackage{mathtools} % amsmath with fixes and additions
% \usepackage{siunitx} % for proper typesetting of numbers and units
\usepackage{booktabs} % commands to create good-looking tables
\usepackage{tikz} % nice language for creating drawings and diagrams

\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{amsfonts}
\usepackage{algorithm}
\usepackage{algorithmic}
\usepackage{float}
%% Provided macros
% \smaller: Because the class footnote size is essentially LaTeX's \small,
%           redefining \footnotesize, we provide the original \footnotesize
%           using this macro.
%           (Use only sparingly, e.g., in drawings, as it is quite small.)

%% Self-defined macros
\newcommand{\swap}[3][-]{#3#1#2} % just an example
\newtheorem{definition}{Definition}
\newtheorem{theorem}{Theorem}

\title{PDQ-Net: Deep Probabilistic Dual Quaternion Network for Absolute Pose Regression on $SE(3)$}

% The standard author block has changed for UAI 2022 to provide
% more space for long author lists and allow for complex affiliations
%
% All author information is authomatically removed by the class for the
% anonymous submission version of your paper, so you can already add your
% information below.
%
% Add authors
%\author[1]{\href{mailto:<jj@example.edu>?Subject=Your UAI 2022 paper}{Jane~J.~von~O'L\'opez}{}}
\author[1]{Wenjie~Li}
\author[2]{Wasif~Naeem}
\author[1]{Jia~Liu}
\author[1]{Dequan~Zheng}
\author[1]{Wei~Hao}
\author[1]{Lijun~Chen}
%\author[3]{Further~Coauthor}
%\author[3,1]{Further~Coauthor}
% Add affiliations after the authors
\affil[1]{%
    Department of Computer Science and Technology\\
    Nanjing University\\
    Nanjing, China
}
\affil[2]{%
    School of Electronics, Electrical Engineering and Computer Science\\
    Queen's University Belfast\\
    Belfast, UK
}
%\affil[3]{%
%    Another Affiliation\\
%    Address\\
%    …
%  }
  
  \begin{document}
\maketitle

\begin{abstract}
  Accurate absolute pose regression is one of the key challenges in robotics and computer vision. Existing direct regression methods suffer from two limitations. First, noisy conditions such as poor illumination introduce uncertainty into the pose estimate. Second, the output n-dimensional feature vector in the Euclidean space $\mathbb{R}^n$ cannot be well mapped to the $SE(3)$ manifold. In this work, we propose a deep dual quaternion network that performs absolute pose regression on $SE(3)$. We first develop an antipodally symmetric probability distribution over the unit dual quaternions on $SE(3)$ to model uncertainties, and then propose an intermediary differentiable representation space to replace the final output pose, which avoids the mapping problem from $\mathbb{R}^n$ to $SE(3)$. In addition, we introduce a backpropagation method that accounts for the continuity and differentiability of the proposed intermediary space. Extensive experiments on the camera re-localization task on the Cambridge Landmarks and 7-Scenes datasets demonstrate that our method improves pose accuracy as well as robustness to uncertainty and ambiguity, compared to the state of the art.
\end{abstract}

\section{Introduction}\label{sec:intro}

Absolute pose estimation refers to inferring an object's pose (i.e., position and orientation) in 3D space from 2D input images, and is a long-standing problem in robotics \citep{murORB2,Wang2018EndtoendSP,9440682} and computer vision \citep{Kendall2015PoseNetAC,Deng2019PoseRBPFAR,turkoglu2021visual}.
%Accurate absolute pose regression on $SE(3)$ is an age-old problem in the fields of robotics \citep{murORB2,Wang2018EndtoendSP,9440682} and computer vision \citep{Kendall2015PoseNetAC,Deng2019PoseRBPFAR}. 
Early research focused on geometry-based approaches because of their reliability and accuracy in static environments. However, geometry-based approaches do not perform well in challenging scenarios such as poor illumination or textureless scenes.

In recent years, deep learning has provided an alternative way to regress the absolute pose. In these approaches, a proper output representation of the pose is particularly important when dealing with more complicated environments. A number of deep models output an n-dimensional feature vector that directly denotes the pose, where the rotation is typically represented by Euler angles, a unit quaternion, or a rotation matrix, and the translation is represented by a $3$-dimensional vector \citep{Wang2018EndtoendSP,sattler2019understanding}.

Despite these advances, learning-based techniques suffer from one or both of the following limitations. First, some methods cannot capture and model pose uncertainties in practical scenarios, which affects the estimation accuracy. Second, existing work \citep{Shotton2013SceneCR,sattler2019understanding,Xue2019BeyondTS} commonly takes an n-dimensional feature vector in the Euclidean space $\mathbb{R}^n$ as the output, which cannot be well mapped to the $SE(3)$ manifold because $SE(3)$ is not homeomorphic to $\mathbb{R}^n$ \citep{lavalle_2006}. For example, the angles $0$ and $2\pi$ map to the same rotation, which violates the one-to-one property required of a homeomorphism.

In this paper, we propose a probabilistic deep dual quaternion network that regresses the absolute pose on $SE(3)$ from a single RGB image. Compared to existing learning-based approaches, our model has three distinguishing features. First, we take pose uncertainties into consideration by introducing an antipodally symmetric distribution over the unit dual quaternions on $SE(3)$. In this way, the pose regression problem is converted into a deep probabilistic problem. Second, we estimate the pose indirectly by using an intermediary differentiable representation space as the output of our deep probabilistic model. The rotation and translation are then derived from the intermediary representation space by solving a quadratically constrained quadratic program (QCQP) and by modeling a Gaussian process, respectively, which avoids the issue of mapping from $\mathbb{R}^n$ to $SE(3)$. Third, we introduce a backpropagation method that accounts for the continuity and differentiability of the proposed intermediary space. Extensive experiments on the camera re-localization task on the Cambridge Landmarks and 7-Scenes datasets demonstrate that our method outperforms the state of the art in terms of pose accuracy and robustness.

In summary, our work makes the following contributions:
\begin{itemize}
    \item We develop an antipodally symmetric probability distribution over the unit dual quaternions on $SE(3)$ to model the pose uncertainty, which is a key factor in improving the accuracy of pose regression.
    \item We propose a deep dual quaternion model to regress the pose indirectly, which effectively addresses the mapping problem from $\mathbb{R}^n$ to $SE(3)$.
    Additionally, a backpropagation method is introduced for the proposed model.
    \item We evaluate our deep probabilistic model on the Cambridge Landmarks and 7-Scenes datasets. Extensive experimental results show that our method improves the accuracy of the pose as well as the ability to deal with uncertainty and ambiguity, compared to the state of the art.
    
    %robustness in capturing uncertainties, compared with the state-of-the-art.
    %\item We implement our deep probabilistic model to regress the relative pose of visual odometry on the KITTI dataset showing that our method not only outperforms the state-of-the-art in terms of pose accuracy but also shows an advantage on the robust performance in capturing pose uncertainties.
\end{itemize}

\section{Related Work}\label{sec:relatedwork}
Absolute pose estimation combined with deep learning has become an active topic in robotics and computer vision. The related techniques roughly fall into two categories: direct pose regression and indirect pose regression.

Direct pose regression aims to directly regress the absolute pose from sequential RGB images using well-designed end-to-end networks \citep{Wang2018EndtoendSP,clark2017vidloc,Kendall2015PoseNetAC,Shotton2013SceneCR,Chen2021WideBaselineRC}. Most of these approaches follow the same pipeline: features are extracted using a network such as PoseNet \citep{Kendall2015PoseNetAC} or BranchNet \citep{7989663}, and are then embedded into a high-dimensional vector that lies in the Euclidean space $\mathbb{R}^n$. \citet{sattler2019understanding} pointed out that this embedding typically corresponds to the output of the second-to-last layer in direct pose regression methods. The last layer performs a linear projection from the embedding space to the space of poses. However, these methods commonly suffer from one of two issues. First, pose uncertainty may degrade the accuracy of the predicted poses, since some methods are not robust enough to capture uncertainties \citep{7487679}. Second, the output n-dimensional feature vector lying in the Euclidean space $\mathbb{R}^n$ may not be well mapped to $SE(3)$ \citep{lavalle_2006}.

A strategy to overcome these problems is to regress the absolute pose indirectly, which is achieved by regressing an intermediary vector to replace the output feature vector \citep{deng2020deep,bui20206d}. 
However, it is usually hard to find such a representation space. \citet{Poursaeed2018DeepFM} introduced a Siamese model for uncalibrated cameras to regress a fundamental matrix that serves as an intermediary representation of camera poses, but it fails to capture pose uncertainties. Recently, deep probabilistic models have been developed by regressing essential parameters of a probability distribution. \citet{Gilitschenski2020Deep} introduced a deep Bingham model for object orientation estimation on $SO(3)$ by regressing the orthogonal matrix of the Bingham distribution, where pose uncertainties are modeled as a Bingham distribution. \citet{NEURIPS2020_33cc2b87} similarly developed a deep matrix Fisher distribution for object rotation estimation on $SO(3)$ by regressing the parameter matrix of the matrix Fisher distribution. However, it is generally hard to find such statistical distributions on the $SE(3)$ manifold to measure pose uncertainties.

To this end, we develop an antipodally symmetric probability distribution over the unit dual quaternions on $SE(3)$ to model pose uncertainties. Based on it, we present a deep probabilistic model to indirectly regress the absolute pose on $SE(3)$.


\section{Unit Dual Quaternion Distribution}\label{sec:unitdualquatdistro}

This section gives the definition of the unit dual quaternion distribution on $SE(3)$. For this purpose, we first briefly revisit the concept of dual quaternions and then give the description of the unit dual quaternion distribution.

\subsection{Dual Quaternion}\label{subsec:dualquaternion}
In this work, a quaternion $\mathbf{q}$ is defined as $\mathbf{q}=q_0+q_1\mathbf{i}+q_2\mathbf{j}+q_3\mathbf{k}$, where $\{\mathbf{i},\mathbf{j},\mathbf{k}\}$ is the standard basis of the three-dimensional Euclidean space $\mathbb{R}^3$. For convenience,
we use the vector $\mathbf{q}=[q_0,\mathbf{q}_{vec}]\in \mathbb{R}^4$ to denote the quaternion. The multiplication of two arbitrary quaternions can be written in matrix-vector form as
\begin{align*}
\mathbf{p} \odot \mathbf{q}= \mathbf{R}_{q} \mathbf{p}=\begin{bmatrix}
    q_0 & -\mathbf{q}_{vec}^{T} \\
    \mathbf{q}_{vec} & -[\mathbf{q}_{vec}]^{\times}+q_{0}\mathbf{I}_{3}
\end{bmatrix} \mathbf{p},
\end{align*}
where $[\mathbf{a}]^{\times}$ denotes the skew-symmetric matrix formed from the vector $\mathbf{a}$, and $\mathbf{I}_{3}$ refers to the $3\times 3$ identity matrix.
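
As a quick sanity check of this matrix form (a worked example added for illustration, assuming the usual convention $[\mathbf{a}]^{\times}\mathbf{b}=\mathbf{a}\times\mathbf{b}$), take $\mathbf{p}=\mathbf{i}=[0,1,0,0]^{T}$ and $\mathbf{q}=\mathbf{j}=[0,0,1,0]^{T}$, so that $q_0=0$ and $\mathbf{q}_{vec}=[0,1,0]^{T}$. Then
\begin{align*}
\mathbf{R}_{q}\mathbf{p} =
\begin{bmatrix}
    0 & 0 & -1 & 0 \\
    0 & 0 & 0 & -1 \\
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\end{align*}
which is $\mathbf{k}$, in agreement with the Hamilton rule $\mathbf{i}\mathbf{j}=\mathbf{k}$.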

The norm of a quaternion is defined as $\sqrt{\mathbf{q}\odot \mathbf{q}^{*}}$, with $\mathbf{q}^{*}=[q_0, -\mathbf{q}_{vec}]$ being the conjugate of $\mathbf{q}$. Quaternions with unit norm are called unit quaternions; they lie on the unit hypersphere $\mathbb{S}^{3} \subset \mathbb{R}^4$ and are used to represent pure rotations.

A dual quaternion consists of a real-part quaternion and a dual-part quaternion, and is a convenient tool for encapsulating rotation and translation \citep{leclercq20133d},
\begin{equation}
    \mathbf{v} = \mathbf{q}_r+\epsilon \mathbf{q}_d, \quad \epsilon\neq 0, \quad \epsilon^2=0,
\end{equation}
where $\mathbf{q}_r$ is the unit quaternion representing the rotation, and $\mathbf{q}_d$ is the dual-part quaternion that combines the rotation quaternion with the translation quaternion $\mathbf{q}_t=[0,t_x, t_y, t_z]^{T}$ through $\mathbf{q}_d=0.5\,\mathbf{q}_t\odot \mathbf{q}_r \in \mathbb{R}^4$.

Since the dual part $\mathbf{q}_d$ is orthogonal to the real part $\mathbf{q}_r$, we obtain the unit dual quaternion manifold $\mathbb{DH}_{1}:=\{[\mathbf{q}_r^{T}, \mathbf{q}_d^{T}]^{T} | \Vert\mathbf{q}_r \Vert=1, \mathbf{q}_r \in \mathbb{S}^3, \mathbf{q}_r^{T}\mathbf{q}_d=0\}\subset \mathbb{R}^{8}$. Furthermore, the translation vector $\mathbf{t}=[t_x,t_y,t_z]^{T}$ can be recovered from an element of $\mathbb{DH}_1$ following \citet{Li2021UnscentedDQ}, as
\begin{equation}\label{translation_recover}
    \mathbf{t} = 2[\mathbf{R}_{qr}]_{1:3}^{T}\mathbf{q}_d,
\end{equation}
where $[\mathbf{R}_{qr}]_{1:3}$ refers to the last three columns of the right multiplication matrix $\mathbf{R}_{qr}$.
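
The recovery formula in Equation~\eqref{translation_recover} can be checked with a short derivation (a sketch added for illustration, using only the definitions above). Because $\mathbf{q}_t=[0,t_x,t_y,t_z]^{T}$ has a zero first entry, the product $\mathbf{q}_t\odot\mathbf{q}_r=\mathbf{R}_{qr}\mathbf{q}_t$ only involves the last three columns of $\mathbf{R}_{qr}$. Moreover, for a unit quaternion $\mathbf{q}_r$ the matrix $\mathbf{R}_{qr}$ is orthogonal, so that
\begin{align*}
\mathbf{q}_d &= \tfrac{1}{2}\mathbf{R}_{qr}\mathbf{q}_t = \tfrac{1}{2}[\mathbf{R}_{qr}]_{1:3}\mathbf{t}, \\
2[\mathbf{R}_{qr}]_{1:3}^{T}\mathbf{q}_d &= [\mathbf{R}_{qr}]_{1:3}^{T}[\mathbf{R}_{qr}]_{1:3}\mathbf{t} = \mathbf{t}.
\end{align*}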

\subsection{Exponential Distribution of Unit Dual Quaternion}\label{subsec:expo_dualquaternion}
Unit dual quaternions $\mathbf{v}$ and $-\mathbf{v}$ denote the same transformation, because the unit quaternions $\mathbf{q}_r$ and $-\mathbf{q}_r$ represent the same rotation. We assume there exists an antipodally symmetric exponential distribution of unit dual quaternions $\mathbf{v} \in \mathbb{DH}_1$.\footnote{Generally, a distribution $f$ on the space $S$ is said to be antipodally symmetric if $f(-\mathbf{v}) =f(\mathbf{v})$ for all $\mathbf{v} \in S$, which means that opposite points on $S$ have equal probability density.}
Previous work \citep{Gilitschenski2014ANP} offered an exponential distribution for representing the orientation and position on the $SE(2)$ manifold, but it cannot be applied to $SE(3)$. In this work, we bridge this gap by developing the unit dual quaternion distribution on $SE(3)$.
\begin{definition}
    A random vector $\mathbf{v} \in \mathbb{DH}_1 \subset \mathbb{R}^{8}$ is said to follow an antipodally symmetric distribution if its probability density function has the form
    \begin{equation}\label{DistributionDef}
        f(\mathbf{v})=\frac{1}{N(\mathbf{F})}\exp(\mathbf{v}^{T}\mathbf{F}\mathbf{v}),
    \end{equation}
    where $N(\mathbf{F})$ is the normalization constant of the proposed distribution and the real symmetric matrix $\mathbf{F} \in \mathbb{R}^{8 \times 8}$ is the parameter matrix.
\end{definition}

We split the vector $\mathbf{v}=[\mathbf{q}^{T}_r, \mathbf{q}^{T}_d]^{T}$ with $\mathbf{q}_r \in \mathbb{S}^{3}$ and $\mathbf{q}_d \in \mathbb{R}^{4}$. Meanwhile, we decompose the symmetric parameter matrix $\mathbf{F}$ as follows,
\begin{align*}
\mathbf{F} = 
\begin{bmatrix}
        \mathbf{F}_1 & \mathbf{F}_2 \\
        \mathbf{F}_{2}^{T} & \mathbf{F}_3
\end{bmatrix}, \mathbf{F}_i \in \mathbb{R}^{4 \times 4}, i=1,2,3.
\end{align*}

Then the exponential distribution in Equation~\eqref{DistributionDef} can be rewritten as
\begin{equation}\label{distributionfinal}
\begin{aligned}
    f(\mathbf{v})&=\frac{1}{N(\mathbf{F})}\exp \bigl(\underbrace{\mathbf{q}_r^{T}(\mathbf{F}_1-\mathbf{F}_2\mathbf{F}_3^{-1}\mathbf{F}_2^{T})\mathbf{q}_r}_{\text{Bingham-like}}+ \\
    & \quad \underbrace{(\mathbf{q}_d+\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r)^{T}\mathbf{F}_3(\mathbf{q}_d+\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r)}_{\text{Gaussian-like}}\bigr).
\end{aligned}
\end{equation}
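
The rewriting in Equation~\eqref{distributionfinal} follows from completing the square in $\mathbf{q}_d$. As a brief sketch (added for illustration, assuming $\mathbf{F}_3$ is invertible and using $\mathbf{m}=-\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r$ as shorthand), expanding both quadratic forms with the block decomposition of $\mathbf{F}$ gives
\begin{align*}
\mathbf{v}^{T}\mathbf{F}\mathbf{v} &= \mathbf{q}_r^{T}\mathbf{F}_1\mathbf{q}_r + 2\mathbf{q}_r^{T}\mathbf{F}_2\mathbf{q}_d + \mathbf{q}_d^{T}\mathbf{F}_3\mathbf{q}_d, \\
(\mathbf{q}_d-\mathbf{m})^{T}\mathbf{F}_3(\mathbf{q}_d-\mathbf{m}) &= \mathbf{q}_d^{T}\mathbf{F}_3\mathbf{q}_d + 2\mathbf{q}_r^{T}\mathbf{F}_2\mathbf{q}_d \\
&\quad + \mathbf{q}_r^{T}\mathbf{F}_2\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r.
\end{align*}
Subtracting the second identity from the first leaves exactly $\mathbf{q}_r^{T}(\mathbf{F}_1-\mathbf{F}_2\mathbf{F}_3^{-1}\mathbf{F}_2^{T})\mathbf{q}_r$, which is the Bingham-like term.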

For this distribution, the parameter matrix $\mathbf{F}$ satisfies the following theorem, whose proof can be found in the Supplementary Material.
\begin{theorem}
    Considering the antipodally symmetric distribution (\ref{distributionfinal}), the sub-block matrix $\mathbf{F}_1\in \mathbb{R}^{4\times 4}$ is real symmetric, and $\mathbf{F}_3\in \mathbb{R}^{4\times 4}$ is real symmetric and negative definite.
\end{theorem}


\section{Symmetric Matrix F on SE(3)}\label{symmetric_matrixF}

This section gives a thorough analysis of the parameter matrix $\mathbf{F}$ in the unit dual quaternion distribution on $SE(3)$.


\subsection{Symmetric Matrix F}\label{subsec:Symmetric_matrix}
As shown in Equation~\eqref{distributionfinal}, $\mathbf{F}$ is decomposed into three sub-matrices, which are significant components of the Bingham-like distribution and the Gaussian-like distribution. 
\subsubsection{The Bingham-like Matrix}
The Bingham distribution was introduced by \citet{Bingham1974AnAS} as an analogue of the Gaussian distribution on the surface of the unit hypersphere,
\begin{equation}\label{Bingham}
    f(\mathbf{q}; \mathbf{M}, \mathbf{Z})=\frac{1}{N(\mathbf{Z})}\exp\left(\mathbf{q}^{T}\mathbf{M}\mathbf{Z}\mathbf{M}^T\mathbf{q}\right),
\end{equation}
where $\mathbf{Z}\in \mathbb{R}^{4 \times 4}$ is a diagonal matrix with ascending entries $z_1 \leq z_2 \leq z_3 \leq z_4 \leq 0$, $\mathbf{M}\in \mathbb{R}^{4\times 4}$ is an orthogonal matrix, and $N(\mathbf{Z})$ is the normalization constant. Usually, the last entry of $\mathbf{Z}$ is set to zero by exploiting the shift invariance of the Bingham distribution, namely $f(\mathbf{q};\mathbf{M},\mathbf{Z})=f(\mathbf{q}; \mathbf{M},\mathbf{Z}+c\mathbf{I})$, with $c=-z_4$.
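
This shift invariance can be seen directly (a one-line check added for illustration). Since $\mathbf{M}$ is orthogonal and $\mathbf{q}^{T}\mathbf{q}=1$,
\begin{align*}
\mathbf{q}^{T}\mathbf{M}(\mathbf{Z}+c\mathbf{I})\mathbf{M}^{T}\mathbf{q} = \mathbf{q}^{T}\mathbf{M}\mathbf{Z}\mathbf{M}^{T}\mathbf{q} + c,
\end{align*}
so the unnormalized density is only rescaled by the constant factor $e^{c}$, which is absorbed by the normalization constant.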

Here we show that the sub-vector $\mathbf{q}_r \in \mathbb{S}^{3}$ follows a Bingham distribution. The matrices $\mathbf{M}\in \mathbb{R}^{4\times 4}$ and $\mathbf{Z}\in \mathbb{R}^{4 \times 4}$ can be computed according to Theorem~\ref{BinghamTheorem}, whose proof can be found in the Supplementary Material.

\begin{theorem}\label{BinghamTheorem}
    The parameter matrix $\mathbf{F}\in \mathbb{R}^{8\times 8}$ yields an orthogonal matrix $\mathbf{M}\in \mathbb{R}^{4\times 4}$ and a diagonal matrix $\mathbf{Z}\in \mathbb{R}^{4\times 4}$ via the eigendecomposition of $\mathbf{F}_1-\mathbf{F}_2\mathbf{F}_3^{-1}\mathbf{F}_2^{T}$.
\end{theorem}

\begin{figure}[htbp]
    \centering
    \includegraphics[width=0.89\linewidth]{fig/QCQP.png}
    \caption{The differentiable QCQP for representing the rotation.} %The QCQP problem takes the parameter matrix $\mathbf{F}$ as input, and then the solution can be solved by computing the eigenvector corresponding to the minimum eigenvalue of the matrie $\mathbf{B}$.}
    \label{fig:rotation_representation}
\end{figure}




\subsubsection{The Gaussian-like Matrix}
Similarly, the parameter matrix $\mathbf{F}$ also determines the conditional distribution of $\mathbf{q}_d$ given $\mathbf{q}_r$, which is a Gaussian distribution with mean vector $-\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r$ and covariance matrix $-\frac{1}{2}\mathbf{F}_3^{-1}$,
\begin{equation}\label{Gaussian}
    \begin{aligned}
        &f(\mathbf{q}_d | \mathbf{q}_r) \propto \\
        &\quad \exp \bigl((\mathbf{q}_d+\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r)^{T}\mathbf{F}_3(\mathbf{q}_d+\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r)\bigr).
    \end{aligned}
\end{equation}
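
The stated mean and covariance can be read off by matching Equation~\eqref{Gaussian} with the standard Gaussian exponent (a short check added for illustration; the symbols $\mathbf{m}$ and $\boldsymbol{\Sigma}$ are used here only as shorthand for the mean and covariance). With $\mathbf{m}=-\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r$, requiring
\begin{align*}
(\mathbf{q}_d-\mathbf{m})^{T}\mathbf{F}_3(\mathbf{q}_d-\mathbf{m}) = -\tfrac{1}{2}(\mathbf{q}_d-\mathbf{m})^{T}\boldsymbol{\Sigma}^{-1}(\mathbf{q}_d-\mathbf{m})
\end{align*}
for all $\mathbf{q}_d$ gives $\boldsymbol{\Sigma}^{-1}=-2\mathbf{F}_3$, that is, $\boldsymbol{\Sigma}=-\frac{1}{2}\mathbf{F}_3^{-1}$. A valid covariance must be positive definite, which holds precisely when $\mathbf{F}_3$ is negative definite.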



\subsection{F as the Intermediary Representation Space}\label{subsec:F_inter_space}

As $\mathbf{F}$ is an essential element of the developed unit dual quaternion distribution, in this subsection, we show that it can be considered as the intermediary representation space for regressing the pose.

\subsubsection{Rotation Representation}
For convenience, we define $\mathbf{B} = \mathbf{F}_1-\mathbf{F}_2\mathbf{F}_3^{-1}\mathbf{F}_2^{T}$, which is taken as the representation of rotation quaternions. Since $\mathbf{F}_1$ and $\mathbf{F}_3$ are symmetric, the matrix $\mathbf{B}\in \mathbb{R}^{4\times 4}$ is real symmetric, and we assume that its minimum eigenvalue is simple. It is written as
\begin{align*}
\mathbf{B} = 
\begin{bmatrix}
    b_1 & b_2 & b_3 & b_4 \\
    b_2 & b_5 & b_6 & b_7 \\
    b_3 & b_6 & b_8 & b_9 \\
    b_4 & b_7 & b_9 & b_{10}
\end{bmatrix}.
\end{align*}

We then regard the computation of the rotation quaternion $\mathbf{q}_r$ as an optimization problem, formulated as a quadratically constrained quadratic program (QCQP) as in \citet{Yang2019AQC}.

\begin{figure}[htbp]
    \centering
    \includegraphics[width=0.96\linewidth]{fig/Translation_recover.png}
    \caption{The Gaussian process for representing the translation.} %The process takes the estimated rotation quaternions $\mathbf{q}_r$ and parameter matrix $\mathbf{F}$ as input, and then takes the mean vector $\mathbf{-\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r}$ as estimated dual part of dual quaternions, with covariance matrix $-\frac{1}{2}\mathbf{F}_3^{-1}$ being the measurement of uncertainty. Finally, the translation vector can be recovered from dual part $\mathbf{q}_d$.}
    \label{fig:translation_representation}
\end{figure}


\begin{definition}[QCQP problem]
    Let $\mathbf{B}\in \mathbb{R}^{4\times 4}$ be a symmetric matrix, which can be parameterized by the vector $\mathbf{b}\in \mathbb{R}^{10}$. The QCQP problem related to $\mathbf{B}$, illustrated in Figure~\ref{fig:rotation_representation}, is written as
    \begin{equation}\label{qcqp}
        \begin{aligned}
            &\min_{\mathbf{q}_r\in \mathbb{S}^{3}} \mathbf{q}_r^{T}\mathbf{B}\mathbf{q}_r \\
            & \text{s.t.}\quad \mathbf{q}_r^{T}\mathbf{q}_r = 1.
        \end{aligned}
    \end{equation}
\end{definition}

Note that the solution to the problem in Figure~\ref{fig:rotation_representation} is the eigenvector corresponding to the minimum eigenvalue of $\mathbf{B}$.
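
This can be verified with a standard Lagrangian argument (a sketch added for illustration; the multiplier $\mu$ is notation introduced only here). Writing the Lagrangian of Equation~\eqref{qcqp} and setting its gradient to zero gives
\begin{align*}
\mathcal{L}(\mathbf{q}_r,\mu) &= \mathbf{q}_r^{T}\mathbf{B}\mathbf{q}_r-\mu(\mathbf{q}_r^{T}\mathbf{q}_r-1), \\
\nabla_{\mathbf{q}_r}\mathcal{L} &= 2\mathbf{B}\mathbf{q}_r-2\mu\mathbf{q}_r=\mathbf{0}
\ \Longrightarrow\ \mathbf{B}\mathbf{q}_r=\mu\mathbf{q}_r.
\end{align*}
Every stationary point is therefore a unit eigenvector of $\mathbf{B}$, at which the objective equals $\mathbf{q}_r^{T}\mathbf{B}\mathbf{q}_r=\mu$. The minimum is attained at an eigenvector of the smallest eigenvalue, and both $\mathbf{q}_r$ and $-\mathbf{q}_r$ are minimizers, consistent with the antipodal symmetry of unit quaternions.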



\begin{figure*}[htbp]
    \centering
    \includegraphics[width=0.94\linewidth]{fig/network.png}
    \caption{The proposed deep dual quaternion network structure. Panels (a) and (b) correspond to the input image and the proposed deep dual quaternion network, respectively. The current input frame $I_{t}$ is fed into the proposed network, whose output is a vector $vec(\mathbf{F})$ with 36 elements that is then assembled into the matrix $\mathbf{F}$. After learning the parameter matrix $\mathbf{F}$, the rotation quaternion $\mathbf{q}_r$ and translation vector $\mathbf{t}$ are computed using the theory of the dual quaternion distribution. During training, the implicit function theorem is applied in the backpropagation procedure.}
    \label{fig:network}
\end{figure*}



\subsubsection{Translation Representation}

The parameter matrix $\mathbf{F}$ also offers a new way of representing the translation vector. As shown in Equation~\eqref{Gaussian}, the conditional distribution of $\mathbf{q}_d$ given $\mathbf{q}_r$ is Gaussian. Hence, $\mathbf{q}_d$ can be represented by its mean $\mathbf{m} = -\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r$, with the uncertainty measured by the covariance matrix $\mathbf{G} = -\frac{1}{2}\mathbf{F}_3^{-1}$. The related Gaussian process is illustrated in Figure~\ref{fig:translation_representation}. Finally, the translation can be recovered from $\mathbf{q}_d$ using Equation~\eqref{translation_recover}.


\section{Deep Learning and F}\label{sec:dl_and_F}

In this section, we develop a new probabilistic dual quaternion network to indirectly regress the pose. First, we show that $\mathbf{F}$ is a smooth representation of the pose. Next, we develop a backpropagation method using the implicit function theorem. Subsequently, we propose an uncertainty metric to measure pose uncertainties. Finally, we give the structure of the proposed network.

\subsection{Smooth Feature of F}\label{subsec:smooth_feature}

A smooth representation of $SE(3)$ is important for learning-based methods because it affects the backpropagation procedure. Here we mainly consider the smoothness of $\mathbf{F}$ on the $SO(3)$ manifold, since the translation can be computed after the rotation quaternions are estimated.

\begin{figure}[htbp]
    \centering
    \includegraphics[width=0.96\linewidth]{fig/continuous_representation.png}
    \caption{A surjective map between the representation space and the original space.} %There exists a continuous representation in $\mathbf{B}$ and rotation quaternions $\mathbf{q}_r$.}
    \label{fig:smooth_representation}
\end{figure}

We use the concept of a \textit{continuous representation} presented by \citet{Zhou2019OnTC} to analyze this property. Considering the surjective map between the representation space and the original space shown in Figure~\ref{fig:smooth_representation}, we take the matrix $\mathbf{B}$ as the representation space and the rotation quaternions $\mathbf{q}_r$ as the original space.
\citet{Zhou2019OnTC} defined the pair of mappings $(f,h)$ to be a representation if $f$ is a left inverse of $h$; the representation is continuous if $h$ is continuous. From the solution of the QCQP problem, a rotation quaternion $\mathbf{q}_r$ in the original space can be computed via the eigendecomposition of $\mathbf{B}$, namely $\mathbf{q}_r = f(\mathbf{B})$. Conversely, an element of the representation space can be deduced from $\mathbf{q}_r$, namely $\mathbf{B}=h(\mathbf{q}_r)$.
Moreover, a continuous representation is possible if the dimension of the embedding space is at least five. In this context, the representation space $\mathbf{B}$ can be parameterized by the 10-dimensional vector $\mathbf{b}\in \mathbb{R}^{10}$. We then invoke the \textit{Smooth Global Section Theorem} to show that there is a smooth mapping between the representation space and $SO(3)$, where the proof can be found in \citep{peretroukhin_so3_2020}.

\begin{theorem}[Smooth Global Section]
    Consider the surjective map $f$: $\mathbb{R}^{10}\rightarrow SO(3)$ such that $f(\mathbf{B})$ returns the rotation matrix $\mathbf{R}$ defined by the two antipodal unit quaternions $\pm \mathbf{q}_r$ by minimizing the QCQP problem. There exists a smooth and global mapping, or section, $h$: $SO(3)\rightarrow \mathbb{R}^{10}$ such that $f(h(\mathbf{R}))=\mathbf{R}$.
\end{theorem}

\subsection{Gradient Computation}\label{subsec:gradient}

The relationship between $\mathbf{q}_r$ and the data matrix $\mathbf{B}$ is defined by the objective and constraint of a mathematical optimization problem. Importantly, the derivative of $\mathbf{q}_r$ with respect to the data matrix $\mathbf{B}$ can be obtained by implicit differentiation \citep{9355027}.

Given the continuity of the proposed intermediary representation space $\mathbf{F}$, we introduce a gradient computation method for the backpropagation procedure. Recall that the matrix $\mathbf{B}\in \mathbb{R}^{4\times 4}$ is real symmetric and can therefore be parameterized by a $10$-dimensional vector $\mathbf{b}=vec(\mathbf{B})$. \citet{Magnus1985OnDE} demonstrated that $\mathbf{q}_r$ is differentiable at $\mathbf{b}$ provided that the minimum eigenvalue $\lambda_1$ of $\mathbf{B}$ is simple\footnote{We find that a non-simple minimum eigenvalue occurs rarely in our work.}. Hence, the gradient is computed using the \textit{implicit function theorem},
\begin{equation}\label{Gradient_compute}
    \frac{\partial \mathbf{q}_r}{\partial \mathbf{b}}=\mathbf{q}_r \otimes (\lambda_1\mathbf{I}-\mathbf{B})^{+},
\end{equation}
where $\otimes$ denotes the Kronecker product and $(\cdot)^{+}$ refers to the Moore-Penrose pseudo-inverse.
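
For intuition, Equation~\eqref{Gradient_compute} can be related to the differential of the eigenvalue problem (a sketch added for illustration, following the standard argument for a simple eigenvalue; the differentials $d\mathbf{B}$, $d\mathbf{q}_r$, and $d\lambda_1$ are notation introduced only here). Differentiating $\mathbf{B}\mathbf{q}_r=\lambda_1\mathbf{q}_r$ and $\mathbf{q}_r^{T}\mathbf{q}_r=1$ gives
\begin{align*}
(\lambda_1\mathbf{I}-\mathbf{B})\,d\mathbf{q}_r &= (d\mathbf{B}-d\lambda_1\mathbf{I})\,\mathbf{q}_r, \qquad \mathbf{q}_r^{T}d\mathbf{q}_r = 0, \\
d\mathbf{q}_r &= (\lambda_1\mathbf{I}-\mathbf{B})^{+}\,d\mathbf{B}\,\mathbf{q}_r,
\end{align*}
where the second line uses $(\lambda_1\mathbf{I}-\mathbf{B})^{+}\mathbf{q}_r=\mathbf{0}$ and the fact that $d\mathbf{q}_r$ is orthogonal to $\mathbf{q}_r$. Vectorizing with the identity $vec(\mathbf{A}\mathbf{X}\mathbf{c})=(\mathbf{c}^{T}\otimes\mathbf{A})\,vec(\mathbf{X})$ yields the Jacobian $\mathbf{q}_r^{T}\otimes(\lambda_1\mathbf{I}-\mathbf{B})^{+}$ with respect to the full $16$-entry vectorization of $\mathbf{B}$. Because the pseudo-inverse is symmetric, this is the transpose of the expression in Equation~\eqref{Gradient_compute}, so the two differ only in the layout convention; the derivative with respect to the $10$ free entries then follows from the chain rule.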

\subsection{Uncertainty Measurement}\label{subsec:uncertainty_measure}
An uncertainty metric is needed to quantify pose uncertainties. Intuitively, we consider the proposed unit dual quaternion distribution as the composition of the Bingham distribution $B(\mathbf{q}_r; \mathbf{B})$ and the Gaussian distribution $G(\mathbf{q}_d; -\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r, -\frac{1}{2}\mathbf{F}_3^{-1})$.

For the rotation uncertainty, the Bingham belief is a suitable choice \citep{peretroukhin_so3_2020}. Using the properties of the Bingham distribution in Equation~\eqref{Bingham}, the rotation uncertainty is written as
\begin{equation}\label{uncertainty_rotation}
    U_q(\mathbf{Z}) = \sum_{i=1}^{3}(z_i - z_4) = z_1+z_2+z_3-3z_4,
\end{equation}
where $z_i\leq 0, i=1,2,3,4$.

Likewise, we use the Gaussian belief to measure the translation uncertainty. We decompose the covariance matrix $\mathbf{G} = -\frac{1}{2}\mathbf{F}_3^{-1}$ by eigendecomposition, and the translation uncertainty is then written as
\begin{equation}\label{uncertainty_trans}
    U_t(\mathbf{G})=\sum_{i=1}^{4}\lambda_i = \lambda_1+\lambda_2+\lambda_3+\lambda_4,
\end{equation}
where $\lambda_i$, $i=1,2,3,4$, denote the eigenvalues of $\mathbf{G}$.
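
Because the sum of the eigenvalues of a square matrix equals its trace, Equation~\eqref{uncertainty_trans} can equivalently be evaluated without an explicit eigendecomposition. A short worked form, which uses only the definition of $\mathbf{G}$ given above, is
\begin{equation*}
    U_t(\mathbf{G}) = \mathrm{tr}(\mathbf{G}) = -\frac{1}{2}\,\mathrm{tr}\!\left(\mathbf{F}_3^{-1}\right).
\end{equation*}
This is offered as an equivalent reformulation rather than as a description of the implementation. A larger value indicates a more spread-out Gaussian belief over $\mathbf{q}_d$.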

\begin{figure}[htbp]
    \centering
    \includegraphics[width=0.97\linewidth]{fig/NormF.png}
    \caption{Effect of normalizing $vec(\mathbf{F})$ during the learning phase, using the ShopFacade scene of the Cambridge Landmarks dataset as an example.
    The orange line shows the pose error curves without normalizing $vec(\mathbf{F})$, while the blue line shows the pose error curves after normalizing $vec(\mathbf{F})$. o.$vec(\mathbf{F})$/w.$vec(\mathbf{F})$: without/with normalization of $vec(\mathbf{F})$.}
    \label{fig:normf}
\end{figure}

\subsection{Network Structure}\label{subsec:network}
The structure of the proposed deep probabilistic dual quaternion network is shown in Figure~\ref{fig:network}. We use ResNet-50 as the backbone network, with parameters initialized from pre-trained ImageNet weights. Two fully connected layers are then appended to the ResNet activations to regress the parameter matrix $\mathbf{F}$. Since $\mathbf{F}$ is real symmetric, we can encode it with a $36$-dimensional feature vector. Subsequently, we decompose $\mathbf{F}$ into two branches, namely a rotation branch and a translation branch. For the rotation branch, we estimate the rotation quaternion by solving the QCQP problem shown in Figure~\ref{fig:rotation_representation}. For the translation branch, we find that directly using the estimated rotation quaternion is not sufficient to obtain an accurate translation vector. We therefore normalize the output vector $vec(\mathbf{F})$ to generate a pseudo matrix $\mathbf{B'}$; the effect of this normalization is shown in Figure~\ref{fig:normf}. A second rotation quaternion is then estimated to compose the mean vector $\mathbf{m}=-\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r$. During training, the implicit function theorem is applied in the backpropagation procedure (Section~\ref{subsec:gradient}). Finally, the translation vector $\mathbf{t}$ is recovered from $\mathbf{m}$.
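
The dimensions quoted above follow from counting the free entries of a symmetric matrix. As a small worked check, assume $\mathbf{F}\in\mathbb{R}^{8\times 8}$ (an assumption inferred here from the $4$-dimensional quaternion blocks) and recall that an $n\times n$ symmetric matrix has $n(n+1)/2$ independent entries:
\begin{equation*}
    \frac{8\cdot 9}{2} = 36, \qquad \frac{4\cdot 5}{2} = 10,
\end{equation*}
which match the $36$-dimensional output encoding $\mathbf{F}$ and the $10$-dimensional vector $\mathbf{b}$ encoding $\mathbf{B}$. For the final recovery step, a common unit dual quaternion convention (assumed here for illustration; the ordering of the product depends on the convention adopted in the paper) is $\mathbf{q}_d=\frac{1}{2}\,\mathbf{t}\odot\mathbf{q}_r$, where $\odot$ denotes the quaternion product, $\mathbf{t}$ is treated as a pure quaternion, and $\mathbf{q}_r^{*}$ is the conjugate of $\mathbf{q}_r$. Under that assumption, taking $\mathbf{q}_d=\mathbf{m}$ gives
\begin{equation*}
    \mathbf{t} = 2\,\mathbf{m}\odot\mathbf{q}_r^{*}.
\end{equation*}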


\begin{figure}[htbp]
    \centering
    \includegraphics[width=0.97\linewidth]{fig/Cambridge/SevenScene.png}
    \caption{The rotation and translation error curves of each scene during testing on the $7$-Scenes dataset.}
    \label{fig:test_curves_7scene}
\end{figure}

Additionally, we adopt a loss function similar to the one presented by \citet{Kendall2015PoseNetAC}\footnote{Unit quaternions $-\mathbf{q}$ and $\mathbf{q}$ represent the same rotation. Hence, the quaternion term $\Vert\mathbf{q}_{g}-\mathbf{q}_{e}\Vert_2$ in Equation~\eqref{lossfunction} is evaluated as $\min (\Vert\mathbf{q}_{g}-\mathbf{q}_{e}\Vert_2, \Vert\mathbf{q}_{g}+\mathbf{q}_{e}\Vert_2)$.},
\begin{equation}\label{lossfunction}
    L = \Vert \mathbf{t}_{g}-\mathbf{t}_{e}\Vert_{2} + \alpha \Vert \mathbf{q}_{g}-\mathbf{q}_{e}\Vert_2,
\end{equation}
where $\mathbf{t}_{e}$ and $\mathbf{q}_e$ denote the inferred translation and rotation, while $\mathbf{t}_g$ and $\mathbf{q}_g$ are the labeled translation and rotation. Moreover, we set the scale factor $\alpha = 100$ on the $7$-Scenes dataset and $\alpha=300$ on the Cambridge Landmarks dataset.
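
For unit quaternions, the minimum in the footnote has a simple closed form. The following worked step is an illustration that follows directly from $\Vert\mathbf{q}_g\Vert_2=\Vert\mathbf{q}_e\Vert_2=1$, and is not a description of the implementation:
\begin{align*}
    \Vert\mathbf{q}_{g}\mp\mathbf{q}_{e}\Vert_2^{2} &= 2 \mp 2\,\mathbf{q}_g^{T}\mathbf{q}_e, \\
    \min\left(\Vert\mathbf{q}_{g}-\mathbf{q}_{e}\Vert_2,\ \Vert\mathbf{q}_{g}+\mathbf{q}_{e}\Vert_2\right) &= \sqrt{2-2\,\vert\mathbf{q}_g^{T}\mathbf{q}_e\vert}.
\end{align*}
The quaternion term therefore ranges from $0$, when $\mathbf{q}_e=\pm\mathbf{q}_g$, to $\sqrt{2}$, when the two quaternions are orthogonal.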



\section{Experiments}\label{experiment}

In this section, we perform absolute pose regression for the task of camera re-localization on two public datasets, namely Cambridge Landmarks \citep{Kendall2015PoseNetAC} and $7$-Scenes \citep{Shotton2013SceneCR}, which consist of RGB frames with associated ground-truth camera poses and provide training as well as test sequences. First, we compare against state-of-the-art pose regression methods to demonstrate the accuracy of the inferred poses.
Then we evaluate our deep probabilistic model on the noisy Cambridge Landmarks dataset to show its robustness in dealing with uncertainty and ambiguity.

\begin{table*}[htbp]
  \centering
  \caption{Evaluation on the $7$-Scenes dataset. The results are reported as the median translation error (m) and the median rotation error ($^\circ$). The best results are in \textbf{bold}.}\label{tab:7scene}
  \scalebox{0.95}{
\begin{tabular}{cccccccc}
\hline
Scene         & Chess          & Fire            & Heads           & Office         & Pumpkin        & RedKitchen        & Stairs          \\ \hline
PoseNet       & 0.32m/8.12$^\circ$ & 0.47m/14.4$^\circ$ & 0.29m/12.0$^\circ$  & 0.48m/7.68$^\circ$ & 0.47m/8.42$^\circ$ & 0.59m/8.64$^\circ$ & 0.47m/13.8$^\circ$  \\
Dense PoseNet & 0.32m/6.60$^\circ$ & 0.47m/14.0$^\circ$  & 0.30m/12.2$^\circ$  & 0.48m/7.24$^\circ$ & 0.49m/8.12$^\circ$ & 0.58m/8.34$^\circ$ & 0.48m/13.1$^\circ$  \\
MapNet        & \textbf{0.08m}/3.25$^\circ$ & 0.27m/11.69$^\circ$ & 0.18m/13.2$^\circ$  & \textbf{0.17m}/5.15$^\circ$ & 0.22m/4.02$^\circ$ & 0.23m/4.93$^\circ$ & 0.30m/12.08$^\circ$ \\
MapNet++      & 0.10m/3.17$^\circ$ & \textbf{0.20m}/9.04$^\circ$  & 0.13m/11.1$^\circ$  & 0.18m/5.38$^\circ$ & \textbf{0.19m}/3.92$^\circ$ & 0.20m/5.01$^\circ$ & 0.30m/13.4$^\circ$  \\
BPN           & 0.37m/7.24$^\circ$ & 0.43m/13.7$^\circ$  & 0.31m/12.0$^\circ$  & 0.48m/8.04$^\circ$ & 0.61m/7.54$^\circ$ & 0.58m/7.54$^\circ$ & 0.48m/13.1$^\circ$  \\
VidLoc        & 0.18m/-       & 0.26m/-        & 0.14m/-        & 0.26m/-       & 0.36m/-       & 0.31m/-       & \textbf{0.26m}/-        \\
UBN           & 0.10m/4.97$^\circ$ & 0.27m/12.87$^\circ$ & \textbf{0.12m}/14.05$^\circ$ & 0.20m/7.52$^\circ$ & 0.23m/7.11$^\circ$ & \textbf{0.19m}/8.25$^\circ$ & 0.28m/13.1$^\circ$  \\
MBN-MB        & 0.10m/4.35$^\circ$ & 0.28m/11.86$^\circ$ & \textbf{0.12m}/12.76$^\circ$ & 0.19m/6.55$^\circ$ & 0.22m/6.9$^\circ$  & 0.21m/8.08$^\circ$ & 0.31m/9.98$^\circ$  \\
Ours          & 0.20m/\textbf{2.9$^\circ$}  & 0.30m/\textbf{5.63$^\circ$}  & 0.19m/\textbf{6.53$^\circ$}  & 0.30m/\textbf{3.51$^\circ$} & 0.28m/\textbf{2.6$^\circ$}  & 0.40m/\textbf{3.6$^\circ$}  & 0.42m/\textbf{6.23$^\circ$}  \\ \hline
\end{tabular}}
\end{table*}

\subsection{Training Details}
We run our experiments in the PyTorch framework \citep{paszke2019pytorch}. We use the Adam optimizer \citep{DBLP:journals/corr/KingmaB14} with an initial learning rate of $10^{-4}$, which is decayed exponentially with a multiplicative factor of $0.9$. We use a batch size of $16$ and train for $100$ epochs on the $7$-Scenes dataset and $200$ epochs on the Cambridge Landmarks dataset. All input frames are resized to $224\times 224$.
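
Read literally, the schedule above is a geometric decay. Writing $\eta_k$ for the learning rate after $k$ decay steps (the decay interval, for instance one step per epoch, is not stated here and is an assumption of this illustration), we have
\begin{equation*}
    \eta_k = 10^{-4}\times 0.9^{k}, \qquad \text{e.g.}\quad \eta_{10} \approx 3.49\times 10^{-5}.
\end{equation*}
If the decay were applied once per epoch, the learning rate after the $100$ epochs used for the $7$-Scenes dataset would be approximately $2.7\times 10^{-9}$.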

\subsection{Results}
\begin{figure}[htbp]
    \centering
    \includegraphics[width=0.97\linewidth]{fig/Cambridge/Cambridge.png}
    \caption{The rotation and translation error curves of each scene during testing on the Cambridge Landmarks dataset.}
    \label{fig:cambridge}
\end{figure}
\subsubsection{Normal Scenes}
\textbf{7-Scenes Dataset}. We test our model on all $7$ scenes of the $7$-Scenes dataset. Since the majority of the scenes do not contain highly ambiguous environments, we regard them as non-ambiguous. The test curves of the $7$ scenes can be found in Figure~\ref{fig:test_curves_7scene}. After convergence, the overall rotation error is less than $7^\circ$ and the overall translation error is less than $0.5$\,m.

To demonstrate the pose accuracy of our model, we compare it with other pose regression methods, including PoseNet and its variant Dense PoseNet \citep{Kendall2015PoseNetAC}, MapNet and its variant MapNet++ \citep{brahmbhatt2018geometry}, BPN \citep{7487679}, VidLoc \citep{clark2017vidloc}, and UBN and MBN-MB \citep{deng2020deep}. The quantitative results are listed in Table~\ref{tab:7scene}.

\begin{table*}[htbp]
  \centering
  \caption{Evaluation on the Cambridge Landmarks dataset. The results are reported as the median translation error (m) and the median rotation error ($^\circ$). The best results are in \textbf{bold}.}\label{tab:Cambridge}
  \scalebox{0.95}{
\begin{tabular}{ccccc}
\hline
Scene         & Kings College      & Hospital              & ShopFacade         & St.Mary Church     \\ \hline
PoseNet       & 1.92m/5.40$^\circ$ & 2.31m/5.38$^\circ$   & 1.46m/8.08$^\circ$ & 2.65m/8.48$^\circ$ \\
Dense PoseNet & 1.66m/4.86$^\circ$ & 2.62m/4.90$^\circ$   & 1.41m/7.18$^\circ$ & 2.45m/7.96$^\circ$ \\
MapNet        & 1.07m/1.89$^\circ$ & 1.94m/3.91$^\circ$   & 1.49m/4.22$^\circ$ & 2.0m/4.53$^\circ$  \\
BPN           & 1.74m/4.06$^\circ$ & 2.57m/5.12$^\circ$   & 1.25m/7.54$^\circ$  & 2.11m/8.38$^\circ$ \\
UBN           & 0.88m/1.77$^\circ$  & \textbf{1.93m}/3.71$^\circ$   & \textbf{0.8m}/4.74$^\circ$  & 1.84m/6.19$^\circ$ \\
MBN-MB        & \textbf{0.83m}/2.08$^\circ$ & 2.16m/3.64$^\circ$ & 0.92m/4.93$^\circ$ & \textbf{1.37m}/6.03$^\circ$ \\
Ours          & 1.20m/\textbf{0.84$^\circ$} & 2.46m/\textbf{1.72$^\circ$}   & 1.10m/\textbf{2.51$^\circ$}   & 2.40m/\textbf{2.63$^\circ$}   \\ \hline
\end{tabular}}
\end{table*}



From Table~\ref{tab:7scene}, the results on the $7$-Scenes dataset show that our method outperforms all compared methods in rotation accuracy. Its translation accuracy, however, is somewhat worse than that of the strongest baselines, since the translation part is computed from the estimated rotation quaternion and the intermediary space $\mathbf{F}$, so that any error in the estimated rotation propagates to, and can be amplified in, the translation estimate. Despite this, our method remains competitive in translation accuracy: it achieves a lower median translation error than PoseNet, Dense PoseNet, and BPN in every scene. The same pattern of strong rotation accuracy and weaker translation accuracy is observed on the Cambridge Landmarks dataset.
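
The amplification argument can be made explicit. Treating $\mathbf{F}_2$ and $\mathbf{F}_3$ as fixed and writing $\delta\mathbf{q}_r$ for an error in the estimated rotation quaternion (notation introduced here for illustration), the mean $\mathbf{m}=-\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\mathbf{q}_r$ used to recover the translation is perturbed by
\begin{equation*}
    \delta\mathbf{m} = -\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\,\delta\mathbf{q}_r, \qquad \Vert\delta\mathbf{m}\Vert_2 \leq \Vert\mathbf{F}_3^{-1}\mathbf{F}_2^{T}\Vert_2\,\Vert\delta\mathbf{q}_r\Vert_2 .
\end{equation*}
A rotation error can thus be scaled by up to the spectral norm of $\mathbf{F}_3^{-1}\mathbf{F}_2^{T}$ before it enters the translation estimate. This sketch holds for fixed $\mathbf{F}_2$ and $\mathbf{F}_3$ and ignores errors in the regressed blocks themselves.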



\textbf{Cambridge Landmarks Dataset}. To further demonstrate the pose accuracy on different scenes, we also evaluate our approach on the Cambridge Landmarks dataset. We select Kings College, Hospital, ShopFacade, and St.Mary Church as our evaluation scenes. Again, we plot the test curves of the four scenes in Figure~\ref{fig:cambridge}. The converged results show that the overall rotation error is less than $3^\circ$ and the overall translation error is no more than about $3$\,m.

Next, we list our final pose accuracy in Table~\ref{tab:Cambridge} for comparison with state-of-the-art methods. As on the $7$-Scenes dataset, our method achieves the most accurate rotation estimates in all four scenes, while its translation error is higher than that of the best baselines.

\subsubsection{Noisy Scenes}
To further demonstrate the performance of our method in dealing with uncertainty and ambiguity, we evaluate our model on four noisy scenes, namely Kings College, Hospital, ShopFacade, and St.Mary Church. To simulate different ambiguous environments, all frames of these scenes are processed by applying a Gaussian blur kernel, by randomly changing the brightness, contrast, and saturation, or by applying both perturbations.


Without retraining the proposed deep probabilistic dual quaternion network, we directly feed the processed frames into the trained model to predict the camera pose. The quantitative results are shown in Table~\ref{tab:noisy_scenes}. Compared with the normal scenes, the rotation error increases by at most $0.59^\circ$ and the translation error by at most $0.83$\,m, both in the Hospital scene under combined blur and brightness changes.
\begin{table*}[htbp]
\centering
\caption{The quantitative results of the proposed deep probabilistic dual quaternion network in the noisy scenes on the Cambridge Landmarks dataset.}\label{tab:noisy_scenes}
\scalebox{0.96}{
\begin{tabular}{ccccc}
\hline
Scene              & Kings College      & Hospital              & ShopFacade         & St.Mary Church     \\ \hline
Normal             & 1.20m/0.84$^\circ$ & 2.46m/1.72$^\circ$   & 1.10m/2.51$^\circ$ & 2.4m/2.63$^\circ$  \\
Blur               & 1.42m/0.92$^\circ$  & 2.59m/1.74$^\circ$   & 1.12m/2.53$^\circ$ & 2.60m/2.65$^\circ$ \\
Brightness         & 1.55m/1.00$^\circ$ & 3.07m/2.14$^\circ$ & 1.30m/2.61$^\circ$ & 3.00m/2.60$^\circ$ \\
Blur \& Brightness & 1.76m/1.22$^\circ$ & 3.29m/2.31$^\circ$   & 1.38m/2.86$^\circ$ & 3.13m/2.69$^\circ$ \\ \hline
\end{tabular}}
\end{table*}

\begin{figure*}[htbp]
    \centering
    \includegraphics[width=0.97\linewidth]{fig/uncertainty/uncertainty.png}
    \caption{Uncertainty evaluation on the Cambridge Landmarks dataset. The left column shows the pose errors under the pose uncertainty metric in the blur environment, where the radius of the Gaussian blur kernel is $3.8$. The middle column shows the pose errors under the pose uncertainty metric in the random brightness change environment, where the maximum brightness factor is $0.6$, the maximum contrast factor is $0.6$, and the maximum saturation factor is $0.5$. The right column shows the pose errors in the combined blur and brightness change environment. Note that we only plot the St.Mary Church scene; results for the other scenes can be found in the Supplementary Material.}
    \label{fig:uncertainty}
\end{figure*}

We then measure the uncertainties of our model under normal and noisy environments, as shown in Figure~\ref{fig:uncertainty}. The pose errors in the blur environment (purple points) are distributed similarly to those in the original environment (red points) under both the rotation and the translation uncertainty measures. The pose errors in the brightness change environment (blue points) show some minor differences, especially for the rotation uncertainty. The pose errors in the combined blur and brightness change environment (orange points) follow a similar tendency, which we consider reasonable for noisy environments. More importantly, only a few points fall outside the original distribution in the latter two environments, and they have a limited effect on the overall pose accuracy. Furthermore, the rotation accuracy of our method in noisy environments (Table~\ref{tab:noisy_scenes}) still exceeds that of the compared state-of-the-art methods in normal environments (Table~\ref{tab:Cambridge}). These results suggest that our model is robust to uncertainty and ambiguity.


\section{Conclusion}\label{sec:conclusion}
We design a deep probabilistic dual quaternion network that addresses the absolute pose regression problem on $SE(3)$. Unlike existing work, we take pose uncertainties into consideration by introducing an antipodally symmetric distribution over unit dual quaternions on $SE(3)$. To address the mapping problem from the Euclidean space $\mathbb{R}^n$ to the $SE(3)$ manifold, we present an intermediary differentiable representation space $\mathbf{F}$ as the output of our model to indirectly regress poses. Additionally, we introduce a backpropagation method for batch optimization. Experimental results on the camera re-localization task on the $7$-Scenes and Cambridge Landmarks datasets show that our method outperforms state-of-the-art methods in rotation accuracy while remaining competitive in translation accuracy. Moreover, experiments on noisy scenes of the Cambridge Landmarks dataset show that our method is able to deal with uncertainty and ambiguity.

In the future, we will explore the absolute pose regression problem leveraging our representation with a negative log-likelihood loss function to improve the reliability and robustness of our model in real-world applications. 


\begin{acknowledgements} % will be removed in pdf for initial submission,
                         % so you can already fill it to test with the
                         % ‘accepted’ class option
    This research is financially supported by the National Natural Science Foundation of China (No. 62072231), Fundamental Research Funds for the Central Universities (No. 14380079), and the Collaborative Innovation Center of Novel Software Technology and Industrialization. Jia Liu (jialiu@nju.edu.cn) and Lijun Chen (chenlj@nju.edu.cn) are the corresponding authors. 

    %\emph{All} acknowledgements go in this section.
\end{acknowledgements}

\bibliography{li_103}

\appendix

\end{document}
