\appendix
\newpage
\section{Implementation details}
\label{app:impl}

\paragraph{Backbone and optimization.}
Our implementation builds on a MonoGS-style 3D Gaussian mapping backbone with
the same differentiable rasterizer and optimizer.
Unless otherwise noted, we use the same hyperparameters for all sequences:
Adam with a fixed learning rate, per-primitive opacity and covariance
regularizers, and the depth and photometric losses described in
Section~\ref{sec:gaussian_map}.
We process frames in temporal order and run a fixed number of gradient steps
per keyframe; non-keyframes are used only to update the centerline, Bishop
frame, and coverage counters.

\paragraph{Hyperparameters.}
Table~\ref{tab:hyperparams} summarizes the key centerline, keyframing, and
prior parameters used throughout. Optimizer and batching details are given in
Appendix~\ref{app:protocol}.

\begin{table}[h]
\centering
\caption{Core hyperparameters, shared across all sequences.}
\setlength{\tabcolsep}{4pt}
\begin{tabular}{lr}
\toprule
\textbf{Parameter} & \textbf{Value} \\
\midrule
Keyframe spacing $\tau_{\mathrm{KF}}$ (mm) & 20 \\
$\lambda_{\text{radial}}$, $\lambda_{\text{curv}}$ & 0.2,\ 0.02 \\
Robust penalty $\rho(\cdot)$ & Huber($\delta$) \\
$r_{\text{wall}}$ (mm) & 35 \\
\midrule
Backbone spacing $d_{\min}$ (mm) & 5 \\
Max bend angle $\theta_{\max}$ (deg) & 120 \\
Spline degree & 3 (cubic) \\
Spline refit cadence $K_{\text{refit}}$ (points) & 5 \\
Centerline sampling step $\Delta s$ (mm) & 1 \\
\bottomrule
\end{tabular}
\label{tab:hyperparams}
\end{table}

\paragraph{Centerline update and thresholds.}
The backbone points used to define the centerline are updated online as in
Section~\ref{sec:centerline_bishop}.
We trigger a new backbone point when the camera center has moved at least
$d_{\min}$ along the trajectory, the incremental bend angle is below
$\theta_{\max}$, and the candidate is more than $d_{\text{loop}}$ from
non-neighbor points (loop avoidance).
We refit the centerline B-spline every $K_{\text{refit}}$ new backbone points
and resample at arc-length step $\Delta s$; the values of $d_{\min}$,
$\theta_{\max}$, $K_{\text{refit}}$, and $\Delta s$ are listed in
Table~\ref{tab:hyperparams}. The refit is inexpensive compared to Gaussian
optimization.
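
As an illustrative formalization of the trigger rule (the symbols $c_t$,
$b_k$, $\phi_t$, and $\mathcal{N}(k)$ are shorthand introduced only for this
sketch), let $c_t$ be the current camera center, $b_0,\dots,b_k$ the backbone
points accepted so far, $\phi_t$ the angle between the segments
$b_k - b_{k-1}$ and $c_t - b_k$, and $\mathcal{N}(k)$ the indices treated as
neighbors of $b_k$ (including $k$ itself). Assuming that displacement is
measured from the last accepted backbone point, a candidate is accepted when
\[
  \|c_t - b_k\|_2 \ge d_{\min}
  \;\wedge\;
  \phi_t < \theta_{\max}
  \;\wedge\;
  \min_{j \notin \mathcal{N}(k)} \|c_t - b_j\|_2 > d_{\text{loop}} .
\]
With the values in Table~\ref{tab:hyperparams}, the first two conditions
require a displacement of at least $5$\,mm and a bend below $120^\circ$.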

\paragraph{Keyframing and continuous coverage updates.}
At each frame we update the arc-length accumulator $A_t$ and, when a keyframe
is triggered, we (i) add the current frame to the optimization set,
(ii) freeze the current coverage statistics for that frame, and (iii) reset
$A_t$.
Coverage counters are updated for every frame, not just keyframes: for the
current camera pose we identify the closest centerline sample and increment
coverage scores for bins within a small arc-length neighborhood of this
sample and within a viewing cone of the camera ray.
This allows coverage maps to be displayed continuously during the procedure
without waiting for optimization to converge.
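
One plausible way to write the accumulator logic, stated here as an assumption
rather than as the exact rule: with $s_t$ the arc-length coordinate of the
centerline sample closest to the camera at frame $t$,
\[
  A_t = A_{t-1} + |s_t - s_{t-1}|,
  \qquad
  \text{keyframe at } t \iff A_t \ge \tau_{\mathrm{KF}},
\]
followed by the reset $A_t \leftarrow 0$. Under this reading,
$\tau_{\mathrm{KF}} = 20$\,mm yields roughly $L/\tau_{\mathrm{KF}}$ keyframes
over a monotone traversal of length $L$; for a hypothetical $L = 300$\,mm
segment this is about $15$ keyframes.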

\paragraph{Coverage view cone from intrinsics.}
The coverage viewing cone is defined by the camera intrinsics and image
bounds: a candidate surface direction is considered observable if it projects
inside the image under the intrinsics and lies within the
distance and radial-band thresholds listed in
Table~\ref{tab:coverage_params_app}. The effective angular extent is therefore
determined by the camera field of view, not a learned parameter.
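
For concreteness, the projection test can be sketched under an undistorted
pinhole model (an assumption; lens distortion, if present, changes the
projection). Let $(x,y,z)^\top$ be a candidate point in the camera frame, let
$K$ have focal lengths $f_x, f_y$ and principal point $(c_x, c_y)$, and let
the image be $W \times H$ pixels; these symbols are introduced only for this
sketch. Then
\[
  u = f_x \frac{x}{z} + c_x, \qquad v = f_y \frac{y}{z} + c_y,
\]
and the point lies inside the view cone when
\[
  0 < z \le d_{\max}, \qquad 0 \le u < W, \qquad 0 \le v < H,
\]
with the radial-band test applied in addition. For a centered principal point
the horizontal half-angle of the cone is $\arctan\bigl(W / (2 f_x)\bigr)$.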

\section{Experimental protocol and evaluation details}
\label{app:protocol}

\paragraph{Hardware and measurement.}
All methods are run on the same GPU with identical input streams (RGB,
predicted depth, and fixed poses). Runtime is reported as effective FPS, i.e.,
the number of frames in a sequence divided by the total wall-clock time needed
to process it.

\paragraph{Optimization and batching.}
We optimize per scene over the full sequence using only keyframes for
back-end Gaussian updates. All optimizer settings (learning rate, schedules,
and regularizer weights) are fixed across all sequences and are included in
the released configuration.

\section{Chamfer distance and alignment}
\label{app:chamfer}

Although both the reconstruction and the phantom CAD mesh are expressed in
millimeters, small residual calibration mismatches remain between coordinate
frames. Before computing surface error we rigidly align the reconstruction to
the phantom mesh with Iterative Closest Point (ICP), then compute a
one-directional Chamfer distance from the aligned reconstruction to the
phantom surface.

\paragraph{Rigid alignment.}
For each sequence we extract a reconstructed surface point cloud
$\mathcal{P}_{\text{traj}} = \{x_i\}$ from the Gaussian map and sample points
$\mathcal{P}_{\text{obj}} = \{y_j\}$ from the phantom OBJ mesh. We run rigid
ICP (point-to-plane) to find
\[
  T_{\text{ICP}}
  =
  \arg\min_{T \in SE(3)}
  \sum_i \mathrm{dist}\bigl(T x_i,\, \mathcal{P}_{\text{obj}}\bigr)^2,
\]
and transform all reconstruction points by $T_{\text{ICP}}$. Camera poses are
transformed by the same $T_{\text{ICP}}$ so that their rays intersect the
phantom mesh consistently with the aligned reconstruction.
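
For reference, point-to-plane ICP usually instantiates
$\mathrm{dist}(\cdot,\cdot)$ as the distance along the target surface normal.
A sketch, assuming $y_{j(i)}$ is the current nearest neighbor of $T x_i$ in
$\mathcal{P}_{\text{obj}}$ and $n_{j(i)}$ is the unit mesh normal at that point
(both symbols are introduced only for this sketch):
\[
  \mathrm{dist}\bigl(T x_i,\, \mathcal{P}_{\text{obj}}\bigr)
  \approx
  \bigl| n_{j(i)}^\top \bigl( T x_i - y_{j(i)} \bigr) \bigr| ,
\]
with the correspondences $j(i)$ recomputed after each update of $T$ until the
alignment stops changing.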

\paragraph{Visible phantom surface.}
We restrict evaluation to the subset of the phantom surface actually
observable given the camera trajectories. For every pixel $(u,v)$ in each
frame with intrinsics $K$ and pose $T_t$ we: (i) back-project to a 3D ray
using $K^{-1}$, (ii) transform into world coordinates using $T_t$, and
(iii) compute the first intersection with the phantom mesh. Collecting all
successful intersections gives $\mathcal{P}_{\text{hit}} = \{z_k\}$, the set
of phantom points actually seen by the cameras. Rays are generated from the
input intrinsics $K$ with a fixed pixel stride of 2, applied identically
across all methods.
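
The ray construction in steps (i) and (ii) can be written as follows,
assuming $T_t = [R_t \mid o_t]$ maps camera to world coordinates with rotation
$R_t$ and camera center $o_t$ (symbols introduced only for this sketch):
\[
  d_{uv} = K^{-1} (u, v, 1)^\top,
  \qquad
  r_{uv}(\lambda) = o_t + \lambda\, R_t\, d_{uv}, \quad \lambda > 0,
\]
and $z_k$ is the point $r_{uv}(\lambda^\ast)$ at the smallest $\lambda^\ast > 0$
for which the ray meets the mesh. With a pixel stride of 2, a $W \times H$
image produces $\lceil W/2 \rceil \lceil H/2 \rceil$ rays per frame, about one
quarter of the pixels.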

\paragraph{One-sided surface error.}
Our Chamfer distance is a one-sided RMS from the reconstructed surface to the
visible phantom surface:
\[
  \text{CD} =
  \sqrt{
    \frac{1}{|\mathcal{P}_{\text{traj}}|}
    \sum_{x_i \in \mathcal{P}_{\text{traj}}}
      \min_{z_k \in \mathcal{P}_{\text{hit}}}
      \bigl\| x_i - z_k \bigr\|_2^2
  },
\]
where $\mathcal{P}_{\text{traj}}$ denotes reconstruction points after applying
$T_{\text{ICP}}$. We do not compute the reverse direction because parts of the
phantom may never be visible; including them would penalize methods for failing
to reconstruct unseen surfaces. All CD values are in millimeters using this
definition with identical settings for all methods. Where evaluation is
restricted to a radial band $r \in [r_{\min}, r_{\max}]$, the same band is
applied consistently to both sampling and evaluation.
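
As a small worked example with made-up numbers, suppose three reconstructed
points have nearest-neighbor distances of $3$, $4$, and $5$\,mm to
$\mathcal{P}_{\text{hit}}$. Then
\[
  \text{CD} = \sqrt{\frac{3^2 + 4^2 + 5^2}{3}}
            = \sqrt{\frac{50}{3}}
            \approx 4.08\ \text{mm},
\]
slightly above the arithmetic mean of $4$\,mm, because the RMS form weights
larger deviations more heavily.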

\section{Coverage oracle and metrics}
\label{app:coverage_oracle}

To assess the accuracy of our online geometric coverage scores we construct a
visibility oracle from the phantom mesh and ground-truth poses, using the same
camera intrinsics $K$ and thus the same viewing angles.

\paragraph{Coverage parameters.}
Table~\ref{tab:coverage_params_app} lists the thresholds and binning used to
compute the unrolled $(s,\theta)$ coverage map and segment summaries.

\begin{table}[t]
\centering
\caption{Geometric coverage parameters used for
Fig.~\ref{fig:coverage} and all reported coverage summaries.}
\setlength{\tabcolsep}{4pt}
\begin{tabular}{lr}
\toprule
\textbf{Parameter} & \textbf{Value} \\
\midrule
View cone & camera frustum from $K$ \\
Max distance $d_{\max}$ (mm) & 100 \\
$r_{\text{wall}}$ (mm) & 35 \\
Arc-length neighborhood $\Delta s_{\text{cov}}$ (mm) & 20 \\
Radial band for wall evidence (mm) & $[25, 50]$ \\
$s$ binning & normalized by arc length \\
$\theta$ binning & 72 bins ($5^\circ$ each) \\
Per-frame update & $+1$ per visible Gaussian, $r\!\in\![25,50]$,
depth $\le d_{\max}$ \\
$\theta$ wrap-around & circular bins \\
\bottomrule
\end{tabular}
\label{tab:coverage_params_app}
\end{table}

\paragraph{Online coverage scores.}
At each frame we identify a local arc-length neighborhood around the camera's
closest centerline coordinate and increment $(s,\theta)$ bins whose associated
geometry lies within the distance and radial-band thresholds and projects
inside the image under $K$. For comparison to the oracle we apply the
normalization and thresholding rules in Table~\ref{tab:coverage_params_app}.
These scores capture geometric coverage (pose- and geometry-based visibility)
and do not directly measure mucosal visualization quality under blur, debris,
specularities, or fold occlusions.
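
One way to make the $\theta$ binning explicit, assuming that $\theta$ is
measured in the Bishop frame $(N_{1,m}, N_{2,m})$ at the nearest centerline
sample $C_m$ (Appendix~\ref{app:bishop}) and writing $p$ for the position of a
visible Gaussian (a symbol introduced only for this sketch):
\[
  \theta(p) = \operatorname{atan2}\bigl(
      \langle p - C_m, N_{2,m} \rangle,\,
      \langle p - C_m, N_{1,m} \rangle \bigr) \bmod 2\pi,
  \qquad
  j(p) = \left\lfloor \frac{72\,\theta(p)}{2\pi} \right\rfloor
  \in \{0,\dots,71\}.
\]
For example, $\theta = 357^\circ$ falls in bin $j = 71$, whose circular
neighbor is bin $0$, consistent with the wrap-around rule in
Table~\ref{tab:coverage_params_app}.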

\begin{table*}[t]
\centering
\caption{Per-sequence results on C3VD phantom data (ground-truth poses).}
\setlength{\tabcolsep}{4pt}
\begin{tabular}{lcrrrrr}
\toprule
\textbf{Method} & \textbf{Seq.} & \textbf{PSNR} $\uparrow$ & \textbf{SSIM} $\uparrow$ &
\textbf{FPS} $\uparrow$ & \textbf{CD} (mm) $\downarrow$ & \textbf{\# points} \\
\midrule
EndoGSLAM & v1 & 11.14 & 0.358 & 1.30 & 6.61 & 2\,579\,787 \\
          & v2 & 11.35 & 0.368 & 0.94 & 7.68 & 3\,306\,894 \\
          & v3 & 11.70 & 0.354 & 0.92 & 7.27 & 3\,305\,803 \\
          & v4 & 11.08 & 0.302 & 1.15 & 9.74 & 3\,261\,161 \\
\midrule
MonoGS    & v1 & 12.13 & 0.398 & 9.11 & 7.76 &   624\,033 \\
          & v2 & 10.86 & 0.307 & 8.26 & 7.18 &   520\,626 \\
          & v3 & 11.40 & 0.292 & 7.58 & 8.38 &   668\,002 \\
          & v4 & 10.64 & 0.282 & 7.84 & 8.30 &   545\,336 \\
\midrule
Ours      & v1 & 12.66 & 0.377 & 7.28 & 5.60 &   993\,965 \\
          & v2 & 10.93 & 0.327 & 6.24 & 4.99 & 1\,430\,093 \\
          & v3 & 11.98 & 0.379 & 6.69 & 6.35 &   959\,687 \\
          & v4 & 10.69 & 0.259 & 6.70 & 5.96 & 1\,163\,906 \\
\bottomrule
\end{tabular}
\label{tab:seq_results}
\end{table*}

\section{Per-sequence quantitative results}
\label{app:perseq}

Figure~\ref{fig:pc_all} shows qualitative reconstructions across all four
phantom sequences, complementing the single-sequence comparison in
Figure~\ref{fig:qualitative}. Table~\ref{tab:seq_results} reports per-sequence
PSNR, SSIM, FPS, Chamfer distance (CD), and the number of active Gaussians
(\# points) for all methods.


\begin{figure}[h]
    \centering
    \includegraphics[width=\linewidth]{figures/pc_all.pdf}
    \caption{Qualitative reconstructions on all four C3VD phantom sequences.
    Gaussians are colored by radial distance from the centerline. Across
    sequences, Gaussians are consistently concentrated in a thin band around
    the colon wall with few interior points, illustrating the effect of the
    tubular prior and centerline-aware keyframing.}
    \label{fig:pc_all}
\end{figure}



\section{Robustness to pose noise}
\label{app:qual_noise}

To illustrate centerline behavior under pose noise we add synthetic
perturbations to the ground-truth trajectories and recompute the centerline.
We test two noise regimes: high-frequency local jitter, which the B-spline
smoothing largely suppresses, and a slowly accumulating global bias, which
displaces the centerline accordingly since our method does not attempt to
correct systematic drift. Figure~\ref{fig:noisy_traj} shows representative
results. In practice the centerline acts as a low-pass filter over camera
motion and tolerates realistic tracker noise, but depends on a globally
reasonable pose estimate from the upstream tracking system.

\begin{figure}[h]
    \centering
    \includegraphics[width=\linewidth]{figures/noisy_traj.pdf}
    \caption{Effect of pose noise (N in mm) on the estimated centerline. Left:
    ground-truth trajectory and centerline. Middle: trajectory with
    high-frequency local jitter; the B-spline centerline remains smooth and
    close to the original. Right: trajectory with a slowly accumulating global
    bias; the centerline is displaced accordingly, as our method does not
    correct systematic drift.}
    \label{fig:noisy_traj}
\end{figure}

\section{Bishop frame construction}
\label{app:bishop}

For completeness we summarize the discrete Bishop-frame computation; the main
text gives only the high-level description.

Given sampled centerline points $\{C_m\}_{m=0}^{M}$ with arc-length parameters
$\{s_m\}$, we estimate unit tangents $T_m$ by centered finite differences
\[
T_m =
\frac{C_{m+1} - C_{m-1}}
     {\|C_{m+1} - C_{m-1}\|_2},
\qquad m=1,\dots,M-1,
\]
with one-sided differences at the endpoints. At $m=0$ we choose an initial
normal $N_{1,0}$ orthogonal to $T_0$ via
\[
  a_{\text{ref}} =
  \begin{cases}
    (0,1,0)^\top, & \text{if } |T_0 \cdot (0,0,1)^\top| > 0.9,\\[2pt]
    (0,0,1)^\top, & \text{otherwise,}
  \end{cases}
  \qquad
  N_{1,0} = \frac{T_0 \times a_{\text{ref}}}{\|T_0 \times a_{\text{ref}}\|_2},
  \quad
  N_{2,0} = \frac{T_0 \times N_{1,0}}{\|T_0 \times N_{1,0}\|_2}.
\]

For $m=1,\dots,M$ we propagate normals by discrete parallel transport. Let
\[
  a_m = T_{m-1} \times T_m, \qquad
  \theta_m = \operatorname{atan2}(\|a_m\|_2,\, T_{m-1} \cdot T_m),
\]
with unit axis $\hat a_m = a_m / \|a_m\|_2$ when $\|a_m\|_2 > \epsilon$
and $\hat a_m = T_m$ otherwise ($\epsilon = 10^{-6}$). We rotate $N_{1,m-1}$
by Rodrigues' formula
\[
  R(\hat a_m,\theta_m) =
  I + \sin\theta_m [\hat a_m]_\times
    + (1-\cos\theta_m) [\hat a_m]_\times^2,
\]
and set
\[
  N_{1,m} = R(\hat a_m,\theta_m)\, N_{1,m-1}, \qquad
  N_{2,m} = \frac{T_m \times N_{1,m}}{\|T_m \times N_{1,m}\|_2}.
\]
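
When $\|a_m\|_2 > \epsilon$, the rotation $R(\hat a_m,\theta_m)$ carries
$T_{m-1}$ onto $T_m$, so it maps any vector orthogonal to $T_{m-1}$ to a
vector orthogonal to $T_m$; this is why $N_{1,m} \perp T_m$ holds by
construction. As a quick sanity check with illustrative values (a deliberately
exaggerated $90^\circ$ bend, not taken from any sequence), let
$T_0 = (0,0,1)^\top$ and $T_1 = (0,1,0)^\top$. Since
$|T_0 \cdot (0,0,1)^\top| = 1 > 0.9$, we have $a_{\text{ref}} = (0,1,0)^\top$
and
\[
  N_{1,0} = T_0 \times a_{\text{ref}} = (-1,0,0)^\top,
  \qquad
  N_{2,0} = T_0 \times N_{1,0} = (0,-1,0)^\top .
\]
The transport step gives
\[
  a_1 = T_0 \times T_1 = (-1,0,0)^\top,
  \qquad
  \theta_1 = \operatorname{atan2}(1, 0) = \tfrac{\pi}{2} .
\]
Because $N_{1,0}$ is parallel to the axis $\hat a_1$, the rotation leaves it
unchanged, so
\[
  N_{1,1} = (-1,0,0)^\top,
  \qquad
  N_{2,1} = T_1 \times N_{1,1} = (0,0,1)^\top ,
\]
both of which are orthogonal to $T_1$, and
$\langle N_{1,1}, N_{1,0} \rangle = 1 > 0$.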

\paragraph{Sign-consistency check.}
To prevent sign flips that would invert the $\theta$ axis, we check
$\langle N_{1,m}, N_{1,m-1}\rangle$: if negative, we flip both normals
$(N_{1,m}, N_{2,m}) \leftarrow (-N_{1,m}, -N_{2,m})$.

This discrete Bishop frame satisfies orthonormality up to numerical precision
and minimizes twist along the curve, avoiding the instability of Frenet frames
in nearly straight segments. In practice we re-orthogonalize by a single
Gram--Schmidt step every few samples at negligible cost.
