\section*{Appendix Overview}

This appendix contains additional dataset characteristics (Appendix~\ref{appendix:data-details}), implementation details for anatomical labeling including the IPGN architecture schematic (Appendix~\ref{appendix:label-implementation}), segmentation training details (Appendix~\ref{appendix:seg-training}), clinical evaluation criteria (Appendix~\ref{appendix:clinical-evaluation}), and the complete set of quantitative and qualitative evaluation results (Appendix~\ref{chap:plainresults}). These materials are provided to support reproducibility and to complement the descriptions in Section~\ref{sec:methods}.




\section{Dataset Details and Limitations}
\label{appendix:data-details}

This appendix summarizes additional characteristics of the datasets used in PaSAL and their implications for generalizability.

\subsection*{Image formats and matrix sizes}

The public HiPaS and PTL datasets were provided as \texttt{.npz} volumes (with additional \texttt{.graphml} files for PTL graphs), while the in-house Amsterdam UMC cohort was available as \texttt{.nii.gz} (NIfTI) files. Most scans have matrix sizes of $N \times 512 \times 512$, where $N$ is the number of slices, with a smaller number at $N \times 718 \times 718$, $N \times 768 \times 768$, or $N \times 1024 \times 1024$. These matrix sizes reflect the stored image dimensions used for model training and inference and should not be interpreted as acquisition resolution.

\subsection*{Missing acquisition metadata}

Because all datasets were provided as NIfTI or NPZ files rather than DICOM, several acquisition parameters are unavailable:

\begin{itemize}
    \item Slice thickness and slice-to-slice spacing (increment) are not encoded, which prevents stratified analyses of model performance as a function of through-plane resolution.
    \item The scanning protocol (e.g.\ non-contrast CT vs.\ CTPA) is not recorded in the files, so we cannot quantify performance differences between contrast-enhanced and non-contrast scans.
\end{itemize}

For HiPaS, a separate metadata file with voxel spacing values was available and was used for reporting spacing distributions, but thickness and increment remain unknown.

\subsection*{Population, pathology, and class imbalance}

HiPaS consists primarily of scans from Chinese patients and includes cases with a range of pulmonary pathologies. PTL provides derived vascular and airway trees without demographic metadata. The in-house Amsterdam UMC cohort contains lung cancer patients scanned before and after radiotherapy. Across all datasets:

\begin{itemize}
    \item Vessel voxels form a very small fraction of the volume, leading to strong class imbalance between background and vessel classes.
    \item Pathology (e.g.\ pulmonary embolism, lung tumours, post-radiotherapy changes) is present to varying degrees.
\end{itemize}

These factors can bias quantitative metrics and should be kept in mind when interpreting results and assessing generalizability to new populations.


\section{Anatomical Labeling Implementation Details}
\label{appendix:label-implementation}

This appendix provides additional implementation details for the anatomical labeling component based on the IPGN framework~\cite{xie2025efficient}.

\begin{figure}[ht]
    \centering
    \includegraphics[width=0.9\linewidth]{Figures/Methodology/IPGN framework.png}
    \caption{Schematic overview of the Implicit Point-Graph Network (IPGN) architecture for pulmonary tree labeling, created by Xie et al.~\cite{xie2025efficient}. The network takes an extracted vessel segmentation (a), skeleton graph (b) and sampled point cloud (c) as input, encodes point and graph features, fuses them, and predicts anatomical labels at graph, point and voxel level.}
    \label{appendix:IPGN-framework}
\end{figure}

\subsection*{Graph extraction for labeling}
\label{appendix:label-preproc}

Vessel skeletons are computed from the dense segmentation volumes using the thinning and MST-style reconnection procedures described in Section~\ref{sec:meth_segmentation_short}. Node coordinates are stored in physical space (voxel centers), and edges are defined between skeleton neighbours, resulting in a connected vessel graph for each artery and vein tree. These graphs are exported in \texttt{.graphml} format and paired with the corresponding binary segmentations, forming the inputs required by the pre-trained IPGN models.


\subsection*{Label propagation to peripheral vessels}
\label{appendix:label-propagation}

As discussed in the main text, the anatomical labeling model is limited to the extent of the original PTL segmentations (Level~$[A^3, V^3]$). Our extended vessel masks (Level~$[A^4, V^4]$) therefore contain unlabeled distal regions that are not directly covered by IPGN predictions.

To obtain fully labeled vascular trees for qualitative analysis, we applied a marker-based watershed algorithm over the Level~4 segmentations. IPGN voxel-level predictions at Level~3 served as seed markers, and labels were propagated to the remaining vessel voxels following the underlying distance transform. In practice, this produces anatomically plausible labels for most peripheral branches.

Because no ground truth labels are available for these distal regions, the propagated labels are used \emph{only} for visualization and qualitative inspection. They are not used for supervised training and are excluded from all quantitative anatomical labeling metrics reported in the Results section.
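
\noindent\textit{Sketch of the propagation rule.} The watershed step described above can be read, in idealized form, as a minimax-path assignment. This is a simplified description under stated assumptions rather than a specification of the implementation: the symbols $M$, $S_k$, $f$, and $\Pi_M$ are introduced here for illustration only, and the exact form of the relief $f$ (for example the sign or normalization of the distance transform) is assumed. Let $M$ be the set of Level~4 vessel voxels, $S_k \subset M$ the Level~3 voxels that IPGN assigns to class $k$, and $f$ the relief derived from the distance transform. Up to tie-breaking on plateaus, a flooding-based marker watershed assigns each unlabeled voxel $v \in M$ the label
\[
  \hat{\ell}(v) \;=\; \arg\min_{k}\; \min_{\pi \in \Pi_M(S_k, v)}\; \max_{u \in \pi} f(u),
\]
where $\Pi_M(S_k, v)$ denotes the set of voxel paths inside $M$ that connect a seed in $S_k$ to $v$. If the flooding is restricted to $M$ in this way, labels spread along connected vessel voxels rather than across background between neighbouring branches.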


\section{Segmentation Training Details}
\label{appendix:seg-training}

All nnU-Net models were trained in 3D full-resolution mode using the default
nnU-Net training pipeline~\cite{isensee2021nnu}. Key settings were:

\begin{itemize}
    \item \textbf{Input configuration:} Multi-channel inputs as described in Section~\ref{sec:meth_segmentation_short} (CT, Frangi vesselness, and previous-level artery/vein predictions where applicable).
    \item \textbf{Network and loss:} 3D full-resolution nnU-Net with the default combined Dice + cross-entropy loss.
    \item \textbf{Optimization:} Default nnU-Net optimizer and learning-rate schedule (stochastic gradient descent with Nesterov momentum and polynomial learning-rate decay).
    \item \textbf{Data augmentation:} nnU-Net's standard on-the-fly augmentations, including rotations, scaling, intensity transformations, Gaussian noise, blurring, and simulated low resolution.
    \item \textbf{Training regime:} Training performed separately for arteries and veins at each hierarchical level; only fold~0 was trained (instead of full 5-fold cross-validation) due to computational constraints, and the best checkpoint was selected based on validation loss.
\end{itemize}

No manual hyperparameter tuning beyond nnU-Net defaults was applied, apart from enabling multi-channel input for the salience-transmission setup.
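
\noindent For orientation, the default objective and schedule referred to above can be written as follows. This is a sketch based on the publicly documented nnU-Net defaults rather than on settings restated in this appendix; the symbols $\eta_0$, $T$, $C$, $p_{c,i}$, and $g_{c,i}$ are notation introduced here, and the quoted default values should be checked against the installed nnU-Net version. Up to smoothing terms and the treatment of the background class, the loss is the unweighted sum
\[
  \mathcal{L} \;=\; \mathcal{L}_{\mathrm{CE}} + \mathcal{L}_{\mathrm{Dice}},
  \qquad
  \mathcal{L}_{\mathrm{Dice}} \;=\; -\frac{1}{|C|} \sum_{c \in C} \frac{2 \sum_i p_{c,i}\, g_{c,i}}{\sum_i p_{c,i} + \sum_i g_{c,i}},
\]
where $p_{c,i}$ is the softmax probability of class $c$ at voxel $i$, $g_{c,i}$ is the one-hot reference label, and $C$ is the set of classes entering the Dice term. The polynomial decay sets the learning rate at epoch $t$ to
\[
  \eta_t \;=\; \eta_0 \left(1 - \frac{t}{T}\right)^{0.9},
\]
with the defaults commonly reported as $\eta_0 = 0.01$, $T = 1000$ epochs, and a Nesterov momentum of $0.99$.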


\clearpage

\section{Clinical Evaluation Criteria}
\label{appendix:clinical-evaluation}

\subsection*{Instructions for Clinical Expert - Segmentation}

For each CT scan, please evaluate the segmentation output using the following three categories. Assign a score from 0 to 5 for each, where:

\begin{itemize}
    \item 0 = Very poor
    \item 1 = Poor
    \item 2 = Fair
    \item 3 = Good
    \item 4 = Excellent
    \item 5 = Flawless
\end{itemize}


\subsubsection*{Evaluation Criteria}

\noindent\textbf{1. Segmentation Accuracy and Robustness}
\begin{itemize}
    \item Are major vessels correctly segmented?
    \item Are there errors such as false splits or missing vessels?
    \item Is the segmentation reliable throughout the scan?
\end{itemize}

\noindent\textbf{2. Vessel Branch Abundance}
\begin{itemize}
    \item Are sufficient peripheral branches captured?
    \item Is the segmentation too conservative or excessively noisy?
    \item Does it reflect the expected vascular complexity?
\end{itemize}

\noindent\textbf{3. Diagnostic Assistance}
\begin{itemize}
    \item Could the segmentation help in diagnostic tasks (e.g., treatment planning)?
    \item Does it provide meaningful anatomical insight?
    \item Would it save time or effort in clinical workflows?
\end{itemize}

\subsubsection*{Form to Fill Out}
\begin{center}
\small
\begin{tabular}{lccc}
\toprule
\textbf{Scan ID} &
\textbf{Seg. accuracy \& robustness} &
\textbf{Branch abundance} &
\textbf{Diagnostic assistance} \\
\midrule
\_\_\_\_\_ & \_\_\_\_ / 5 & \_\_\_\_ / 5 & \_\_\_\_ / 5 \\
\bottomrule
\end{tabular}
\end{center}



\vspace{1cm}

\subsection*{Instructions for Clinical Expert - Anatomical Labeling}

Same instructions and scoring scale as above.

\subsubsection*{Evaluation Criteria}

\noindent\textbf{1. Label Consistency Across Branches}
\begin{itemize}
    \item Are connected vessel branches labeled coherently without unexpected label switches?
    \item Is anatomical continuity respected throughout bifurcations?
    \item Do labels remain stable across visually continuous regions?
\end{itemize}

\noindent\textbf{2. Correctness of Proximal vs.\ Distal Labeling}
\begin{itemize}
    \item Are central (proximal) vessels labeled distinctly from peripheral (distal) ones?
    \item Does the labeling follow expected anatomical hierarchies (e.g., lobar, segmental vessels)?
    \item Are abrupt or incorrect zone transitions avoided?
\end{itemize}

\noindent\textbf{3. Usefulness for Clinical Interpretation}
\begin{itemize}
    \item Does the labeling facilitate interpretation of anatomical regions?
    \item Could it assist in identifying perfusion territories, surgical zones, or radiation targets?
    \item Would this labeling support clinical tasks such as reporting, navigation, or planning?
\end{itemize}

\subsubsection*{Form to Fill Out}
\begin{center}
\small
\begin{tabular}{lccc}
\toprule
\textbf{Scan ID} &
\textbf{Label consistency} &
\textbf{Proximal vs.\ distal} &
\textbf{Clin. interpretability} \\
\midrule
\_\_\_\_\_ & \_\_\_\_ / 5 & \_\_\_\_ / 5 & \_\_\_\_ / 5 \\
\bottomrule
\end{tabular}
\end{center}



\vspace{1cm}
\newpage


\subsection*{Instructions for Clinical Expert - Full Pipeline}

Same instructions and scoring scale as above.

\subsubsection*{Evaluation Criteria}

\noindent\textbf{1. Anatomical Completeness and Accuracy}
\begin{itemize}
    \item Are both large and small vessel branches well represented and correctly segmented?
    \item Do the anatomical labels match expected vascular structures and hierarchies?
    \item Is there good alignment between the segmentation and the anatomical labeling?
\end{itemize}

\noindent\textbf{2. Consistency and Plausibility of Labeling}
\begin{itemize}
    \item Are labeled vessels anatomically consistent throughout the scan (e.g., no abrupt label changes)?
    \item Is labeling coherent across bifurcations and throughout vascular trees?
    \item Do labels correspond to known anatomical territories (e.g., lobes, segments)?
\end{itemize}

\noindent\textbf{3. Clinical Utility}
\begin{itemize}
    \item Could this combined output assist in clinical tasks such as treatment planning, surgical preparation, or diagnostic interpretation?
    \item Does the visual output support clinical reasoning and decision-making?
    \item Would it save time, reduce effort, or add value in a clinical workflow?
\end{itemize}

\subsubsection*{Form to Fill Out}
\begin{center}
\small
\begin{tabular}{lccc}
\toprule
\textbf{Scan ID} &
\textbf{Anat. completeness} &
\textbf{Label plausibility} &
\textbf{Clinical utility} \\
\midrule
\_\_\_\_\_ & \_\_\_\_ / 5 & \_\_\_\_ / 5 & \_\_\_\_ / 5 \\
\bottomrule
\end{tabular}
\end{center}


\clearpage

\section{Detailed Evaluation Results} \label{chap:plainresults}

This appendix provides the complete set of raw evaluation results corresponding to the experiments presented in Section~\ref{chap:results}. The tables are organized in the same order as the main Results section and include both quantitative and qualitative assessments. These data are included for transparency and reproducibility, and to allow other researchers to recreate or further analyze the presented plots.

% Segmentation
\subsection{Segmentation Results}

\subsubsection{Quantitative Evaluation (HiPaS Test Set)}

The per-subject quantitative segmentation metrics for the HiPaS test set are reported in Table~\ref{tab:seg-hipas-quant}.

\begin{table}
\centering
\caption{Per-subject quantitative segmentation metrics on the HiPaS test set.}
\label{tab:seg-hipas-quant}
\resizebox{\textwidth}{!}{%
\begin{tabular}{lrrrrrrrrrrrr}
\toprule
Subject & Dice (Artery) & HD95 (Artery) & Hausdorff (Artery) & Jaccard (Artery) & Precision (Artery) & Sensitivity (Artery) & Dice (Vein) & HD95 (Vein) & Hausdorff (Vein) & Jaccard (Vein) & Precision (Vein) & Sensitivity (Vein) \\
\midrule
007 & 0.92 & 1.41 & 31.59 & 0.85 & 0.89 & 0.95 & 0.89 & 5.83 & 33.00 & 0.81 & 0.85 & 0.94 \\
027 & 0.90 & 1.41 & 22.38 & 0.82 & 0.89 & 0.91 & 0.90 & 3.32 & 35.13 & 0.82 & 0.90 & 0.91 \\
029 & 0.89 & 1.00 & 22.00 & 0.80 & 0.83 & 0.96 & 0.91 & 2.24 & 46.05 & 0.84 & 0.89 & 0.94 \\
036 & 0.88 & 1.41 & 35.74 & 0.79 & 0.82 & 0.95 & 0.90 & 25.08 & 76.07 & 0.82 & 0.89 & 0.91 \\
058 & 0.94 & 1.00 & 95.68 & 0.88 & 0.94 & 0.93 & 0.91 & 1.41 & 84.08 & 0.83 & 0.93 & 0.89 \\
063 & 0.92 & 1.00 & 36.19 & 0.86 & 0.91 & 0.93 & 0.89 & 3.61 & 78.55 & 0.81 & 0.88 & 0.90 \\
071 & 0.93 & 1.41 & 104.93 & 0.87 & 0.94 & 0.92 & 0.87 & 8.19 & 110.66 & 0.77 & 0.82 & 0.93 \\
164 & 0.89 & 6.00 & 288.38 & 0.81 & 0.91 & 0.87 & 0.89 & 1.41 & 25.63 & 0.81 & 0.94 & 0.85 \\
174 & 0.89 & 8.49 & 45.22 & 0.81 & 0.89 & 0.90 & 0.90 & 6.40 & 192.15 & 0.82 & 0.93 & 0.87 \\
189 & 0.88 & 11.18 & 221.41 & 0.79 & 0.88 & 0.89 & 0.86 & 11.22 & 41.00 & 0.75 & 0.83 & 0.88 \\
190 & 0.87 & 4.12 & 74.05 & 0.77 & 0.81 & 0.94 & 0.82 & 5.66 & 95.12 & 0.70 & 0.73 & 0.94 \\
247 & 0.88 & 5.00 & 31.83 & 0.79 & 0.90 & 0.86 & 0.88 & 6.93 & 30.48 & 0.79 & 0.93 & 0.84 \\
\bottomrule
\end{tabular}
}
\end{table}
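
\noindent For reference, the overlap metrics in Table~\ref{tab:seg-hipas-quant} are assumed to follow their standard definitions; the notation below ($P$, $G$, TP, FP, FN) is introduced here and is not taken from the evaluation code. With $P$ the predicted and $G$ the reference voxel set for one vessel class, let $\mathrm{TP} = |P \cap G|$, $\mathrm{FP} = |P \setminus G|$, and $\mathrm{FN} = |G \setminus P|$. Then
\[
  \mathrm{Dice} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}},
  \qquad
  \mathrm{Jaccard} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}},
\]
\[
  \mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}},
  \qquad
  \mathrm{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}.
\]
Under these definitions, $\mathrm{Jaccard} = \mathrm{Dice} / (2 - \mathrm{Dice})$, and Dice is the harmonic mean of precision and sensitivity. As a consistency check on the first row (subject 007, artery), $0.92 / (2 - 0.92) \approx 0.85$ and $2 \cdot 0.89 \cdot 0.95 / (0.89 + 0.95) \approx 0.92$, both in agreement with the tabulated values. HD95 is commonly defined by replacing the maximum in the Hausdorff distance with the 95th percentile of the surface distances, which makes it less sensitive to isolated outlier voxels; this would be consistent with the large gap between the two columns for some subjects (e.g.\ subject 164, artery: 6.00 vs.\ 288.38).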



\subsubsection{Qualitative Evaluation (HiPaS Test Set)}

The per-subject qualitative expert ratings are summarized in Table~\ref{tab:seg-hipas-qual}, which covers 9 of the 12 HiPaS test scans listed in Table~\ref{tab:seg-hipas-quant}.


\begin{table}
\centering
\caption{Per-subject qualitative segmentation scores on the HiPaS test set.}
\label{tab:seg-hipas-qual}
\resizebox{\textwidth}{!}{%
\begin{tabular}{lllllllll}
\toprule
Scan & Diagnostic Assistance (Artery) & Mean Score (Artery) & Segmentation Accuracy and Robustness (Artery) & Vessel Branch Abundance (Artery) & Diagnostic Assistance (Vein) & Mean Score (Vein) & Segmentation Accuracy and Robustness (Vein) & Vessel Branch Abundance (Vein) \\
\midrule
007 & 4 & 4.33 & 5 & 4 & 5 & 4.33 & 4 & 4 \\
027 & 4 & 3.67 & 3 & 4 & 4 & 3.67 & 3 & 4 \\
029 & 5 & 4.67 & 4 & 5 & 5 & 4.67 & 4 & 5 \\
036 & 4 & 3.67 & 3 & 4 & 4 & 3.33 & 3 & 3 \\
058 & 4 & 3.67 & 3 & 4 & 3 & 3.33 & 3 & 4 \\
063 & 3 & 3.33 & 4 & 3 & 3 & 3.33 & 4 & 3 \\
071 & 4 & 3.67 & 3 & 4 & 4 & 3.67 & 3 & 4 \\
164 & 4 & 4.00 & 4 & 4 & 4 & 4.00 & 4 & 4 \\
174 & 4 & 4.00 & 4 & 4 & 4 & 4.00 & 4 & 4 \\
\bottomrule
\end{tabular}
}
\end{table}
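
\noindent The Mean Score columns in Table~\ref{tab:seg-hipas-qual} are consistent with the unweighted average of the three category scores per scan and structure. Writing $s_1, s_2, s_3$ (notation introduced here) for the scores in the three categories,
\[
  \bar{s} \;=\; \frac{s_1 + s_2 + s_3}{3},
\]
so that, for example, the artery scores of scan 007 give $(5 + 4 + 4)/3 \approx 4.33$.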

\subsubsection{Comparison of Quantitative and Qualitative Evaluation}

The correlation between quantitative segmentation metrics and qualitative expert scores on the HiPaS test set is given in Table~\ref{tab:seg-hipas-corr}.


\begin{table}
\centering
\caption{Spearman correlation between quantitative segmentation metrics and qualitative expert scores on the HiPaS test set ($n = 9$ scans with expert ratings).}
\label{tab:seg-hipas-corr}
\resizebox{\textwidth}{!}{%
\begin{tabular}{lllrl}
\toprule
Metric & Structure & Expert category & Correlation & $p$-value \\
\midrule
Dice & Artery & Segmentation Accuracy and Robustness & -0.17 & 6.55e-01 \\
Dice & Artery & Vessel Branch Abundance & -0.46 & 2.17e-01 \\
Dice & Artery & Diagnostic Assistance & -0.46 & 2.17e-01 \\
Dice & Artery & Mean Score & -0.45 & 2.19e-01 \\
Dice & Vein & Segmentation Accuracy and Robustness & -0.09 & 8.25e-01 \\
Dice & Vein & Vessel Branch Abundance & 0.52 & 1.53e-01 \\
Dice & Vein & Diagnostic Assistance & 0.19 & 6.18e-01 \\
Dice & Vein & Mean Score & 0.21 & 5.81e-01 \\
Sensitivity & Artery & Segmentation Accuracy and Robustness & 0.14 & 7.25e-01 \\
Sensitivity & Artery & Vessel Branch Abundance & 0.37 & 3.34e-01 \\
Sensitivity & Artery & Diagnostic Assistance & 0.37 & 3.34e-01 \\
Sensitivity & Artery & Mean Score & 0.25 & 5.10e-01 \\
Sensitivity & Vein & Segmentation Accuracy and Robustness & -0.09 & 8.25e-01 \\
Sensitivity & Vein & Vessel Branch Abundance & 0.21 & 5.89e-01 \\
Sensitivity & Vein & Diagnostic Assistance & 0.65 & 6.04e-02 \\
Sensitivity & Vein & Mean Score & 0.32 & 4.07e-01 \\
Precision & Artery & Segmentation Accuracy and Robustness & -0.31 & 4.16e-01 \\
Precision & Artery & Vessel Branch Abundance & -0.37 & 3.34e-01 \\
Precision & Artery & Diagnostic Assistance & -0.37 & 3.34e-01 \\
Precision & Artery & Mean Score & -0.44 & 2.39e-01 \\
Precision & Vein & Segmentation Accuracy and Robustness & 0.09 & 8.25e-01 \\
Precision & Vein & Vessel Branch Abundance & 0.09 & 8.19e-01 \\
Precision & Vein & Diagnostic Assistance & -0.26 & 5.02e-01 \\
Precision & Vein & Mean Score & -0.02 & 9.65e-01 \\
HD95 & Artery & Segmentation Accuracy and Robustness & -0.14 & 7.18e-01 \\
HD95 & Artery & Vessel Branch Abundance & 0.00 & 1.00e+00 \\
HD95 & Artery & Diagnostic Assistance & 0.00 & 1.00e+00 \\
HD95 & Artery & Mean Score & -0.27 & 4.91e-01 \\
HD95 & Vein & Segmentation Accuracy and Robustness & 0.22 & 5.74e-01 \\
HD95 & Vein & Vessel Branch Abundance & 0.46 & 2.13e-01 \\
HD95 & Vein & Diagnostic Assistance & -0.16 & 6.77e-01 \\
HD95 & Vein & Mean Score & 0.17 & 6.67e-01 \\
\bottomrule
\end{tabular}
}
\end{table}

The same relationships are visualized in Fig.~\ref{fig:segmentation_correlation_appendix} and Fig.~\ref{fig:segm_scatter_vs_expert_appendix}.
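
\noindent The figure captions in this appendix report Spearman correlations; assuming Table~\ref{tab:seg-hipas-corr} uses the same coefficient, each entry can be read as the Pearson correlation of the rank-transformed values. With $r_i$ and $q_i$ (notation introduced here) denoting the ranks of the metric and of the expert score for scan $i$, and tied values receiving their average rank (an assumption about the implementation),
\[
  \rho \;=\; \frac{\sum_{i=1}^{n} (r_i - \bar{r})(q_i - \bar{q})}{\sqrt{\sum_{i=1}^{n} (r_i - \bar{r})^2}\,\sqrt{\sum_{i=1}^{n} (q_i - \bar{q})^2}} .
\]
The shortcut $\rho = 1 - 6 \sum_i d_i^2 / \bigl(n (n^2 - 1)\bigr)$ with $d_i = r_i - q_i$ is exact only in the absence of ties and is therefore at best approximate here, because the integer expert scores contain many ties. With only $n = 9$ scans, individual coefficients carry wide uncertainty, and none of the tabulated $p$-values falls below 0.05.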

\begin{figure}[ht]
\centering
\includegraphics[width=\linewidth]{Figures/Results/segmentation_correlation_heatmap.pdf}
\caption{Spearman correlation between quantitative segmentation metrics and expert-assigned scores across the 9 overlapping HiPaS scans, shown separately for arteries (left) and veins (right).}
\label{fig:segmentation_correlation_appendix}
\end{figure}

\begin{figure}[ht]
  \centering
  \includegraphics[width=\linewidth]{Figures/Results/segmentation_relationships_mean.pdf}
  \caption{Per-scan relationships between segmentation metrics and expert mean score for arteries and veins. Left: Dice vs.\ expert mean. Right: HD95 vs.\ expert mean (lower is better).}
  \label{fig:segm_scatter_vs_expert_appendix}
\end{figure}


\subsection{Anatomical Labeling Results}

\subsubsection{Quantitative Evaluation (PTL Test Set)}

Per-subject anatomical labeling accuracy on the PTL test set at the edge, node, and voxel levels is reported in Table~\ref{tab:ptl-quant}.


{\scriptsize
\begin{longtable}{lrrrrrr}
\caption{Per-subject anatomical labeling accuracy on the PTL test set at edge, node, and voxel levels. NaN marks entries for which no value is available.}
\label{tab:ptl-quant} \\
\toprule
\textbf{Subject} & Edge (Artery) & Node (Artery) & Voxel (Artery) & Edge (Vein) & Node (Vein) & Voxel (Vein) \\
\midrule
\endfirsthead

\toprule
\textbf{Subject} & Edge (Artery) & Node (Artery) & Voxel (Artery) & Edge (Vein) & Node (Vein) & Voxel (Vein) \\
\midrule
\endhead

\midrule
\multicolumn{7}{r}{\textit{Continued on next page}} \\
\midrule
\endfoot

\bottomrule
\endlastfoot
00007 & 0.95 & 0.98 & 0.89 & 0.81 & 0.94 & 0.83 \\
00016 & 0.89 & 0.96 & 0.87 & 0.78 & 0.92 & 0.80 \\
00019 & 0.87 & 0.97 & 0.86 & 0.80 & 0.95 & 0.83 \\
00020 & 0.76 & 0.96 & 0.79 & 0.72 & 0.92 & 0.80 \\
00022 & 0.94 & 0.98 & 0.90 & 0.84 & 0.94 & 0.83 \\
00044 & 0.96 & 0.99 & 0.93 & 0.74 & 0.93 & 0.80 \\
00054 & 0.95 & 0.99 & 0.92 & 0.83 & 0.97 & 0.86 \\
00055 & 0.93 & 0.99 & 0.90 & 0.71 & 0.96 & 0.81 \\
00078 & 0.88 & 0.98 & 0.89 & 0.86 & 0.98 & 0.85 \\
00117 & 0.95 & 0.99 & 0.94 & 0.83 & 0.96 & 0.80 \\
00125 & 0.85 & 0.97 & 0.86 & 0.80 & 0.94 & 0.83 \\
00133 & 0.86 & 0.96 & 0.90 & 0.70 & 0.94 & 0.82 \\
00138 & 0.93 & 0.99 & 0.90 & 0.79 & 0.95 & 0.84 \\
00144 & 0.90 & 0.98 & 0.90 & 0.83 & 0.95 & 0.85 \\
00153 & 0.96 & 0.99 & 0.93 & 0.85 & 0.95 & 0.85 \\
00154 & 0.87 & 0.98 & 0.88 & 0.77 & 0.94 & 0.83 \\
00162 & 0.86 & 0.99 & 0.87 & 0.85 & 0.99 & 0.86 \\
00163 & 0.86 & 0.98 & 0.86 & 0.80 & 0.97 & 0.85 \\
00164 & 0.92 & 0.99 & 0.90 & 0.77 & 0.95 & 0.80 \\
00169 & 0.89 & 0.99 & 0.88 & 0.73 & 0.95 & 0.80 \\
00172 & 0.91 & 1.00 & 0.93 & 0.81 & 0.98 & 0.84 \\
00173 & 0.85 & 0.98 & 0.83 & 0.83 & 0.97 & 0.85 \\
00176 & 0.91 & 0.99 & 0.91 & 0.80 & 0.98 & 0.86 \\
00177 & 0.85 & 0.98 & 0.88 & 0.78 & 0.97 & 0.85 \\
00179 & 0.85 & 0.97 & 0.88 & 0.83 & 0.98 & 0.87 \\
00189 & 0.91 & 0.99 & 0.89 & 0.80 & 0.97 & 0.84 \\
00192 & 0.97 & 1.00 & 0.93 & 0.77 & 0.97 & 0.86 \\
00198 & 0.84 & 0.98 & 0.87 & 0.69 & 0.96 & 0.80 \\
00206 & 0.93 & 0.98 & 0.91 & 0.78 & 0.92 & 0.77 \\
00239 & 0.77 & 0.95 & 0.84 & 0.72 & 0.93 & 0.77 \\
00247 & 0.90 & 0.97 & 0.91 & 0.72 & 0.85 & 0.79 \\
00256 & 0.91 & 0.99 & 0.93 & 0.75 & 0.96 & 0.84 \\
00258 & 0.94 & 1.00 & 0.90 & 0.82 & 0.98 & 0.84 \\
00297 & 0.95 & 1.00 & 0.93 & 0.78 & 0.97 & 0.84 \\
00361 & 0.95 & 1.00 & 0.93 & 0.76 & 0.95 & 0.85 \\
00364 & 0.78 & 0.96 & 0.82 & 0.69 & 0.95 & 0.78 \\
00368 & 0.85 & 0.98 & 0.86 & 0.75 & 0.95 & 0.81 \\
00373 & 0.89 & 0.98 & 0.88 & 0.78 & 0.97 & 0.84 \\
00382 & 0.90 & 0.99 & 0.91 & 0.82 & 0.98 & 0.85 \\
00396 & 0.91 & 0.99 & 0.90 & 0.79 & 0.97 & 0.86 \\
00407 & 0.91 & 0.99 & 0.90 & 0.79 & 0.96 & 0.85 \\
00416 & 0.94 & 0.99 & 0.93 & 0.81 & 0.97 & 0.86 \\
00431 & 0.96 & 1.00 & 0.93 & 0.83 & 0.98 & 0.88 \\
00506 & 0.94 & 0.99 & 0.91 & 0.75 & 0.94 & 0.83 \\
00560 & 0.94 & 0.99 & 0.93 & 0.83 & 0.96 & 0.87 \\
00561 & 0.89 & 0.98 & 0.89 & 0.75 & 0.96 & 0.81 \\
00570 & 0.91 & 0.98 & 0.91 & 0.80 & 0.96 & 0.86 \\
00571 & 0.95 & 0.99 & 0.91 & 0.83 & 0.98 & 0.86 \\
00575 & 0.95 & 1.00 & 0.92 & 0.75 & 0.96 & 0.82 \\
00583 & 0.95 & 0.99 & 0.91 & 0.83 & 0.96 & 0.87 \\
00589 & 0.94 & 1.00 & 0.93 & 0.74 & 0.97 & 0.82 \\
00600 & 0.98 & 1.00 & 0.94 & 0.83 & 0.96 & 0.87 \\
00702 & 0.86 & 0.99 & 0.88 & 0.84 & 0.98 & 0.87 \\
00717 & 0.96 & 1.00 & 0.93 & 0.84 & 0.97 & 0.85 \\
00729 & 0.93 & 0.99 & 0.91 & 0.79 & 0.96 & 0.83 \\
00731 & 0.94 & 1.00 & 0.92 & 0.78 & 0.96 & 0.83 \\
00732 & 0.95 & 0.99 & 0.93 & 0.84 & 0.96 & 0.86 \\
00733 & 0.94 & 1.00 & 0.89 & 0.81 & 0.97 & 0.85 \\
00742 & 0.86 & 0.98 & 0.90 & 0.75 & 0.96 & 0.82 \\
00758 & 0.93 & 0.98 & 0.91 & 0.79 & 0.86 & 0.82 \\
00773 & 0.96 & 0.99 & 0.92 & 0.83 & 0.92 & 0.83 \\
00784 & 0.79 & 0.86 & 0.77 & 0.75 & 0.83 & 0.78 \\
00796 & 0.82 & 0.95 & 0.82 & 0.76 & 0.94 & 0.79 \\
00797 & 0.94 & 1.00 & 0.91 & 0.79 & 0.96 & 0.82 \\
00799 & 0.97 & 0.98 & 0.94 & 0.84 & 0.89 & 0.83 \\
00803 & 0.91 & 0.96 & 0.88 & NaN & NaN & NaN \\
00804 & 0.94 & 0.98 & 0.91 & 0.80 & 0.87 & 0.82 \\
00819 & 0.95 & 0.99 & 0.93 & 0.81 & 0.96 & 0.84 \\
00821 & 0.90 & 0.97 & 0.89 & 0.73 & 0.92 & 0.78 \\
00823 & 0.95 & 1.00 & 0.94 & 0.84 & 0.97 & 0.87 \\
00830 & 0.94 & 0.99 & 0.93 & 0.82 & 0.97 & 0.86 \\
00850 & 0.85 & 0.97 & 0.87 & 0.81 & 0.96 & 0.85 \\
00854 & 0.92 & 0.99 & 0.91 & 0.82 & 0.96 & 0.87 \\
00859 & 0.89 & 0.98 & 0.88 & 0.75 & 0.94 & 0.81 \\
00863 & 0.93 & 0.99 & 0.93 & 0.72 & 0.93 & 0.77 \\
00866 & 0.89 & 0.98 & 0.88 & 0.81 & 0.96 & 0.84 \\
00874 & 0.95 & 0.99 & 0.94 & 0.79 & 0.95 & 0.85 \\
00882 & 0.94 & 0.99 & 0.92 & 0.81 & 0.94 & 0.87 \\
00884 & 0.91 & 0.97 & 0.90 & 0.82 & 0.95 & 0.85 \\
00890 & 0.86 & 0.97 & 0.86 & 0.78 & 0.91 & 0.80 \\
00891 & 0.91 & 0.97 & 0.89 & 0.73 & 0.92 & 0.82 \\
00892 & 0.96 & 1.00 & 0.92 & 0.78 & 0.94 & 0.80 \\
00907 & 0.91 & 0.98 & 0.90 & 0.75 & 0.94 & 0.81 \\
00912 & 0.93 & 1.00 & 0.92 & 0.86 & 0.98 & 0.88 \\
00935 & 0.93 & 0.99 & 0.91 & 0.81 & 0.95 & 0.85 \\
00936 & 0.96 & 1.00 & 0.94 & 0.86 & 0.98 & 0.86 \\
00943 & 0.90 & 0.98 & 0.89 & 0.71 & 0.94 & 0.79 \\
00952 & 0.92 & 0.99 & 0.89 & 0.81 & 0.96 & 0.88 \\
00957 & 0.92 & 0.98 & 0.92 & 0.79 & 0.95 & 0.84 \\
00959 & 0.87 & 0.96 & 0.87 & 0.72 & 0.89 & 0.78 \\
00982 & 0.69 & 0.89 & 0.77 & 0.62 & 0.86 & 0.67 \\
00983 & 0.89 & 0.96 & 0.87 & 0.81 & 0.95 & 0.86 \\
01114 & 0.93 & 1.00 & 0.89 & 0.81 & 0.97 & 0.83 \\
01116 & 0.85 & 0.97 & 0.81 & 0.77 & 0.95 & 0.81 \\
01131 & 0.92 & 0.99 & 0.89 & 0.77 & 0.97 & 0.82 \\
01133 & 0.91 & 0.99 & 0.89 & 0.75 & 0.96 & 0.81 \\
01138 & 0.95 & 1.00 & 0.92 & 0.80 & 0.97 & 0.86 \\
01180 & 0.71 & 0.88 & 0.71 & 0.71 & 0.91 & 0.70 \\
01191 & 0.89 & 0.99 & 0.89 & 0.82 & 0.97 & 0.86 \\
01193 & 0.84 & 0.98 & 0.87 & 0.79 & 0.96 & 0.85 \\
01207 & 0.87 & 0.98 & 0.85 & 0.79 & 0.96 & 0.84 \\
01231 & 0.88 & 0.97 & 0.88 & 0.73 & 0.91 & 0.77 \\
01258 & 0.91 & 0.99 & 0.89 & 0.76 & 0.96 & 0.82 \\
01282 & 0.88 & 0.99 & 0.87 & 0.81 & 0.97 & 0.82 \\
01324 & 0.89 & 0.95 & 0.84 & 0.79 & 0.91 & 0.78 \\
01365 & 0.96 & 1.00 & 0.93 & 0.86 & 0.97 & 0.86 \\
01373 & 0.96 & 0.99 & 0.94 & 0.85 & 0.94 & 0.88 \\
01374 & 0.97 & 1.00 & 0.93 & 0.86 & 0.97 & 0.84 \\
01381 & 0.95 & 0.98 & 0.92 & 0.80 & 0.88 & 0.85 \\
01382 & 0.96 & 0.99 & 0.94 & 0.79 & 0.93 & 0.81 \\
01390 & 0.95 & 0.98 & 0.92 & 0.79 & 0.92 & 0.81 \\
01392 & 0.83 & 0.93 & 0.83 & 0.76 & 0.92 & 0.81 \\
01395 & 0.95 & 0.99 & 0.94 & 0.82 & 0.97 & 0.84 \\
01475 & 0.92 & 0.99 & 0.92 & 0.74 & 0.96 & 0.85 \\
01486 & 0.91 & 0.98 & 0.90 & 0.82 & 0.97 & 0.85 \\
01502 & 0.85 & 0.94 & 0.84 & 0.79 & 0.89 & 0.84 \\
01505 & 0.96 & 1.00 & 0.93 & 0.86 & 0.98 & 0.87 \\
01506 & 0.96 & 0.99 & 0.93 & 0.77 & 0.91 & 0.83 \\
01515 & 0.97 & 1.00 & 0.92 & 0.84 & 0.98 & 0.85 \\
01516 & 0.92 & 0.98 & 0.91 & 0.84 & 0.95 & 0.84 \\
01518 & 0.92 & 0.99 & 0.91 & 0.81 & 0.97 & 0.88 \\
01534 & 0.96 & 0.99 & 0.92 & 0.77 & 0.91 & 0.80 \\
01535 & 0.92 & 0.99 & 0.89 & 0.78 & 0.97 & 0.84 \\
01601 & 0.82 & 0.92 & 0.87 & 0.76 & 0.91 & 0.82 \\
01602 & 0.93 & 0.99 & 0.88 & 0.86 & 0.98 & 0.85 \\
01610 & 0.93 & 0.97 & 0.92 & 0.77 & 0.90 & 0.77 \\
01624 & 0.96 & 0.99 & 0.92 & 0.81 & 0.92 & 0.83 \\
01627 & 0.87 & 0.98 & 0.89 & 0.67 & 0.94 & 0.81 \\
01640 & 0.96 & 0.99 & 0.93 & 0.84 & 0.92 & 0.85 \\
01659 & 0.97 & 1.00 & 0.93 & 0.82 & 0.95 & 0.86 \\
01667 & 0.95 & 0.99 & 0.92 & 0.81 & 0.95 & 0.85 \\
01679 & 0.96 & 0.99 & 0.92 & 0.86 & 0.95 & 0.86 \\
01699 & 0.90 & 0.94 & 0.87 & 0.80 & 0.92 & 0.82 \\
01706 & 0.88 & 0.98 & 0.88 & 0.68 & 0.95 & 0.80 \\
01710 & 0.95 & 0.99 & 0.93 & 0.80 & 0.96 & 0.86 \\
01711 & 0.97 & 1.00 & 0.93 & 0.80 & 0.88 & 0.84 \\
01712 & 0.86 & 0.97 & 0.87 & 0.69 & 0.91 & 0.76 \\
01714 & 0.90 & 0.96 & 0.89 & 0.85 & 0.94 & 0.85 \\
01715 & 0.87 & 0.98 & 0.90 & 0.77 & 0.96 & 0.83 \\
01718 & 0.96 & 0.99 & 0.95 & 0.77 & 0.92 & 0.78 \\
01724 & 0.92 & 0.99 & 0.92 & 0.83 & 0.98 & 0.84 \\
01747 & 0.96 & 0.99 & 0.90 & 0.81 & 0.94 & 0.84 \\
01801 & 0.91 & 0.96 & 0.89 & 0.88 & 0.94 & 0.87 \\
01812 & 0.97 & 1.00 & 0.93 & 0.79 & 0.98 & 0.83 \\
01813 & 0.95 & 1.00 & 0.90 & 0.75 & 0.97 & 0.81 \\
01818 & 0.90 & 0.98 & 0.91 & 0.80 & 0.97 & 0.83 \\
01842 & 0.88 & 0.97 & 0.86 & 0.76 & 0.96 & 0.80 \\
01858 & 0.89 & 0.99 & 0.89 & 0.76 & 0.98 & 0.83 \\
01866 & 0.91 & 0.98 & 0.90 & 0.78 & 0.97 & 0.83 \\
01867 & 0.87 & 0.99 & 0.92 & 0.79 & 0.97 & 0.86 \\
01875 & 0.87 & 0.99 & 0.90 & 0.75 & 0.98 & 0.83 \\
01879 & 0.95 & 0.99 & 0.93 & 0.83 & 0.98 & 0.85 \\
01892 & 0.84 & 0.97 & 0.87 & 0.81 & 0.96 & 0.86 \\
01893 & 0.88 & 0.98 & 0.89 & 0.68 & 0.96 & 0.78 \\
01918 & 0.62 & 0.90 & 0.74 & 0.63 & 0.91 & 0.77 \\
01925 & 0.91 & 0.99 & 0.88 & 0.80 & 0.96 & 0.87 \\
01928 & 0.90 & 0.99 & 0.88 & 0.84 & 0.98 & 0.85 \\
01948 & 0.95 & 1.00 & 0.92 & 0.82 & 0.97 & 0.84 \\
01950 & 0.95 & 0.99 & 0.92 & 0.89 & 0.97 & 0.87 \\
01970 & 0.87 & 0.99 & 0.86 & 0.79 & 0.98 & 0.85 \\
\end{longtable}
}

Representative qualitative examples of high- and lower-performing cases are shown in Fig.~\ref{fig:labeling_examples_appendix}.

\begin{figure}[ht]
    \centering
    \includegraphics[width=1.2\linewidth, angle=90]{Figures/Results/2 Labeling result visualization.PNG}
    \caption{Representative anatomical labeling results for two PTL test cases. Case~1 demonstrates relatively high scores; Case~2 shows lower scores with mismatches in peripheral branches. All vessels are colored by anatomical class.}
    \label{fig:labeling_examples_appendix}
\end{figure}


\subsubsection{Qualitative Evaluation (PTL Test Set)}

The per-scan qualitative expert ratings for the 21 rated PTL test scans are summarized in Table~\ref{tab:ptl-qual}.


\begin{table}
\centering
\caption{Per-scan qualitative expert ratings on the PTL test set.}
\label{tab:ptl-qual}
\resizebox{\textwidth}{!}{%
\begin{tabular}{lllllllll}
\toprule
Scan & Correctness of Proximal vs.\ Distal Labeling (Artery) & Label Consistency Across Branches (Artery) & Mean Score (Artery) & Usefulness for Clinical Interpretation (Artery) & Correctness of Proximal vs.\ Distal Labeling (Vein) & Label Consistency Across Branches (Vein) & Mean Score (Vein) & Usefulness for Clinical Interpretation (Vein) \\
\midrule
00016 & 3 & 3 & 3.33 & 4 & 4 & 3 & 3.67 & 4 \\
00054 & 3 & 2 & 2.67 & 3 & 4 & 4 & 4.00 & 4 \\
00055 & 2 & 4 & 3.33 & 4 & 2 & 2 & 2.33 & 3 \\
00078 & 4 & 4 & 4.00 & 4 & 3 & 3 & 3.33 & 4 \\
00138 & 3 & 2 & 2.67 & 3 & 3 & 4 & 3.67 & 4 \\
00176 & 3 & 4 & 3.67 & 4 & 4 & 4 & 3.67 & 3 \\
00177 & 3 & 3 & 3.00 & 3 & 4 & 4 & 4.00 & 4 \\
00192 & 4 & 2 & 3.33 & 4 & 3 & 3 & 3.33 & 4 \\
00206 & 3 & 3 & 3.00 & 3 & 3 & 2 & 2.67 & 3 \\
00297 & 4 & 3 & 3.67 & 4 & 2 & 4 & 3.33 & 4 \\
00364 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 & 4 \\
00396 & 2 & 2 & 2.33 & 3 & 3 & 2 & 2.67 & 3 \\
00407 & 3 & 3 & 3.33 & 4 & 3 & 4 & 3.67 & 4 \\
00560 & 3 & 4 & 3.67 & 4 & 3 & 3 & 3.00 & 3 \\
00561 & 4 & 4 & 3.67 & 3 & 4 & 2 & 3.33 & 4 \\
00575 & 4 & 4 & 4.00 & 4 & 3 & 3 & 3.00 & 3 \\
00589 & 3 & 3 & 3.33 & 4 & 3 & 3 & 3.00 & 3 \\
00731 & 3 & 2 & 2.67 & 3 & 4 & 3 & 3.67 & 4 \\
00732 & 4 & 3 & 3.67 & 4 & 4 & 4 & 4.00 & 4 \\
00733 & 4 & 4 & 4.00 & 4 & 2 & 2 & 2.33 & 3 \\
00758 & 3 & 3 & 3.33 & 4 & 3 & 3 & 3.33 & 4 \\
\bottomrule
\end{tabular}
}
\end{table}

% A summary of the score distributions across categories is given in Fig.~\ref{fig:qualitative_labeling_appendix}.

% \begin{figure}[ht]
% \centering
% \includegraphics[width=\linewidth]{Figures/Results/Qualitative Labeling Evaluation.png}
% \caption{Qualitative evaluation of anatomical labeling on 21 PTL test scans. Bars show mean scores for arteries and veins in three categories; error bars represent standard deviation.}
% \label{fig:qualitative_labeling_appendix}
% \end{figure}


\subsubsection{Comparison of Quantitative and Qualitative Evaluation}

The correlation between anatomical labeling metrics and qualitative expert scores on the PTL test set is reported in Table~\ref{tab:ptl-corr}.


\begin{table}
\centering
\caption{Spearman correlation between anatomical labeling metrics and qualitative expert scores on the PTL test set ($n = 21$ scans with expert ratings).}
\label{tab:ptl-corr}
\resizebox{\textwidth}{!}{%
\begin{tabular}{lllrl}
\toprule
Metric & Structure & Expert category & Correlation & $p$-value \\
\midrule
Voxel & Artery & Label Consistency Across Branches & -0.26 & 2.62e-01 \\
Voxel & Artery & Correctness of Proximal vs.\ Distal Labeling & 0.03 & 9.03e-01 \\
Voxel & Artery & Usefulness for Clinical Interpretation & 0.18 & 4.26e-01 \\
Voxel & Artery & Mean Score & -0.06 & 7.83e-01 \\
Voxel & Vein & Label Consistency Across Branches & 0.38 & 9.20e-02 \\
Voxel & Vein & Correctness of Proximal vs.\ Distal Labeling & 0.09 & 6.86e-01 \\
Voxel & Vein & Usefulness for Clinical Interpretation & 0.02 & 9.44e-01 \\
Voxel & Vein & Mean Score & 0.24 & 2.96e-01 \\
Node & Artery & Label Consistency Across Branches & -0.16 & 4.96e-01 \\
Node & Artery & Correctness of Proximal vs.\ Distal Labeling & 0.18 & 4.26e-01 \\
Node & Artery & Usefulness for Clinical Interpretation & 0.15 & 5.16e-01 \\
Node & Artery & Mean Score & 0.04 & 8.48e-01 \\
Node & Vein & Label Consistency Across Branches & 0.08 & 7.43e-01 \\
Node & Vein & Correctness of Proximal vs.\ Distal Labeling & -0.09 & 6.86e-01 \\
Node & Vein & Usefulness for Clinical Interpretation & -0.15 & 5.28e-01 \\
Node & Vein & Mean Score & -0.08 & 7.17e-01 \\
Edge & Artery & Label Consistency Across Branches & -0.32 & 1.61e-01 \\
Edge & Artery & Correctness of Proximal vs.\ Distal Labeling & 0.18 & 4.24e-01 \\
Edge & Artery & Usefulness for Clinical Interpretation & 0.13 & 5.64e-01 \\
Edge & Artery & Mean Score & -0.05 & 8.28e-01 \\
Edge & Vein & Label Consistency Across Branches & 0.27 & 2.41e-01 \\
Edge & Vein & Correctness of Proximal vs. Distal Labeling & 0.02 & 9.17e-01 \\
Edge & Vein & Usefulness for Clinical Interpretation & 0.11 & 6.25e-01 \\
Edge & Vein & Mean Score & 0.19 & 4.03e-01 \\
\bottomrule
\end{tabular}
}
\end{table}
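
For reference, the coefficients in Table~\ref{tab:ptl-corr} can be read as follows, assuming they are Spearman rank correlations computed over the $n = 21$ test scans, as stated for Fig.~\ref{fig:labeling_correlation_appendix}. The symbols $r_i$ and $s_i$ are introduced here for illustration only and denote the tie-averaged ranks of the quantitative metric and of the expert score for scan $i$:
\[
\rho = \frac{\sum_{i=1}^{n} (r_i - \bar{r})(s_i - \bar{s})}{\sqrt{\sum_{i=1}^{n} (r_i - \bar{r})^2}\,\sqrt{\sum_{i=1}^{n} (s_i - \bar{s})^2}} .
\]
A common approximation for the associated two-sided $p$-value uses the statistic
\[
t = \rho \sqrt{\frac{n-2}{1-\rho^2}}
\]
with $n-2$ degrees of freedom; whether the reported $p$-values were obtained this way is an assumption. As a plausibility check, the largest coefficient in the table ($\rho = 0.38$; voxel-level metric, veins, label consistency across branches) gives
\[
t \approx 0.38 \sqrt{\frac{19}{1 - 0.38^2}} \approx 1.79,
\]
which lies between the two-sided critical values for 19 degrees of freedom at $p = 0.10$ ($1.729$) and $p = 0.05$ ($2.093$), in line with the reported $p = 0.092$.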

These relationships are visualized in Fig.~\ref{fig:labeling_correlation_appendix} and Fig.~\ref{fig:label_scatter_voxel_vs_expert_appendix}.

\begin{figure}[ht]
    \centering
    \includegraphics[width=\linewidth]{Figures/Results/labeling_correlation_heatmap.pdf}
    \caption{Spearman correlation between quantitative anatomical labeling metrics and expert-assigned scores across 21 PTL test scans, for arteries (left) and veins (right).}
    \label{fig:labeling_correlation_appendix}
\end{figure}

\begin{figure}[ht]
  \centering
  \includegraphics[width=0.85\linewidth]{Figures/Results/labeling_voxel_vs_expert_mean.pdf}
  \caption{Case-level relationship between voxel-level labeling Dice and the expert mean score for arteries and veins.}
  \label{fig:label_scatter_voxel_vs_expert_appendix}
\end{figure}



\subsection{Clinical Viability}

\subsubsection{Qualitative Evaluation (In-House Dataset)}

The per-scan qualitative ratings of clinical viability on the in-house longitudinal dataset are summarized in Table~\ref{tab:clinical-viability}.


\begin{table}
\centering
\caption{Qualitative evaluation of clinical viability on the in-house longitudinal dataset.}
\label{tab:clinical-viability}
\resizebox{\textwidth}{!}{%
\begin{tabular}{lllllllll}
\toprule
Scan & Anatomical Completeness and Accuracy (Artery) & Clinical Utility (Artery) & Consistency and Plausibility of Labeling (Artery) & Mean Score (Artery) & Anatomical Completeness and Accuracy (Vein) & Clinical Utility (Vein) & Consistency and Plausibility of Labeling (Vein) & Mean Score (Vein) \\
\midrule
P54-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 3 & 3.67 \\
P54-FU1-3M-SCAN & 4 & 4 & 3 & 3.67 & 4 & 4 & 4 & 4.00 \\
P54-FU2-7M-SCAN & 2 & 4 & 3 & 3.00 & 2 & 4 & 3 & 3.00 \\
P56-Baseline-1.2m-SCAN & 3 & 4 & 4 & 3.67 & 3 & 4 & 4 & 3.67 \\
P56-FU1-3.0m-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 4 & 3.67 \\
P56-FU2-13.4m-SCAN & 2 & 4 & 2 & 2.67 & 3 & 4 & 2 & 3.00 \\
P56-FU3-22.7m-SCAN & 2 & 3 & 2 & 2.33 & 3 & 4 & 2 & 3.00 \\
P60-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 3 & 4 & 3.67 \\
P60-FU1-3.2-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P60-FU1-9.4-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 3 & 3.67 \\
P60-FU2-9.4-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P60-FU2-22.8-SCAN & 3 & 3 & 3 & 3.00 & 3 & 4 & 4 & 3.67 \\
P76 Baseline-1.3m-SCAN & 4 & 4 & 3 & 3.67 & 4 & 3 & 3 & 3.33 \\
P76-FU 27.2m-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 3 & 3.33 \\
P76-FU1-3.0m-SCAN & 4 & 4 & 3 & 3.67 & 4 & 4 & 3 & 3.67 \\
P76-FU2-18.1m-SCAN & 3 & 3 & 3 & 3.00 & 3 & 4 & 3 & 3.33 \\
P80-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P80-FU1-SCAN-11m & 3 & 4 & 3 & 3.33 & 3 & 4 & 3 & 3.33 \\
P80-FU2-SCAN-25m & 2 & 4 & 3 & 3.00 & 2 & 4 & 3 & 3.00 \\
P82-BASELINE-SCAN & 4 & 4 & 3 & 3.67 & 4 & 4 & 3 & 3.67 \\
P82-FU1-SCAN-23m & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P82-FU2-SCAN-43m & 2 & 4 & 3 & 3.00 & 2 & 4 & 3 & 3.00 \\
P87-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 3 & 4 & 3.67 \\
P87-FU1-4M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P87-FU2-14M-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 3 & 3.33 \\
P87-FU3-25M-SCAN & 3 & 3 & 3 & 3.00 & 3 & 4 & 3 & 3.33 \\
P87-FU4-40M-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 4 & 3.67 \\
P96-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P96-FU1-5M-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 3 & 3.33 \\
P96-FU2-12M-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 3 & 3.33 \\
P96-FU3-18M-SCAN & 3 & 4 & 4 & 3.67 & 3 & 4 & 4 & 3.67 \\
P96-FU4-24M-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 4 & 3.67 \\
P104-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P104-FU1-5M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P104-FU2-15M-SCAN & 3 & 4 & 4 & 3.67 & 3 & 4 & 4 & 3.67 \\
P104-FU3-27M-SCAN & 2 & 4 & 2 & 2.67 & 4 & 4 & 3 & 3.67 \\
P104-FU4-41M-SCAN & 2 & 3 & 2 & 2.33 & 2 & 3 & 3 & 2.67 \\
P105-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P105-FU1-6M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P105-FU2-12M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P105-FU3-18M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P106-BASELINE-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 2 & 3.00 \\
P106-FU1-6M-SCAN & 3 & 4 & 3 & 3.33 & 3 & 4 & 2 & 3.00 \\
P106-FU2-11M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P107-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P107-FU1-SCAN-9m & 3 & 3 & 3 & 3.00 & 3 & 3 & 4 & 3.33 \\
P107-FU2-SCAN-26m & 3 & 3 & 3 & 3.00 & 3 & 3 & 4 & 3.33 \\
P109-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P109-FU1-SCAN-19m & 3 & 4 & 3 & 3.33 & 3 & 4 & 3 & 3.33 \\
P113-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P113-FU1-SCAN-4M & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P113-FU2-SCAN-10M & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P113-FU3-SCAN-17M & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P136-BASELINE-SCAN & 3 & 4 & 4 & 3.67 & 3 & 4 & 4 & 3.67 \\
P136-FU1-9M-SCAN & 3 & 4 & 4 & 3.67 & 4 & 4 & 3 & 3.67 \\
P136-FU2-21M-SCAN & 2 & 4 & 4 & 3.33 & 4 & 4 & 4 & 4.00 \\
P138-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 3 & 4 & 4 & 3.67 \\
P138-FU1-2M-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 3 & 3.67 \\
P138-FU2-9M-SCAN & 5 & 4 & 4 & 4.33 & 4 & 4 & 4 & 4.00 \\
P144-BASELINE-SCAN & 4 & 4 & 4 & 4.00 & 4 & 4 & 4 & 4.00 \\
P144-FU1-13.9M-SCAN & 3 & 4 & 4 & 3.67 & 3 & 4 & 3 & 3.33 \\
P144-FU1-23M-SCAN & 3 & 4 & 4 & 3.67 & 4 & 4 & 4 & 4.00 \\
P144-FU1-3.2M-SCAN & 4 & 4 & 4 & 4.00 & 3 & 4 & 4 & 3.67 \\
\bottomrule
\end{tabular}
}
\end{table}
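
The Mean Score columns in Table~\ref{tab:clinical-viability} are consistent with the unweighted arithmetic mean of the three category ratings for the respective vessel type. Writing $s_{\mathrm{comp}}$, $s_{\mathrm{util}}$, and $s_{\mathrm{cons}}$ (notation introduced here for illustration only) for the ratings of anatomical completeness and accuracy, clinical utility, and consistency and plausibility of labeling, the mean score of a scan can be sketched as
\[
\bar{s} = \frac{s_{\mathrm{comp}} + s_{\mathrm{util}} + s_{\mathrm{cons}}}{3} .
\]
For example, the artery ratings of P104-FU4-41M-SCAN give $(2 + 3 + 2)/3 \approx 2.33$ and its vein ratings give $(2 + 3 + 3)/3 \approx 2.67$, matching the tabulated values.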

% The aggregated score distribution across all 63 scans is shown in Fig.~\ref{fig:clinical_evaluation_inhouse_appendix}.

% \begin{figure}[ht]
%     \centering
%     \includegraphics[width=\linewidth]{Figures/Results/Qualitative Clinical Data Evaluation.png}
%     \caption{Qualitative clinical evaluation of full pipeline results on 63 in-house CT scans. A clinical expert rated arteries and veins separately across three categories and an overall mean score.}
%     \label{fig:clinical_evaluation_inhouse_appendix}
% \end{figure}

