\section{Experimental Setup}\label{sec:experimental_setup}
\paragraph{Preprocessing Pipeline}
We developed a preprocessing pipeline for vertebra-level classification tasks in CT scans. First, a spine segmentation model identifies the individual vertebrae.\footnote{The segmentation model is provided by ImFusion GmbH and follows an approach similar to \citet{burgin2023robust}.} Using the resulting segmentation mask, we then extract $96 \times 96 \times 96$ crops centered on the vertebra of interest. The preprocessing pipeline is illustrated in greater detail in \figureref{fig:preprocessing_pipeline} and \figureref{fig:3d_vertebra_crop_visualization}.
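
The cropping step can be sketched as follows. This is a minimal illustration and not the pipeline's actual implementation: the function name \texttt{crop\_vertebra}, the use of the mask's center of mass as the crop center, and the padding value of $-1024$ (roughly air in Hounsfield units) are assumptions made for this example, and any resampling to a common voxel spacing is omitted.

```python
import numpy as np

CROP_SIZE = 96  # edge length of the cubic crop, as stated in the text


def crop_vertebra(volume, mask, label, size=CROP_SIZE, pad_value=-1024):
    """Return a size^3 crop of `volume` centered on the voxels where mask == label.

    Hypothetical helper: centering on the center of mass and padding with
    `pad_value` are assumptions of this sketch, not details from the paper.
    """
    coords = np.argwhere(mask == label)
    if coords.size == 0:
        raise ValueError(f"label {label} not found in mask")

    # Center of mass of the vertebra's voxels, rounded to the nearest voxel.
    center = np.round(coords.mean(axis=0)).astype(int)

    half = size // 2
    # Pad every axis so that crops near the volume border keep the full size.
    padded = np.pad(volume, half, mode="constant", constant_values=pad_value)

    # After padding by `half`, original index c sits at c + half, so the
    # window [c - half, c + half) becomes [c, c + size) in padded coordinates.
    window = tuple(slice(int(c), int(c) + size) for c in center)
    return padded[window]
```

Padding before slicing keeps every crop at the full $96^3$ size even when a vertebra lies close to the edge of the scan, at the cost of filling the missing region with a constant value.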

\paragraph{Unlabeled Vertebra Pretraining Dataset}
We perform task-specific self-supervised domain-adaptation pretraining on a large unlabeled vertebra dataset. To build this dataset, we collected seven publicly available CT datasets that each contain spine segments, inspired by the dataset selection of CTSpine1K \citep{deng2021ctspine1k}, and processed them with the preprocessing pipeline described above. The resulting dataset comprises 27,776 individual unlabeled vertebrae extracted from 3,446 different CT volumes. Because this preprocessing step relies on a segmentation model rather than on manual annotations, the dataset is created without any manual labeling. Details of the unlabeled pretraining dataset are summarized in \tableref{tab:unlabeled_dataset}.

% Table Unlabeled Dataset
\begin{table}[ht]
\centering
\small
\begin{tabular}{|c|c|c|}
\hline
\bfseries Dataset & \bfseries Patients & \bfseries Vertebrae \\
\hline
\hline
CT COLONOGRAPHY \citep{smith2015data} & 784 & 6,515 \\
\hline
COVID-19 \citep{an2020ct} & 650 & 8,425 \\
\hline
MSD-Liver \citep{simpson2019large} & 201 & 2,297 \\
\hline
HNSCC-3DCT-RT \citep{bejarano2018head} & 31 & 296 \\
\hline
DeepLesion \citep{yan2018deeplesion} & 1,107 & 2,820 \\
\hline
KiTS21 \citep{heller2021state} & 300 & 2,727 \\
\hline
VerSe \citep{sekuboyina2021verse} & 373 & 4,696 \\
\hline
\textbf{TOTAL} & \textbf{3,446} & \textbf{27,776} \\
\hline
\end{tabular}
\caption{Composition of the unlabeled vertebra pretraining dataset by source dataset.}
\label{tab:unlabeled_dataset}
\end{table}

\paragraph{Labeled Vertebra Downstream Task Dataset}
To avoid test leakage between pretraining and finetuning on the downstream classification task, we strictly separated patients between the labeled and unlabeled datasets. For the labeled vertebra dataset, we use an in-house dataset from Klinikum Rechts der Isar (Munich) \citep{foreman2024deep} consisting of 6,245 vertebrae (940 of which, about 15\%, are fractured) from 457 different patients. More details about the labeled dataset are provided in \tableref{tab:labeled_vertebra_dataset}.
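
A patient-level separation of this kind can be checked programmatically. The sketch below is illustrative only: the text does not describe how the separation was enforced, and the helper names as well as the \texttt{patient\_id} record field are hypothetical.

```python
def find_patient_overlap(pretrain_ids, labeled_ids):
    """Return the sorted patient IDs that occur in both datasets."""
    return sorted(set(pretrain_ids) & set(labeled_ids))


def drop_overlapping_patients(records, labeled_ids, key="patient_id"):
    """Keep only pretraining records whose patient is absent from the labeled set.

    `records` is assumed to be a list of dicts with a `key` field; this
    layout is an assumption of the sketch.
    """
    blocked = set(labeled_ids)
    return [record for record in records if record[key] not in blocked]


# Toy example with made-up identifiers.
pretrain_records = [
    {"patient_id": "A", "vertebra": "L1"},
    {"patient_id": "B", "vertebra": "T12"},
    {"patient_id": "C", "vertebra": "L3"},
]
labeled_patients = ["B", "Z"]

overlap = find_patient_overlap(
    [record["patient_id"] for record in pretrain_records], labeled_patients
)
clean_records = drop_overlapping_patients(pretrain_records, labeled_patients)
```

Filtering at the patient level rather than the vertebra level matters because several vertebrae from one patient share anatomy and scanner characteristics, so splitting them across pretraining and evaluation data could still leak information.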