% This is samplepaper.tex, a sample chapter demonstrating the
% LLNCS macro package for Springer Computer Science proceedings;
% Version 2.21 of 2022/01/12
%
\documentclass[runningheads]{llncs}
%
\usepackage[T1]{fontenc}
% T1 fonts will be used to generate the final print and online PDFs,
% so lease use T1 fonts in your manuscript whenever possible.
% Other font encondings may result in incorrect characters.
%
\usepackage{orcidlink}
\usepackage{multirow}
\usepackage{graphicx}
\usepackage{booktabs} 
% Used for displaying a sample figure. If possible, figure files    should
% be included in EPS format.
%
% If you use the hyperref package, please uncomment the following two lines
% to display URLs in blue roman font according to Springer's eBook style:
%\usepackage{color}
%\renewcommand\UrlFont{\color{blue}\rmfamily}
%\usepackage[pagebackref=true,breaklinks=true,colorlinks,bookmarks=false]{hyperref}
%
\begin{document}

\title{nnU-Net for Semi-supervised Tooth and Pulp Root Canal Segmentation in CBCT}
%
\titlerunning{Tooth and Pulp Root Canal Segmentation}
% If the paper title is too long for the running head, you can set
% an abbreviated paper title here
%
\author{Ajo Babu George\inst{1}\orcidID{0009-0005-3026-0959} \and
Sadhvik Bathini\inst{2}\orcidID{0009-0007-1011-3761} 
} 
%
\authorrunning{George and Bathini}
% First names are abbreviated in the running head.
% If there are more than two authors, 'et al.' is used.
%
\institute{DiceMed, Odisha, India \and Indian Institute of Technology Kharagpur, West Bengal, India\\
\email{drajo\_george@dicemed.in}, \email{sadhvik.ini@gmail.com}}
%
\maketitle              % typeset the header of the contribution
%
\begin{abstract}
This paper presents a solution for the Semi-supervised Teeth Segmentation and Registration (STSR) 2025 Challenge, which focuses on the precise segmentation of teeth and pulp root canals in 3D Cone-Beam Computed Tomography (CBCT) scans. Accurate segmentation of the pulp root canal is crucial for clinical visualization and treatment planning, but manual annotation is extremely labor-intensive. The presented approach uses a semi-supervised framework built on nnU-Net, leveraging a small labeled dataset of 30 scans alongside a much larger unlabeled dataset of 300 scans. To utilize the unlabeled data, a model trained on the labeled scans generates pseudo-labels for the unlabeled scans, and a second model is then trained on the pseudo-labeled data. The approach achieves a Dice score of 0.8088 and an mIoU of 0.9638 across tooth and pulp structures in the all-data track, and a Dice score of 0.69 in the coreset track. These results indicate that the model can identify and delineate the target structures.

\keywords{Teeth Segmentation \and Semi-supervised learning \and nnU-Net \and CBCT \and Pulp \and Root canals}
\end{abstract}



\section{Introduction}
% The introduction should have at least three parts to introduce the background, related work and your contributions. For each part, you can write multiple paragraphs to clarify your motivations and ideas. 

% P1. Introduce the background and difficulty of this challenge
The field of dentistry is increasingly benefiting from computer-aided diagnosis tools, particularly for treatment planning and prognosis evaluation. Precise segmentation of teeth and, in particular, the pulp root canal from 3D Cone-Beam Computed Tomography (CBCT) scans is a crucial pre-processing step for many of these applications. It enables clearer visualization of dental anatomy, which in turn helps in developing more refined treatment strategies. However, manual annotation of these regions is extremely labor-intensive, requiring a substantial investment of time and human resources. This makes acquiring large labeled datasets a significant obstacle to training robust deep learning models.

% P2. Related work/state-of-the-art methods

Recent developments in deep learning have shown great promise in dental image analysis, with models capable of high-accuracy segmentation and disease classification~\cite{chen2022recent,george2025grad}. Networks like U-Net and its derivatives, including nnU-Net, have become standard for medical image segmentation tasks~\cite{azad2024medical}. nnU-Net, in particular, stands out for its ability to automatically adapt to various 3D and 2D medical imaging tasks without extensive manual tuning of hyperparameters~\cite{pettit2022nnu}.

Despite these advances, a major limitation common to all deep learning-based methods is their reliance on a large quantity of high-quality training data, which is difficult and expensive to obtain for medical imaging. The manual annotation of 3D volume data, for example, requires experts to label each 2D slice, making the process even more challenging. To address this, semi-supervised learning has emerged as a highly practical approach, allowing models to benefit from a large quantity of readily available unlabeled data alongside a small set of labeled data \cite{han2024deep}.

This is the very purpose behind the Semi-supervised Teeth Segmentation (STS) Challenge. The challenge, a pioneering event in tooth segmentation, aims to stimulate the development of effective semi-supervised algorithms for both 2D panoramic X-ray images (PXI) and 3D CBCT volumes. The STS 2023 Challenge~\cite{Wang2024SemiSupervised,Wang2025,Wang2024STSM2} focused on tooth instance segmentation. However, a significant research gap remains in the precise, automated segmentation of the intricate pulp root canal. This task is more complex due to the high variability in pulp canal morphology and the need for fine-grained annotation consistency.

% related challenge: STS MICCAI 2023 Challenge~\cite{Wang2024SemiSupervised}~\cite{Wang2025} ~\cite{Wang2024STSM2}

% P3. Your motivation and solution/contribution. 
The primary motivation of this work is to advance dental image analysis by developing a robust methodology for segmenting both teeth and pulp root canals with a semi-supervised learning strategy. Public dental imaging resources primarily focus on 2D panoramic radiographs rather than volumetric CBCT, especially for pediatric and mixed-dentition cohorts, which highlights the scarcity of large annotated 3D datasets and motivates semi-supervised learning for this task~\cite{zhang2023children}. A semi-supervised approach is particularly valuable because manual annotation of pulp canals is labor-intensive owing to their complex and variable morphology. The challenge dataset reflects a scenario typical of real clinical settings: a limited number of labeled scans (30) and a large pool of unlabeled data (300). The contributions of this work are as follows:

\begin{itemize}
    \item A pseudo-labeling technique using the nnU-Net framework was adopted to leverage the large-scale unlabeled dataset. This approach allows a model to learn from a much larger volume of data, improving its robustness and generalization capabilities.
    \item The pipeline includes comprehensive pre-processing steps, such as resampling, cropping, and normalization.
    \item By combining these techniques, the model achieved robust performance across both teeth and pulp segmentation tasks, demonstrating its scalability and potential for real-world clinical applications where annotated data is scarce.
\end{itemize}

\section{Method}
% A detailed description of the method used and a figure to show your pipeline.
The methodology employs a semi-supervised learning strategy to address the challenge of limited labeled data. Figure~\ref{fig:overview} shows the overall pipeline, which involves pre-processing the 3D CBCT scans, followed by pseudo-labeling and extensive model training.

\begin{figure}
    \centering
    \includegraphics[width=1\linewidth]{imgs/overview.pdf}
    \caption{Overall Pipeline of the proposed methodology.}
    \label{fig:overview}
\end{figure}

\subsection{nnU-Net Architecture}

The network architecture is provided by the nnU-Net framework~\cite{nnunet}, a self-contained Python package that automatically configures the network structure and training strategy for medical image segmentation. The architecture has a characteristic U-shape, in which the 3D arrays representing each input image undergo a series of convolutions, max pooling, up-convolutions, and concatenation steps. The first half of the network is responsible for feature extraction, while the second half synthesizes the segmentation output. The design includes horizontal concatenation steps (skip connections) that pass early network information to later stages, a key feature of U-Net-based architectures.
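
As a compact way to read this description, one decoder stage can be sketched as follows. The symbols are notation introduced here for illustration and are not taken from the nnU-Net documentation: $e_{\ell}$ denotes the encoder feature map at resolution level $\ell$, $d_{\ell}$ the decoder feature map at the same level, $\mathrm{Up}(\cdot)$ the up-convolution, $[\cdot\,,\cdot]$ channel-wise concatenation, and $f_{\ell}$ the convolutional block of that stage:
\[
d_{\ell} = f_{\ell}\left(\left[\,\mathrm{Up}(d_{\ell+1}),\; e_{\ell}\,\right]\right).
\]
The concatenation with $e_{\ell}$ is the step that carries early, high-resolution information to the later stages of the network.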


\section{Experiments}
\subsection{Dataset and evaluation metrics} 
% to save space, no need to provide references for each dataset in each paper. We'll provide a detailed reference list for the whole proceedings. 
The dataset for this task consists of 3D CBCT images for semi-supervised segmentation of teeth and pulp root canals. The training dataset consists of two parts: a labeled set of 30 images with fine-grained segmentation masks for teeth, wisdom teeth, and various root canal structures, and a much larger unlabeled set of 300 images. For public validation, 40 labeled images are provided, with segmentation results submitted to the Codabench platform for evaluation.
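
The Dice score and IoU reported in this paper are assumed to follow the standard overlap definitions; the exact averaging used by the challenge leaderboard is not specified here. As a brief reminder, with $P_c$ the set of voxels predicted as label $c$, $G_c$ the reference voxels of that label, and $C$ the number of evaluated labels (all notation introduced here), the per-label scores are
\[
\mathrm{Dice}_c = \frac{2\,|P_c \cap G_c|}{|P_c| + |G_c|},
\qquad
\mathrm{IoU}_c = \frac{|P_c \cap G_c|}{|P_c \cup G_c|},
\]
and a mean IoU can be obtained by averaging over the labels:
\[
\mathrm{mIoU} = \frac{1}{C}\sum_{c=1}^{C} \mathrm{IoU}_c .
\]
Both scores lie in $[0,1]$, with 1 indicating perfect overlap and 0 indicating no overlap.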

% For task 1, the evaluation metrics include Dice Similarity Coefficient (DSC), Normalized Surface Distance (NSD), Mean Intersection over Union (mIoU), and Identification Accuracy (IA) to evaluate the segmentation region overlap and boundary distance. 



% For task 2, the evaluation metrics include mean translation error and mean rotation error.

% In addition, the algorithm runtime and memory consumption will be considered as part of the grade. 



\subsection{Implementation details}
%###########################
\subsubsection{Preprocessing}\label{preprocess}
The images are first resampled and cropped according to the default configuration that nnU-Net determines for the dataset, while maintaining key anatomical features. Per-scan z-score normalization is then applied to each image, standardizing the intensities to a mean of 0 and a standard deviation of 1 computed over the non-zero voxels.
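
The normalization step can be written out explicitly as a short sketch. The notation is introduced here for illustration: $x_i$ is the intensity of voxel $i$ in one scan and $\Omega = \{\, i : x_i \neq 0 \,\}$ is the set of non-zero voxels. The statistics are computed over $\Omega$ only,
\[
\mu = \frac{1}{|\Omega|}\sum_{i \in \Omega} x_i ,
\qquad
\sigma = \sqrt{\frac{1}{|\Omega|}\sum_{i \in \Omega} \left(x_i - \mu\right)^{2}} ,
\]
and each voxel is then standardized as
\[
x_i' = \frac{x_i - \mu}{\sigma}.
\]
Whether voxels outside $\Omega$ are also transformed or left at zero depends on the nnU-Net configuration and is not specified here.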



\subsubsection{Environment settings}
The development environments and requirements are presented in Table~\ref{table:env}.


\begin{table}[!htbp]
\caption{Development environments and requirements.}\label{table:env}
\centering
\begin{tabular}{ll}
\hline
System       & CentOS 7.6\\
\hline
CPU   & Intel Xeon SKL G-6148 CPU@2.4GHz \\
\hline
RAM                         &384 GB\\
\hline
GPU (number and type)                         & NVIDIA V100\\
\hline
CUDA version                  & 11.0\\                          \hline
Programming language                 & Python 3.20\\ 
\hline
Deep learning framework & torch 2.0, torchvision 0.2.2 \\
\hline
\end{tabular}
\end{table}


\subsection{Semi-Supervised Pseudo-Labeling Scheme}
A core component of the methodology is a pseudo-labeling technique that leverages the large amount of unlabeled data. An initial nnU-Net model is trained for 750 epochs on the 30 available labeled scans (Table~\ref{table:training}). This trained model is then used to generate pseudo-labels for the 300 unlabeled scans. A second nnU-Net model is subsequently trained from scratch on the pseudo-labeled data for 500 epochs (Table~\ref{table:training2nd}). This approach effectively expands the training set, allowing the model to learn from a much larger volume of data than the labeled set alone, with the aim of improving its ability to segment the variable pulp canal morphology.
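
The two stages can be summarized formally as a sketch under stated assumptions. The notation is introduced here: $f_{\theta}$ is the network with parameters $\theta$, $\mathcal{L}$ is the training loss, $(x_i, y_i)$ are the 30 labeled scans with their masks, and $x_j$ are the 300 unlabeled scans. It is further assumed that pseudo-labels are hard labels obtained by a voxel-wise argmax over the predicted class probabilities, without confidence filtering. The first stage fits the initial model,
\[
\theta_{0} = \arg\min_{\theta} \sum_{i=1}^{30} \mathcal{L}\left(f_{\theta}(x_i),\, y_i\right),
\]
which then assigns a label to every voxel $v$ of each unlabeled scan,
\[
\hat{y}_j(v) = \arg\max_{c}\, \left[f_{\theta_{0}}(x_j)\right]_{v,c}, \qquad j = 1,\ldots,300,
\]
and the second stage trains a new model on the pseudo-labeled scans,
\[
\theta_{1} = \arg\min_{\theta} \sum_{j=1}^{300} \mathcal{L}\left(f_{\theta}(x_j),\, \hat{y}_j\right).
\]
Any error in $\hat{y}_j$ enters the second objective as if it were ground truth, so the quality of $\theta_{1}$ is bounded in practice by the quality of the pseudo-labels.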

\begin{table*}[!htbp]
\caption{Training protocols of the nnU-Net model trained with 30 labeled scans.}
\label{table:training}
\begin{center}
\begin{tabular}{ll} 
\hline
Pre-trained Model         & None (training from scratch) \\
\hline
Batch size                    & 2 \\
\hline 
Patch size & 128$\times$128$\times$128 \\
\hline
Total epochs & 750 \\
\hline
Optimizer          & SGD with Nesterov momentum (0.99) \\ \hline
Initial learning rate (lr)  & 0.01 \\ \hline
Lr decay schedule & Polynomial decay ($lr = lr_{0}(1 - \frac{epoch}{max\_epoch})^{0.9}$) \\
\hline
Training time                                           & $\sim$12 hours\\  \hline 
Loss function & Cross-entropy + Dice loss (sum) \\     \hline
Number of model parameters    & $\sim$30--35M\footnote{https://github.com/sksq96/pytorch-summary} \\ \hline
Number of flops & $\sim$250--300G\footnote{https://github.com/facebookresearch/fvcore} \\ \hline
\end{tabular}
\end{center}
\end{table*}
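
To make the learning-rate schedule in Table~\ref{table:training} concrete, a short worked example is given here. Writing $e$ for the current epoch and $E$ for the total number of epochs (shorthand introduced here for epoch and max\_epoch), the schedule is
\[
lr(e) = lr_{0}\left(1 - \frac{e}{E}\right)^{0.9}.
\]
With $lr_{0} = 0.01$ and $E = 750$, the learning rate starts at $lr(0) = 0.01$, reaches $lr(375) = 0.01 \times 0.5^{0.9} \approx 5.36 \times 10^{-3}$ at the halfway point, and falls to $lr(675) = 0.01 \times 0.1^{0.9} \approx 1.26 \times 10^{-3}$ with 10\% of the training remaining. The decay is therefore close to linear for most of the run and steepens only towards the end.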


\begin{table*}[!htbp]
\caption{Training protocols of the nnU-Net model trained with 300 pseudo-labeled scans.}
\label{table:training2nd}
\begin{center}
\begin{tabular}{ll} 
\hline
Pre-trained Model         & None (training from scratch) \\
\hline
Batch size                    & 2 \\
\hline 
Patch size & 128$\times$128$\times$128 \\
\hline
Total epochs & 500 \\
\hline
Optimizer          & SGD with Nesterov momentum (0.99) \\ \hline
Initial learning rate (lr)  & 0.01 \\ \hline
Lr decay schedule & Polynomial decay ($lr = lr_{0}(1 - \frac{epoch}{max\_epoch})^{0.9}$) \\
\hline
Training time                                           & $\sim$36 hours \\  \hline 
Loss function & Cross-entropy + Dice loss (sum) \\     \hline
Number of model parameters    & $\sim$30--35M\footnote{https://github.com/sksq96/pytorch-summary} \\ \hline
Number of flops & $\sim$250--300G\footnote{https://github.com/facebookresearch/fvcore} \\ \hline
\end{tabular}
\end{center}
\end{table*}
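
The compound loss listed in both training protocol tables can be sketched as follows. This is a common soft formulation given for illustration; the nnU-Net implementation may differ in constants, smoothing terms, and how the batch is aggregated. The notation is introduced here: $p_{v,c}$ is the predicted softmax probability of class $c$ at voxel $v$, $g_{v,c}$ is the one-hot reference label, $N$ is the number of voxels, and $C$ is the number of classes.
\[
\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \mathcal{L}_{\mathrm{Dice}},
\]
\[
\mathcal{L}_{\mathrm{CE}} = -\frac{1}{N}\sum_{v=1}^{N}\sum_{c=1}^{C} g_{v,c}\,\log p_{v,c},
\qquad
\mathcal{L}_{\mathrm{Dice}} = 1 - \frac{1}{C}\sum_{c=1}^{C}\frac{2\sum_{v} p_{v,c}\, g_{v,c}}{\sum_{v} p_{v,c} + \sum_{v} g_{v,c}} .
\]
The cross-entropy term gives a per-voxel gradient signal, while the Dice term is normalized per class and is therefore less dominated by the large background and hard-tissue regions than by thin structures such as the root canals.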



\section{Results and discussion}
The proposed methodology demonstrates robust performance in both teeth and pulp segmentation, successfully addressing the challenges of a limited labeled dataset. The semi-supervised approach, which combines pseudo-labeling with the nnU-Net framework, proved to be scalable to large CBCT datasets.
% Note: Please describe at least the following aspects in this section


% In what kind of cases the proposed method works well?

% What are the possible reasons for the failed cases?
Figure~\ref{fig:curves} and Table~\ref{table:training} detail the training curves and protocols, respectively, for the initial model trained on the limited dataset of 30 labeled images. Figure~\ref{fig:curves1} and Table~\ref{table:training2nd} present the corresponding training curves and protocols for the second nnU-Net model, trained on the 300 pseudo-labeled images. Together, these figures and tables provide an overview of the training progression in both stages of the semi-supervised approach.


\begin{figure}
    \centering
    \includegraphics[width=1\linewidth]{imgs/progress_old.png}
    \caption{Training and validation curves of the nnU-Net model trained with 30 labeled scans.}
    \label{fig:curves}
\end{figure}

\begin{figure}
    \centering
    \includegraphics[width=1\linewidth]{imgs/progress.png}
    \caption{Training and validation curves of the nnU-Net model trained with 300 pseudo-labeled scans.}
    \label{fig:curves1}
\end{figure}

\begin{table}[ht!]
\centering
\small
\caption{Mean Dice and IoU Scores for Valid Dental Anatomy Labels.}
\begin{tabular}{l l c c}
\toprule
\textbf{Label ID} & \textbf{Anatomical Structure} & \textbf{Mean Dice} & \textbf{Mean IoU} \\
\midrule
1  & Dental Hard Tissues        & 0.9591 & 0.9215 \\
2  & Pulp Chamber               & 0.8298 & 0.7100 \\
4  & Palatal Root               & 0.7449 & 0.5971 \\
5  & Mesial Root Canal          & 0.4976 & 0.3730 \\
6  & Distal Root Canal          & 0.6700 & 0.5116 \\
7  & Mesiobuccal Root Canal     & 0.6986 & 0.5390 \\
8  & Mesiolingual Root Canal    & 0.5978 & 0.4366 \\
9  & Distobuccal Root Canal     & 0.7150 & 0.5583 \\
12 & Impacted Tooth             & 0.9565 & 0.9173 \\
\bottomrule
\end{tabular}
\label{tab:mean_metrics_filtered}
\end{table}

\begin{table}[ht!]
\centering
\scriptsize
\caption{Per-case Dice scores for all non-background labels.}
\begin{tabular}{lccccccccc}
\toprule
\textbf{Case} 
& \textbf{D1} 
& \textbf{D2} 
& \textbf{D4} 
& \textbf{D5} 
& \textbf{D6} 
& \textbf{D7} 
& \textbf{D8} 
& \textbf{D9} 
& \textbf{D12} \\
\midrule
ToothPulp\_004 & 0.964 & 0.831 & 0.695 & 0.675 & 0.742 & 0.671 & 0.662 & 0.691 & 0.966 \\
ToothPulp\_009 & 0.955 & 0.872 & 0.848 & 0.248 & 0.776 & 0.780 & 0.704 & 0.808 & 0.961 \\
ToothPulp\_013 & 0.961 & 0.852 & 0.751 & 0.791 & 0.475 & 0.746 & 0.658 & 0.668 & 0.977 \\
ToothPulp\_016 & 0.953 & 0.797 & 0.756 & 0.000 & 0.679 & 0.686 & 0.337 & 0.706 & 0.923 \\
ToothPulp\_019 & 0.968 & 0.830 & 0.758 & 0.619 & 0.718 & 0.662 & 0.670 & 0.713 & 0.977 \\
ToothPulp\_027 & 0.952 & 0.798 & 0.660 & 0.653 & 0.630 & 0.647 & 0.555 & 0.704 & 0.935 \\
\bottomrule
\end{tabular}
\label{tab:percase_dice}
\end{table}






% Note to Table~\ref{tab:results-coreset}: if you have multiple solutions, such as ablation studies, you can use a similar Table format to report the performance on the public/online validation set.

\subsection{Quantitative results on validation set}
% Please describe the results
Across the six validation cases, the model shows a clear separation in performance between large dental structures and the smaller, more complex root canal pathways. As shown in Table~\ref{tab:percase_dice}, the model achieves consistently high Dice scores for the large anatomical structures (labels 1 and 12), while the fine-grained canal structures exhibit substantial variability. Table~\ref{tab:mean_metrics_filtered} further highlights this contrast, with mean Dice scores of 0.9591 for Dental Hard Tissues and 0.9565 for the Impacted Tooth, compared with 0.4976--0.7150 for the root canal labels (5--9). Per-case Dice scores for the two major structures fall within the 0.92--0.98 range, whereas the root canals show markedly lower and less stable performance, ranging from best-case scores of approximately 0.79--0.81 down to a complete failure (a Dice score of 0.000 for label 5 in case ToothPulp\_016).
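
A short consistency check relates the two columns of Table~\ref{tab:mean_metrics_filtered}. For a single predicted mask and its reference, the Dice score $D$ and the IoU $J$ (symbols introduced here) are linked by the standard identity
\[
J = \frac{D}{2 - D}.
\]
Applying this identity to the mean Dice of Dental Hard Tissues gives $0.9591 / 1.0409 \approx 0.9214$, which is close to the reported mean IoU of 0.9215. For the Mesial Root Canal it gives $0.4976 / 1.5024 \approx 0.3312$, which is noticeably below the reported 0.3730. Since $D/(2-D)$ is convex in $D$, an average of per-case IoU values can only be equal to or larger than the value obtained from the mean Dice, and the gap grows with the spread of the per-case scores. The larger gap for the Mesial Root Canal is therefore consistent with the IoU column being an average of per-case values and with the high per-case variability of this label in Table~\ref{tab:percase_dice}.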

\subsection{Qualitative results on validation set}
Qualitative analysis of the validation set indicates that the model's performance, with an overall Dice score of 0.8088, depends strongly on anatomical characteristics. The model consistently produces robust segmentations of large, well-defined structures but exhibits limitations in regions with low image contrast and complex micro-anatomy, as seen in the various sections shown in Figures~\ref{fig:eg1} and~\ref{fig:eg2}. The segmentations were visualized using 3D Slicer~\cite{fedorov20123d}.
A successful segmentation is shown in the top row of Figure~\ref{fig:eg3} (case STS25\_Validation\_0007), where the model accurately delineates the main bodies of the dental hard tissues and the pulp chambers. The resulting boundaries are clear, and the 3D reconstruction is anatomically cohesive, reflecting the model's strength in identifying structures with distinct intensity gradients.

Case STS25\_Validation\_0025, shown in the bottom row of Figure~\ref{fig:eg3}, exemplifies the model's primary weakness: the inability to completely segment the apical third of the tooth roots. The segmentation is visibly incomplete, resulting in fragmented masks and a disjointed 3D reconstruction in which the teeth appear to lack proper root structures.

The primary reason for these failures is the low contrast-to-noise ratio (CNR) between the root apex and the surrounding trabecular bone. This ambiguity is exacerbated by partial volume averaging artifacts, which are common in thin structures, and the complex anatomy of the root tip, which often includes lateral canals. Consequently, while the model reliably segments the bulk of the tooth structures, its accuracy is diminished by its inability to resolve these challenging but clinically significant apical regions.


\begin{figure}[h]
    \centering
    \includegraphics[width=1\linewidth]{imgs/001.png}
    \caption{Segmentation result visualization on validation case 001}
    \label{fig:eg1}
\end{figure}

\begin{figure}[h]
    \centering
    \includegraphics[width=1\linewidth]{imgs/003.png}
    \caption{Segmentation result visualization on validation case 003}
    \label{fig:eg2}
\end{figure}

\begin{figure}[h]
    \centering
    \includegraphics[width=1\linewidth]{imgs/3d slicer07_25.png}
    \caption{Segmentation result visualization using 3D Slicer for cases 007 and 025}
    \label{fig:eg3}
\end{figure}





\section{Conclusion}
The proposed methodology addresses the challenges of teeth and pulp root canal segmentation in 3D CBCT scans, particularly the scarcity of labeled data, through a semi-supervised approach that combines pseudo-labeling with the nnU-Net framework. By generating pseudo-labels for the large unlabeled dataset, the model was able to learn from a much larger data pool, which is expected to improve its generalization beyond what the limited labeled data alone would allow. The self-configuring nnU-Net pipeline also helped in handling the high variability in pulp canal morphology and in producing consistent fine-grained segmentations. The Dice score of 0.8088 achieved on the validation set indicates the effectiveness of this approach.

\section{Limitations and Future Work}
The proposed method, while effective, still faces challenges in segmenting the apical third of the roots due to low contrast, complex anatomy, and partial-volume effects, which lead to incomplete or fragmented predictions. The reliance on pseudo-labels also introduces potential noise that may propagate during training, limiting consistency in difficult regions. Future work can focus on improving thin-structure representation through super-resolution or topology-aware modules, incorporating uncertainty-guided pseudo-label refinement, and expanding the diversity of labeled data through active learning.


\subsubsection{Acknowledgements} We thank all the data owners for making the medical images publicly available and Codabench~\cite{codabench} for hosting the challenge platform.

\subsubsection{Disclosure of Interests.} The authors have no competing interests to declare that are relevant to the content of this article.
% Or: Author A has received research grants from Company XXX.
%
% ---- Bibliography ----
%
% BibTeX users should specify bibliography style 'splncs04'.
% References will then be sorted and formatted in the correct style.
%
\bibliographystyle{splncs04}
\bibliography{ref}





\end{document}
