Morphology-Driven Deep Watershed Transform for 3D Tooth Segmentation

Published: 04 Dec 2025, Last Modified: 07 Jan 2026 · ODIN 2025 Poster · CC BY 4.0
Keywords: CBCT segmentation, ToothFairy3 Challenge, Morphological inductive bias, Deep Watershed
TL;DR: We propose a Deep Watershed–based approach for instance-aware multi-class segmentation of dentomaxillofacial structures in CBCT, achieving accurate delineation of complex anatomy in the ToothFairy3 challenge.
Abstract: Segmentation of dentomaxillofacial structures in Cone-Beam Computed Tomography (CBCT) remains challenging, particularly for fine details such as root apices and nerve canals, which are crucial for evaluating root resorption in digital dentistry and for precise surgical planning. We present an approach that unifies instance detection and multi-class dentomaxillofacial structure segmentation in CBCT scans, developed in the scope of the ToothFairy3 Challenge. We adapt a Deep Watershed method, modeling each anatomical structure as a continuous 3D energy basin that encodes voxel distances to class boundaries. This instance-aware representation enables accurate segmentation of narrow, complex dentomaxillofacial structures. We train and evaluate our solution on the ToothFairy3 dataset, comprising 532 CBCT scans with voxel-wise annotations. Our method achieved a mean Dice coefficient of 0.742 and an HD95 of 111.13 on the test set. Our implementation is available at https://github.com/tomek1911/TF3.
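The energy-basin representation described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration (not the authors' implementation) of how a per-class ground-truth mask is commonly turned into a continuous energy map via the Euclidean distance transform (EDT) in deep-watershed methods:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def energy_basin(class_mask: np.ndarray) -> np.ndarray:
    """Continuous 3D energy basin for one semantic class.

    Each foreground voxel is assigned its Euclidean distance to the
    nearest background voxel (i.e., the class boundary), normalized
    to [0, 1]. A deep-watershed network regresses this map; cutting
    it at a low energy level later separates touching instances.
    """
    dist = distance_transform_edt(class_mask)
    peak = dist.max()
    return dist / peak if peak > 0 else dist

# Toy volume: a 5x5x5 "tooth" block inside a 9x9x9 scan.
volume = np.zeros((9, 9, 9), dtype=bool)
volume[2:7, 2:7, 2:7] = True
energy = energy_basin(volume)
# The basin peaks at the block's centre and is zero in the background.
```

In a multi-class setting such as ToothFairy3, one such basin would be computed per semantic class from the ground-truth labels and used as the regression target.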
Changes Summary: Response to Reviewers - Post-Decision Revision: We thank the reviewers for highlighting the value of the geometric priors, the memory-efficient inference, and the clarity of our methodology.

**Suboptimal performance compared to prior work** The performance gap may stem from omitting the SSM prior and from using discretized direction fields. Importantly, GEPAR3D was originally developed for teeth only, not for the full set of dentomaxillofacial structures addressed in this challenge. To adapt the method, we extended it to handle the complete anatomy, and we have now corrected the discretization issue in the released code. Internal validation with these updates shows consistent improvements; the updated model will be submitted to the GrandChallenge test server during the post-challenge evaluation window, and test-set results will be reported once available.

**Direction loss weighting** The very small coefficient originated from a lack of proper voxel-count normalization, which made the effective loss magnitude much larger. We thank the reviewer for pointing this out; we corrected the formulation and implementation, and the direction loss now operates on a scale comparable to the multi-class segmentation loss and contributes reliably.

**Decoupled pulp modeling** We agree that a multi-label formulation in the pulp area could increase anatomical coherence. In the current setup, the pulp head still uses shared encoder and decoder features (except for the final layer), which helps preserve contextual coupling. Discontinuities in some pulp training labels, especially in narrow canals, complicated joint training, and further experiments are required.

**Comparative analysis** The original GEPAR3D work already compared the deep-watershed design with a traditional watershed method (MWTNet) and reported clear improvements. Given the difficulty of structures such as the IAN canal, we expect this advantage to hold.
**Dentomaxillofacial ablation study** The architecture is derived from GEPAR3D, where ablation experiments already showed the benefit of the deep watershed formulation, energy regression, and directional supervision for adult dentition segmentation. In this work, the focus was on adapting the method to the full dentomaxillofacial anatomy under challenge constraints. We agree that anatomy-specific ablations would be informative and will design more detailed experiments in future work.

**Energy direction use, undefined loss at 0** L_dir supervision is used only during training and is an explicit term in the loss function (Eq. 1). The direction loss is computed only on voxels belonging to class instances (we fixed the typo: mask P should be N), which avoids undefined angles at zero distance.

**EDT computation** A single EDT is computed for each semantic class separately, based on the ground-truth labels, with each class treated as one instance; this results in a single volume. We clarified this in the final paper.

**Minor issues** We corrected Eq. 1 and added a link to the source code.
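The masked, voxel-count-normalized direction loss discussed in the response can be sketched as follows. The function name and the cosine-distance form are illustrative assumptions rather than the paper's exact formulation; the key points it demonstrates are the restriction to foreground voxels (where the distance-field gradient is well defined) and the normalization by the foreground count N:

```python
import numpy as np

def masked_direction_loss(pred_dir: np.ndarray,
                          gt_dir: np.ndarray,
                          fg_mask: np.ndarray) -> float:
    """Cosine-distance direction loss over foreground voxels only.

    pred_dir, gt_dir: unit direction fields of shape (3, D, H, W).
    fg_mask: boolean mask of voxels belonging to class instances.
    Background and boundary voxels are excluded, since the direction
    of the distance field is undefined at zero distance; the sum is
    normalized by the foreground voxel count N, keeping the loss on
    a scale comparable to the segmentation loss.
    """
    n = int(fg_mask.sum())
    if n == 0:
        return 0.0
    cos_sim = (pred_dir * gt_dir).sum(axis=0)  # per-voxel cosine
    return float(((1.0 - cos_sim) * fg_mask).sum() / n)

# Toy check: identical fields give zero loss, opposite fields give 2.
fg = np.zeros((4, 4, 4), dtype=bool)
fg[1:3, 1:3, 1:3] = True
gt = np.zeros((3, 4, 4, 4))
gt[0] = 1.0  # every voxel points along +x
loss_same = masked_direction_loss(gt, gt, fg)
loss_opposite = masked_direction_loss(-gt, gt, fg)
```

Averaging over N rather than the whole volume is what allows the direction term to be weighted on the same footing as the multi-class segmentation loss.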
Latex Source Code: zip
Main Tex File: odin_challange_tf3.tex
Confirm Latex Only: true
Code Url: https://github.com/tomek1911/TF3
Authors Changed: false
Copyright: pdf
Submission Number: 26