FST-Net: Facial Soft Tissue Landmark Localization on 3dMD Scans Using Feature Fusion and Local Coordinate Regression
Abstract: Landmark localization of facial soft tissue (FST) is a basic step in 3D morphometric analysis of the human face. However, few studies have addressed deep-learning-based landmark localization on 3D scan images, and methods designed for 2D images cannot be applied directly because of the non-Euclidean data structure. In this paper, we propose FST-Net, an end-to-end learning framework that automatically localizes 28 landmarks on 3dMD scans. Our method extracts features from both texture images and mesh models. New texture mappings for 3dMD scans are generated by projection to fuse texture and structure features. A dual-branch network integrating transformers predicts the landmark heatmaps from coarse to fine. A local coordinate regression module based on probabilistic distance and heatmap predictions then computes the landmark coordinates. We collect and annotate 297 3dMD face scans from the clinic to evaluate our model. Experiments show that the average localization error of the model is 1.204 mm, which is within the clinically acceptable precision of 1.5 mm, and that the correct landmark detection rate is 70.89%. Our model outperforms current state-of-the-art deep learning methods for landmark localization on mesh models.
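For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how an average localization error and a landmark detection rate are commonly computed from predicted and annotated 3D coordinates. It is not the authors' evaluation code. The function names are hypothetical, and the 1.5 mm threshold used for the detection rate is an assumption, since the abstract does not state which threshold the 70.89% figure refers to.

```python
import math


def localization_errors(pred, truth):
    """Euclidean distance (in mm) between each predicted and annotated landmark."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of landmarks")
    return [math.dist(p, t) for p, t in zip(pred, truth)]


def mean_error(errors):
    """Average localization error over all landmarks."""
    return sum(errors) / len(errors)


def detection_rate(errors, threshold_mm=1.5):
    """Percentage of landmarks whose error is at most threshold_mm.

    The default of 1.5 mm is an assumed threshold, chosen to match the
    clinically acceptable precision quoted in the abstract.
    """
    hits = sum(1 for e in errors if e <= threshold_mm)
    return 100.0 * hits / len(errors)


# Toy coordinates for three landmarks (illustrative values, not study data).
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
truth = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]

errors = localization_errors(pred, truth)
print(errors)                  # per-landmark errors in mm
print(mean_error(errors))      # average localization error in mm
print(detection_rate(errors))  # detection rate in percent
```

In practice the errors would be pooled over all 28 landmarks and all test scans before averaging, and the detection rate is often reported at several thresholds.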